mlflow.types

The mlflow.types module defines data types and utilities to be used by other mlflow components to describe interface independent of other frameworks or languages.

class mlflow.types.ColSpec(type: Union[mlflow.types.schema.Array, mlflow.types.schema.DataType, mlflow.types.schema.Map, mlflow.types.schema.Object, mlflow.types.schema.AnyType, str], name: Optional[str] = None, required: bool = True)[source]

Bases: object

Specification of name and type of a single column in a dataset.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain type and optional name and required keys.

property name: str | None: The column name or None if the columns is unnamed.

property required: bool: Whether this column is required.

property type: mlflow.types.schema.DataType | mlflow.types.schema.Array | mlflow.types.schema.Object | mlflow.types.schema.Map | mlflow.types.schema.AnyType: The column data type.

class mlflow.types.DataType(value)[source]

Bases: enum.Enum

MLflow data types.

binary = 7: Sequence of raw bytes.

boolean = 1: Logical data (True, False) .

datetime = 8: 64b datetime data.

double = 5: 64b floating point numbers.

float = 4: 32b floating point numbers.

integer = 2: 32b signed integer numbers.

long = 3: 64b signed integer numbers.

string = 6: Text data.

to_numpy() → numpy.dtype[source]: Get equivalent numpy data type.

to_pandas() → numpy.dtype[source]: Get equivalent pandas data type.

to_python()[source]: Get equivalent python data type.

class mlflow.types.ParamSchema(params: list[mlflow.types.schema.ParamSpec])[source]

Bases: object

Specification of parameters applicable to the model. ParamSchema is represented as a list of ParamSpec.

classmethod from_json(json_str: str)[source]: Deserialize from a json string.

property params: list[mlflow.types.schema.ParamSpec]: Representation of ParamSchema as a list of ParamSpec.

to_dict() → list[dict[str, typing.Any]][source]: Serialize into a jsonable dictionary.

to_json() → str[source]: Serialize into json string.

class mlflow.types.ParamSpec(name: str, dtype: mlflow.types.schema.DataType | mlflow.types.schema.Object | str, default: Any, shape: Optional[tuple[int, ...]] = None)[source]

Bases: object

Specification used to represent parameters for the model.

class ParamSpecTypedDict[source]: Bases: TypedDict

property default: Any: Default value of the parameter.

property dtype: mlflow.types.schema.DataType | mlflow.types.schema.Object: The parameter data type.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain name, type and default keys.

property name: str: The name of the parameter.

property shape: tuple[int, ...] | None: The parameter shape. If shape is None, the parameter is a scalar.

classmethod validate_type_and_shape(spec: str, value: Any, value_type: mlflow.types.schema.DataType | mlflow.types.schema.Object, shape: tuple[int, ...] | None)[source]: Validate that the value has the expected type and shape.

class mlflow.types.Schema(inputs: list[mlflow.types.schema.ColSpec | mlflow.types.schema.TensorSpec])[source]

Bases: object

Specification of a dataset.

Schema is represented as a list of ColSpec or TensorSpec. A combination of ColSpec and TensorSpec is not allowed.

The dataset represented by a schema can be named, with unique non empty names for every input. In the case of ColSpec, the dataset columns can be unnamed with implicit integer index defined by their list indices. Combination of named and unnamed data inputs are not allowed.

as_spark_schema()[source]: Convert to Spark schema. If this schema is a single unnamed column, it is converted directly the corresponding spark data type, otherwise it’s returned as a struct (missing column names are filled with an integer sequence). Unsupported by TensorSpec.

classmethod from_json(json_str: str)[source]: Deserialize from a json string.

has_input_names() → bool[source]: Return true iff this schema declares names, false otherwise.

input_dict() → dict[str, mlflow.types.schema.ColSpec | mlflow.types.schema.TensorSpec][source]: Maps column names to inputs, iff this schema declares names.

input_names() → list[str | int][source]: Get list of data names or range of indices if the schema has no names.

input_types() → list[mlflow.types.schema.DataType | numpy.dtype | mlflow.types.schema.Array | mlflow.types.schema.Object][source]: Get types for each column in the schema.

input_types_dict() → dict[str, mlflow.types.schema.DataType | numpy.dtype | mlflow.types.schema.Array | mlflow.types.schema.Object][source]: Maps column names to types, iff this schema declares names.

property inputs: list[mlflow.types.schema.ColSpec | mlflow.types.schema.TensorSpec]: Representation of a dataset that defines this schema.

is_tensor_spec() → bool[source]: Return true iff this schema is specified using TensorSpec

numpy_types() → list[numpy.dtype][source]: Convenience shortcut to get the datatypes as numpy types.

optional_input_names() → list[str | int][source]: Get list of optional data names or range of indices if schema has no names.

pandas_types() → list[numpy.dtype][source]: Convenience shortcut to get the datatypes as pandas types. Unsupported by TensorSpec.

required_input_names() → list[str | int][source]: Get list of required data names or range of indices if schema has no names.

to_dict() → list[dict[str, typing.Any]][source]: Serialize into a jsonable dictionary.

to_json() → str[source]: Serialize into json string.

class mlflow.types.TensorSpec(type: numpy.dtype, shape: tuple[int, ...] | list[int], name: Optional[str] = None)[source]

Bases: object

Specification used to represent a dataset stored as a Tensor.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain type and tensor-spec keys.

property name: str | None: The tensor name or None if the tensor is unnamed.

property required: bool: Whether this tensor is required.

property shape: tuple[int, ...]: The tensor shape

property type: numpy.dtype: A unique character code for each of the 21 different numpy built-in types. See https://numpy.org/devdocs/reference/generated/numpy.dtype.html#numpy.dtype for details.

class mlflow.types.responses.ResponsesAgentRequest(*, tool_choice: str | mlflow.types.responses_helpers.ToolChoiceFunction | None = None, truncation: str | None = None, max_output_tokens: int | None = None, metadata: dict[str, str] | None = None, parallel_tool_calls: bool | None = None, tools: list[mlflow.types.responses_helpers.Tool] | None = None, reasoning: mlflow.types.responses_helpers.ReasoningParams | None = None, store: bool | None = None, stream: bool | None = None, temperature: float | None = None, text: Optional[Any] = None, top_p: float | None = None, user: str | None = None, input: list[mlflow.types.responses_helpers.Message | mlflow.types.responses_helpers.OutputItem], custom_inputs: dict[str, typing.Any] | None = None, context: mlflow.types.agent.ChatContext | None = None)[source]

Request object for ResponsesAgent.

Parameters

input – List of simple role and content messages or output items. See examples at https://mlflow.org/docs/latest/genai/flavors/responses-agent-intro#testing-out-your-agent and https://mlflow.org/docs/latest/genai/flavors/responses-agent-intro#creating-agent-output.
custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional context to the model. The dictionary values must be JSON-serializable. Optional defaults to None
context (mlflow.types.agent.ChatContext) – The context to be used in the chat endpoint. Includes conversation_id and user_id. Optional defaults to None

class mlflow.types.responses.ResponsesAgentResponse(*, tool_choice: str | mlflow.types.responses_helpers.ToolChoiceFunction | None = None, truncation: str | None = None, id: str | None = None, created_at: float | None = None, error: mlflow.types.responses_helpers.ResponseError | None = None, incomplete_details: mlflow.types.responses_helpers.IncompleteDetails | None = None, instructions: str | None = None, metadata: dict[str, str] | None = None, model: str | None = None, object: str = 'response', output: list[mlflow.types.responses_helpers.OutputItem], parallel_tool_calls: bool | None = None, temperature: float | None = None, tools: list[mlflow.types.responses_helpers.Tool] | None = None, top_p: float | None = None, max_output_tokens: int | None = None, previous_response_id: str | None = None, reasoning: mlflow.types.responses_helpers.ReasoningParams | None = None, status: str | None = None, text: Optional[Any] = None, usage: mlflow.types.responses_helpers.ResponseUsage | None = None, user: str | None = None, custom_outputs: dict[str, typing.Any] | None = None)[source]

Response object for ResponsesAgent.

Parameters

output – List of output items. See examples at https://mlflow.org/docs/latest/genai/flavors/responses-agent-intro#creating-agent-output.
reasoning – Reasoning parameters
usage – Usage information
custom_outputs (Dict[str, Any]) – An optional param to provide arbitrary additional context from the model. The dictionary values must be JSON-serializable. Optional, defaults to None

class mlflow.types.responses.ResponsesAgentStreamEvent(*, type: str, custom_outputs: dict[str, typing.Any] | None = None, **extra_data: Any)[source]

Stream event for ResponsesAgent. See examples at https://mlflow.org/docs/latest/genai/flavors/responses-agent-intro#streaming-agent-output

Parameters

type (str) – Type of the stream event
custom_outputs (Dict[str, Any]) – An optional param to provide arbitrary additional context from the model. The dictionary values must be JSON-serializable. Optional, defaults to None

class mlflow.types.responses_helpers.Annotation(*, type: str, **extra_data: Any)[source]

class mlflow.types.responses_helpers.AnnotationFileCitation(*, file_id: str, index: int, type: str = 'file_citation')[source]

class mlflow.types.responses_helpers.AnnotationFilePath(*, file_id: str, index: int, type: str = 'file_path')[source]

class mlflow.types.responses_helpers.AnnotationURLCitation(*, end_index: int | None = None, start_index: int | None = None, title: str, type: str = 'url_citation', url: str)[source]

class mlflow.types.responses_helpers.BaseRequestPayload(*, tool_choice: str | mlflow.types.responses_helpers.ToolChoiceFunction | None = None, truncation: str | None = None, max_output_tokens: int | None = None, metadata: dict[str, str] | None = None, parallel_tool_calls: bool | None = None, tools: list[mlflow.types.responses_helpers.Tool] | None = None, reasoning: mlflow.types.responses_helpers.ReasoningParams | None = None, store: bool | None = None, stream: bool | None = None, temperature: float | None = None, text: Optional[Any] = None, top_p: float | None = None, user: str | None = None)[source]

class mlflow.types.responses_helpers.Content(*, type: str, **extra_data: Any)[source]

class mlflow.types.responses_helpers.FunctionCallOutput(*, status: str | None = None, call_id: str, output: str, type: str = 'function_call_output')[source]

class mlflow.types.responses_helpers.FunctionTool(*, name: str, parameters: dict[str, typing.Any], strict: bool | None = None, type: str = 'function', description: str | None = None)[source]

class mlflow.types.responses_helpers.IncompleteDetails(*, reason: str | None = None)[source]

class mlflow.types.responses_helpers.InputTokensDetails(*, cached_tokens: int)[source]

class mlflow.types.responses_helpers.Message(*, status: str | None = None, content: str | list[mlflow.types.responses_helpers.ResponseInputTextParam | dict[str, typing.Any]], role: str, type: str = 'message')[source]

class mlflow.types.responses_helpers.OutputItem(*, type: str, **extra_data: Any)[source]

class mlflow.types.responses_helpers.OutputTokensDetails(*, reasoning_tokens: int)[source]

class mlflow.types.responses_helpers.ReasoningParams(*, effort: str | None = None, generate_summary: str | None = None)[source]

class mlflow.types.responses_helpers.Response(*, tool_choice: str | mlflow.types.responses_helpers.ToolChoiceFunction | None = None, truncation: str | None = None, id: str | None = None, created_at: float | None = None, error: mlflow.types.responses_helpers.ResponseError | None = None, incomplete_details: mlflow.types.responses_helpers.IncompleteDetails | None = None, instructions: str | None = None, metadata: dict[str, str] | None = None, model: str | None = None, object: str = 'response', output: list[mlflow.types.responses_helpers.OutputItem], parallel_tool_calls: bool | None = None, temperature: float | None = None, tools: list[mlflow.types.responses_helpers.Tool] | None = None, top_p: float | None = None, max_output_tokens: int | None = None, previous_response_id: str | None = None, reasoning: mlflow.types.responses_helpers.ReasoningParams | None = None, status: str | None = None, text: Optional[Any] = None, usage: mlflow.types.responses_helpers.ResponseUsage | None = None, user: str | None = None)[source]

property output_text: str

Convenience property that aggregates all output_text items from the output list.

If no output_text content blocks exist, then an empty string is returned.

class mlflow.types.responses_helpers.ResponseCompletedEvent(*, response: mlflow.types.responses_helpers.Response, type: str = 'response.completed')[source]

class mlflow.types.responses_helpers.ResponseError(*, code: str | None = None, message: str)[source]

class mlflow.types.responses_helpers.ResponseErrorEvent(*, code: str | None = None, message: str, param: str | None = None, type: str = 'error')[source]

class mlflow.types.responses_helpers.ResponseFunctionToolCall(*, status: str | None = None, arguments: str, call_id: str, name: str, type: str = 'function_call', id: str | None = None)[source]

class mlflow.types.responses_helpers.ResponseInputTextParam(*, text: str, type: str = 'input_text')[source]

class mlflow.types.responses_helpers.ResponseOutputItemDoneEvent(*, item: mlflow.types.responses_helpers.OutputItem, output_index: int | None = None, type: str = 'response.output_item.done')[source]

class mlflow.types.responses_helpers.ResponseOutputMessage(*, status: str | None = None, id: str, content: list[mlflow.types.responses_helpers.Content], role: str = 'assistant', type: str = 'message')[source]

class mlflow.types.responses_helpers.ResponseOutputRefusal(*, refusal: str, type: str = 'refusal')[source]

class mlflow.types.responses_helpers.ResponseOutputText(*, annotations: list[mlflow.types.responses_helpers.Annotation] | None = None, text: str, type: str = 'output_text')[source]

class mlflow.types.responses_helpers.ResponseReasoningItem(*, status: str | None = None, id: str, summary: list[mlflow.types.responses_helpers.Summary], type: str = 'reasoning')[source]

class mlflow.types.responses_helpers.ResponseTextAnnotationDeltaEvent(*, annotation: mlflow.types.responses_helpers.Annotation, annotation_index: int, content_index: int | None = None, item_id: str, output_index: int | None = None, type: str = 'response.output_text.annotation.added')[source]

class mlflow.types.responses_helpers.ResponseTextDeltaEvent(*, content_index: int | None = None, delta: str, item_id: str, output_index: int | None = None, type: str = 'response.output_text.delta')[source]

class mlflow.types.responses_helpers.ResponseUsage(*, input_tokens: int, input_tokens_details: mlflow.types.responses_helpers.InputTokensDetails, output_tokens: int, output_tokens_details: mlflow.types.responses_helpers.OutputTokensDetails, total_tokens: int)[source]

class mlflow.types.responses_helpers.Status(*, status: str | None = None)[source]

class mlflow.types.responses_helpers.Summary(*, text: str, type: str = 'summary_text')[source]

class mlflow.types.responses_helpers.Tool(*, type: str, **extra_data: Any)[source]

class mlflow.types.responses_helpers.ToolChoice(*, tool_choice: str | mlflow.types.responses_helpers.ToolChoiceFunction | None = None)[source]

class mlflow.types.responses_helpers.ToolChoiceFunction(*, name: str, type: str = 'function')[source]

class mlflow.types.responses_helpers.Truncation(*, truncation: str | None = None)[source]

class mlflow.types.agent.ChatAgentChunk(*, delta: mlflow.types.agent.ChatAgentMessage, finish_reason: str | None = None, custom_outputs: dict[str, typing.Any] | None = None, usage: mlflow.types.chat.ChatUsage | None = None)[source]

Represents a single chunk within the streaming response of a ChatAgent.

Parameters

delta – A ChatAgentMessage representing a single chunk within the list of messages comprising agent output. In particular, clients should assume the content field within this ChatAgentMessage contains only part of the message content, and aggregate message content by ID across chunks. More info can be found in the docstring of ChatAgent.predict_stream.
finish_reason (str) – The reason why generation stopped. Optional defaults to None
custom_outputs (Dict[str, Any]) – An optional param to provide arbitrary additional context from the model. The dictionary values must be JSON-serializable. Optional, defaults to None
usage (mlflow.types.chat.ChatUsage) – The token usage of the request Optional, defaults to None

A message in a ChatAgent model request or response.

Parameters

role (str) – The role of the entity that sent the message (e.g. "user", "system", "assistant", "tool").
content (str) – The content of the message. Optional Can be None if tool_calls is provided.
name (str) – The name of the entity that sent the message. Optional defaults to None
id (str) – The ID of the message. Required when it is either part of a ChatAgentResponse or ChatAgentChunk.
tool_calls (List[mlflow.types.chat.ToolCall]) – A list of tool calls made by the model. Optional defaults to None
tool_call_id (str) – The ID of the tool call that this message is a response to. Optional defaults to None
attachments (Dict[str, str]) – A dictionary of attachments. Optional defaults to None

class mlflow.types.agent.ChatAgentRequest(*, messages: list[mlflow.types.agent.ChatAgentMessage], context: mlflow.types.agent.ChatContext | None = None, custom_inputs: dict[str, typing.Any] | None = None, stream: bool | None = False)[source]

Format of a ChatAgent interface request.

Parameters

messages – A list of ChatAgentMessage that will be passed to the model.
context (ChatContext) – The context to be used in the chat endpoint. Includes conversation_id and user_id. Optional defaults to None
custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional context to the model. The dictionary values must be JSON-serializable. Optional defaults to None
stream (bool) – Whether to stream back responses as they are generated. Optional, defaults to False

class mlflow.types.agent.ChatAgentResponse(*, messages: list[mlflow.types.agent.ChatAgentMessage], finish_reason: str | None = None, custom_outputs: dict[str, typing.Any] | None = None, usage: mlflow.types.chat.ChatUsage | None = None)[source]

Represents the response of a ChatAgent.

Parameters

messages – A list of ChatAgentMessage that are returned from the model.
finish_reason (str) – The reason why generation stopped. Optional defaults to None
custom_outputs (Dict[str, Any]) – An optional param to provide arbitrary additional context from the model. The dictionary values must be JSON-serializable. Optional, defaults to None
usage (mlflow.types.chat.ChatUsage) – The token usage of the request Optional, defaults to None

class mlflow.types.agent.ChatContext(*, conversation_id: str | None = None, user_id: str | None = None)[source]

Context to be used in a ChatAgent endpoint.

Parameters

conversation_id (str) – The ID of the conversation. Optional defaults to None
user_id (str) – The ID of the user. Optional defaults to None

class mlflow.types.llm.ChatChoice(message: mlflow.types.llm.ChatMessage, index: int = 0, finish_reason: str = 'stop', logprobs: Optional[mlflow.types.llm.ChatChoiceLogProbs] = None)[source]

A single chat response generated by the model. ref: https://platform.openai.com/docs/api-reference/chat/object

Parameters

message (ChatMessage) – The message that was generated.
index (int) – The index of the response in the list of responses. Defaults to 0
finish_reason (str) – The reason why generation stopped. Optional, defaults to "stop"
logprobs (ChatChoiceLogProbs) – Log probability information for the choice. Optional, defaults to None

class mlflow.types.llm.ChatChoiceDelta(role: str | None = 'assistant', content: Optional[str] = None, refusal: Optional[str] = None, name: Optional[str] = None, tool_calls: Optional[list[mlflow.types.llm.ToolCall]] = None)[source]

A streaming message delta in a chat response.

Parameters

role (str) – The role of the entity that sent the message (e.g. "user", "system", "assistant", "tool"). Optional defaults to "assistant" This is optional because OpenAI clients can explicitly return None for the role
content (str) – The content of the new token being streamed Optional Can be None on the last delta chunk or if refusal or tool_calls are provided
refusal (str) – The refusal message content. Optional Supplied if a refusal response is provided.
name (str) – The name of the entity that sent the message. Optional.
tool_calls (List[ToolCall]) – A list of tool calls made by the model. Optional defaults to None

class mlflow.types.llm.ChatChoiceLogProbs(content: Optional[list[mlflow.types.llm.TokenLogProb]] = None)[source]

Log probability information for the choice.

Parameters: content – A list of message content tokens with log probability information.

class mlflow.types.llm.ChatChunkChoice(delta: mlflow.types.llm.ChatChoiceDelta, index: int = 0, finish_reason: Optional[str] = None, logprobs: Optional[mlflow.types.llm.ChatChoiceLogProbs] = None)[source]

A single chat response chunk generated by the model. ref: https://platform.openai.com/docs/api-reference/chat/streaming

Parameters

index (int) – The index of the response in the list of responses. defaults to 0
delta (ChatChoiceDelta) – The streaming chunk message that was generated.
finish_reason (str) – The reason why generation stopped. Optional, defaults to None
logprobs (ChatChoiceLogProbs) – Log probability information for the choice. Optional, defaults to None

class mlflow.types.llm.ChatCompletionChunk(choices: list[mlflow.types.llm.ChatChunkChoice], usage: Optional[mlflow.types.llm.TokenUsageStats] = None, id: Optional[str] = None, model: Optional[str] = None, object: str = 'chat.completion.chunk', created: int = <factory>, custom_outputs: Optional[dict[str, typing.Any]] = None)[source]

The streaming chunk returned by the chat endpoint. ref: https://platform.openai.com/docs/api-reference/chat/streaming

Parameters

choices (List[ChatChunkChoice]) – A list of ChatChunkChoice objects containing the generated chunk of a streaming response
usage (TokenUsageStats) – An object describing the tokens used by the request. Optional, defaults to None.
id (str) – The ID of the response. Optional, defaults to None
model (str) – The name of the model used. Optional, defaults to None
object (str) – The object type. Defaults to ‘chat.completion.chunk’
created (int) – The time the response was created. Optional, defaults to the current time.
custom_outputs (Dict[str, Any]) – An field that can contain arbitrary additional context. The dictionary values must be JSON-serializable. Optional, defaults to None

class mlflow.types.llm.ChatCompletionRequest(temperature: float = 1.0, max_tokens: Optional[int] = None, stop: Optional[list[str]] = None, n: int = 1, stream: bool = False, top_p: Optional[float] = None, top_k: Optional[int] = None, frequency_penalty: Optional[float] = None, presence_penalty: Optional[float] = None, custom_inputs: Optional[dict[str, typing.Any]] = None, tools: Optional[list[mlflow.types.llm.ToolDefinition]] = None, messages: list[mlflow.types.llm.ChatMessage] = <factory>)[source]

Format of the request object expected by the chat endpoint.

Parameters

messages (List[ChatMessage]) – A list of ChatMessage that will be passed to the model. Optional, defaults to empty list ([])
temperature (float) – A param used to control randomness and creativity during inference. Optional, defaults to 1.0
max_tokens (int) – The maximum number of new tokens to generate. Optional, defaults to None (unlimited)
stop (List[str]) – A list of tokens at which to stop generation. Optional, defaults to None
n (int) – The number of responses to generate. Optional, defaults to 1
stream (bool) – Whether to stream back responses as they are generated. Optional, defaults to False
top_p (float) – An optional param to control sampling with temperature, the model considers the results of the tokens with top_p probability mass. E.g., 0.1 means only the tokens comprising the top 10% probability mass are considered.
top_k (int) – An optional param for reducing the vocabulary size to top k tokens (sorted in descending order by their probabilities).
frequency_penalty – (float): An optional param of positive or negative value, positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
presence_penalty – (float): An optional param of positive or negative value, positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional context to the model. The dictionary values must be JSON-serializable.
tools (List[ToolDefinition]) – An optional list of tools that can be called by the model.

Warning

In an upcoming MLflow release, default values for temperature, n and stream will be removed. Please provide these values explicitly in your code if needed.

class mlflow.types.llm.ChatCompletionResponse(choices: list[mlflow.types.llm.ChatChoice], usage: Optional[mlflow.types.llm.TokenUsageStats] = None, id: Optional[str] = None, model: Optional[str] = None, object: str = 'chat.completion', created: int = <factory>, custom_outputs: Optional[dict[str, typing.Any]] = None)[source]

The full response object returned by the chat endpoint.

Parameters

choices (List[ChatChoice]) – A list of ChatChoice objects containing the generated responses
usage (TokenUsageStats) – An object describing the tokens used by the request. Optional, defaults to None.
id (str) – The ID of the response. Optional, defaults to None
model (str) – The name of the model used. Optional, defaults to None
object (str) – The object type. Defaults to ‘chat.completion’
created (int) – The time the response was created. Optional, defaults to the current time.
custom_outputs (Dict[str, Any]) – An field that can contain arbitrary additional context. The dictionary values must be JSON-serializable. Optional, defaults to None

class mlflow.types.llm.ChatMessage(role: str, content: Optional[str] = None, refusal: Optional[str] = None, name: Optional[str] = None, tool_calls: Optional[list[mlflow.types.llm.ToolCall]] = None, tool_call_id: Optional[str] = None)[source]

A message in a chat request or response.

Parameters

role (str) – The role of the entity that sent the message (e.g. "user", "system", "assistant", "tool").
content (str) – The content of the message. Optional Can be None if refusal or tool_calls are provided.
refusal (str) – The refusal message content. Optional Supplied if a refusal response is provided.
name (str) – The name of the entity that sent the message. Optional.
tool_calls (List[ToolCall]) – A list of tool calls made by the model. Optional defaults to None
tool_call_id (str) – The ID of the tool call that this message is a response to. Optional defaults to None

class mlflow.types.llm.ChatParams(temperature: float = 1.0, max_tokens: Optional[int] = None, stop: Optional[list[str]] = None, n: int = 1, stream: bool = False, top_p: Optional[float] = None, top_k: Optional[int] = None, frequency_penalty: Optional[float] = None, presence_penalty: Optional[float] = None, custom_inputs: Optional[dict[str, typing.Any]] = None, tools: Optional[list[mlflow.types.llm.ToolDefinition]] = None)[source]

Common parameters used for chat inference

Parameters

temperature (float) – A param used to control randomness and creativity during inference. Optional, defaults to 1.0
max_tokens (int) – The maximum number of new tokens to generate. Optional, defaults to None (unlimited)
stop (List[str]) – A list of tokens at which to stop generation. Optional, defaults to None
n (int) – The number of responses to generate. Optional, defaults to 1
stream (bool) – Whether to stream back responses as they are generated. Optional, defaults to False
top_p (float) – An optional param to control sampling with temperature, the model considers the results of the tokens with top_p probability mass. E.g., 0.1 means only the tokens comprising the top 10% probability mass are considered.
top_k (int) – An optional param for reducing the vocabulary size to top k tokens (sorted in descending order by their probabilities).
frequency_penalty – (float): An optional param of positive or negative value, positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
presence_penalty – (float): An optional param of positive or negative value, positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
custom_inputs (Dict[str, Any]) – An optional param to provide arbitrary additional context to the model. The dictionary values must be JSON-serializable.
tools (List[ToolDefinition]) – An optional list of tools that can be called by the model.

Warning

In an upcoming MLflow release, default values for temperature, n and stream will be removed. Please provide these values explicitly in your code if needed.

classmethod keys() → set[str][source]: Return the keys of the dataclass

class mlflow.types.llm.FunctionToolCallArguments(name: str, arguments: str)[source]

The arguments of a function tool call made by the model.

Parameters

arguments (str) – A JSON string of arguments that should be passed to the tool.
name (str) – The name of the tool that is being called.

class mlflow.types.llm.FunctionToolDefinition(name: str, description: Optional[str] = None, parameters: Optional[mlflow.types.llm.ToolParamsSchema] = None, strict: bool = False)[source]

Definition for function tools (currently the only supported type of tool).

Parameters

name (str) – The name of the tool.
description (str) – A description of what the tool does, and how it should be used. Optional, defaults to None
parameters – A mapping of parameter names to their definitions. If not provided, this defines a function without parameters. Optional, defaults to None
strict (bool) – A flag that represents whether or not the model should strictly follow the schema provided.

to_tool_definition()[source]: Convenience function for wrapping this in a ToolDefinition

class mlflow.types.llm.ParamProperty(type: Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null'], description: Optional[str] = None, enum: Optional[list[str]] = None, items: Optional[mlflow.types.llm.ParamType] = None)[source]

A single parameter within a function definition.

Parameters

type (str) – The type of the parameter. Possible values are “string”, “number”, “integer”, “object”, “array”, “boolean”, or “null”, conforming to the JSON Schema specification.
description (str) – A description of the parameter. Optional, defaults to None
enum (List[str]) – Used to constrain the possible values for the parameter. Optional, defaults to None
items (ParamProperty) – If the param is of array type, this field can be used to specify the type of its items. Optional, defaults to None

class mlflow.types.llm.ParamType(type: Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null'])[source]

class mlflow.types.llm.TokenLogProb(token: str, logprob: float, top_logprobs: list[mlflow.types.llm.TopTokenLogProb], bytes: Optional[list[int]] = None)[source]

Message content token with log probability information.

Parameters

token – The token.
logprob – The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
bytes – A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.
top_logprobs – List of the most likely tokens and their log probability, at this token position. In rare cases, there may be fewer than the number of requested top_logprobs returned.

class mlflow.types.llm.TokenUsageStats(prompt_tokens: Optional[int] = None, completion_tokens: Optional[int] = None, total_tokens: Optional[int] = None)[source]

Stats about the number of tokens used during inference.

Parameters

prompt_tokens (int) – The number of tokens in the prompt. Optional, defaults to None
completion_tokens (int) – The number of tokens in the generated completion. Optional, defaults to None
total_tokens (int) – The total number of tokens used. Optional, defaults to None

class mlflow.types.llm.ToolCall(function: mlflow.types.llm.FunctionToolCallArguments, id: str = <factory>, type: str = 'function')[source]

A tool call made by the model.

Parameters

function (FunctionToolCallArguments) – The arguments of the function tool call.
id (str) – The ID of the tool call. Defaults to a random UUID.
type (str) – The type of the object. Defaults to “function”.

class mlflow.types.llm.ToolDefinition(function: mlflow.types.llm.FunctionToolDefinition, type: Literal['function'] = 'function')[source]

Definition for tools that can be called by the model.

Parameters

function (FunctionToolDefinition) – The definition of a function tool.
type (str) – The type of the tool. Currently only “function” is supported.

class mlflow.types.llm.ToolParamsSchema(properties: dict[str, mlflow.types.llm.ParamProperty], type: Literal['object'] = 'object', required: Optional[list[str]] = None, additionalProperties: Optional[bool] = None)[source]

A tool parameter definition.

Parameters

properties (Dict[str, ParamProperty]) – A mapping of parameter names to their definitions.
type (str) – The type of the parameter. Currently only “object” is supported.
required (List[str]) – A list of required parameter names. Optional, defaults to None
additionalProperties (bool) – Whether additional properties are allowed in the object. Optional, defaults to None

class mlflow.types.llm.TopTokenLogProb(token: str, logprob: float, bytes: Optional[list[int]] = None)[source]

Token and its log probability.

Parameters

token – The token.
logprob – The log probability of this token, if it is within the top 20 most likely tokens. Otherwise, the value -9999.0 is used to signify that the token is very unlikely.
bytes – A list of integers representing the UTF-8 bytes representation of the token. Useful in instances where characters are represented by multiple tokens and their byte representations must be combined to generate the correct text representation. Can be null if there is no bytes representation for the token.

class mlflow.types.chat.AudioContentPart(*, type: Literal['input_audio'], input_audio: mlflow.types.chat.InputAudio)[source]

class mlflow.types.chat.BaseModel[source]

class mlflow.types.chat.BaseRequestPayload(*, temperature: float = 0.0, n: int = 1, stop: list[str] | None = None, max_tokens: int | None = None, stream: bool | None = None, stream_options: dict[str, typing.Any] | None = None, model: str | None = None)[source]: Common parameters used for chat completions and completion endpoints.

class mlflow.types.chat.ChatChoice(*, index: int, message: mlflow.types.chat.ChatMessage, finish_reason: str | None = None)[source]

class mlflow.types.chat.ChatChoiceDelta(*, role: str | None = None, content: str | None = None)[source]

class mlflow.types.chat.ChatChunkChoice(*, index: int, finish_reason: str | None = None, delta: mlflow.types.chat.ChatChoiceDelta)[source]

class mlflow.types.chat.ChatCompletionChunk(*, id: str | None = None, object: str = 'chat.completion.chunk', created: int, model: str, choices: list[mlflow.types.chat.ChatChunkChoice])[source]: A chunk of a chat completion stream response.

class mlflow.types.chat.ChatCompletionRequest(*, temperature: float = 0.0, n: int = 1, stop: list[str] | None = None, max_tokens: int | None = None, stream: bool | None = None, stream_options: dict[str, typing.Any] | None = None, model: str | None = None, messages: list[mlflow.types.chat.ChatMessage], tools: list[mlflow.types.chat.ChatTool] | None = None)[source]

A request to the chat completion API.

Must be compatible with OpenAI’s Chat Completion API. https://platform.openai.com/docs/api-reference/chat

class mlflow.types.chat.ChatCompletionResponse(*, id: str | None = None, object: str = 'chat.completion', created: int, model: str, choices: list[mlflow.types.chat.ChatChoice], usage: mlflow.types.chat.ChatUsage)[source]

A response from the chat completion API.

Must be compatible with OpenAI’s Chat Completion API. https://platform.openai.com/docs/api-reference/chat

class mlflow.types.chat.ChatMessage(*, role: str, content: Optional[str | list[typing.Annotated[mlflow.types.chat.TextContentPart | mlflow.types.chat.ImageContentPart | mlflow.types.chat.AudioContentPart, FieldInfo(annotation=NoneType, required=True, discriminator='type')]]] = None, tool_calls: list[mlflow.types.chat.ToolCall] | None = None, refusal: str | None = None, tool_call_id: str | None = None)[source]

A chat request. content can be a string, or an array of content parts.

A content part is one of the following:

TextContentPart
ImageContentPart
AudioContentPart

class mlflow.types.chat.ChatTool(*, type: Literal['function'], function: mlflow.types.chat.FunctionToolDefinition | None = None)[source]

A tool definition passed to the chat completion API.

Ref: https://platform.openai.com/docs/guides/function-calling

class mlflow.types.chat.ChatUsage(*, prompt_tokens: int | None = None, completion_tokens: int | None = None, total_tokens: int | None = None)[source]

class mlflow.types.chat.Function(*, name: str, arguments: str)[source]

class mlflow.types.chat.FunctionParams(*, properties: dict[str, mlflow.types.chat.ParamProperty], type: Literal['object'] = 'object', required: list[str] | None = None, additionalProperties: bool | None = None)[source]

class mlflow.types.chat.FunctionToolDefinition(*, name: str, description: str | None = None, parameters: mlflow.types.chat.FunctionParams | None = None, strict: bool | None = None)[source]

class mlflow.types.chat.ImageContentPart(*, type: Literal['image_url'], image_url: mlflow.types.chat.ImageUrl)[source]

class mlflow.types.chat.ImageUrl(*, url: str, detail: Optional[Literal['auto', 'low', 'high']] = None)[source]

Represents an image URL.

url

Either a URL of an image or base64 encoded data. https://platform.openai.com/docs/guides/vision?lang=curl#uploading-base64-encoded-images

Type: str

detail

The level of resolution for the image when the model receives it. For example, when set to “low”, the model will see a image resized to 512x512 pixels, which consumes fewer tokens. In OpenAI, this is optional and defaults to “auto”. https://platform.openai.com/docs/guides/vision?lang=curl#low-or-high-fidelity-image-understanding

Type: Optional[Literal[‘auto’, ‘low’, ‘high’]]

class mlflow.types.chat.InputAudio(*, data: str, format: Literal['wav', 'mp3'])[source]

class mlflow.types.chat.ParamProperty(*, type: Optional[Union[Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null'], list[typing.Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null']]]] = None, description: str | None = None, enum: list[str | int | float | bool] | None = None, items: mlflow.types.chat.ParamType | None = None)[source]

OpenAI uses JSON Schema (https://json-schema.org/) for function parameters. See OpenAI function calling reference: https://platform.openai.com/docs/guides/function-calling?&api-mode=responses#defining-functions

JSON Schema enum supports any JSON type (str, int, float, bool, null, arrays, objects), but we restrict to basic scalar types for practical use cases and API safety.

class mlflow.types.chat.ParamType(*, type: Optional[Union[Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null'], list[typing.Literal['string', 'number', 'integer', 'object', 'array', 'boolean', 'null']]]] = None)[source]

class mlflow.types.chat.TextContentPart(*, type: Literal['text'], text: str)[source]

class mlflow.types.chat.ToolCall(*, id: str, type: str = 'function', function: mlflow.types.chat.Function)[source]

class mlflow.types.schema.AnyType[source]

to_dict()[source]: Dictionary representation of the object.

class mlflow.types.schema.Array(dtype: Union[mlflow.types.schema.Array, mlflow.types.schema.DataType, mlflow.types.schema.Map, mlflow.types.schema.Object, mlflow.types.schema.AnyType, str])[source]

Specification used to represent a json-convertible array.

property dtype: Array' | DataType | Object | 'Map' | 'AnyType: The array data type.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain type and items keys. Example: {“type”: “array”, “items”: “string”}

to_dict()[source]: Dictionary representation of the object.

class mlflow.types.schema.Map(value_type: Union[mlflow.types.schema.Array, mlflow.types.schema.DataType, mlflow.types.schema.Map, mlflow.types.schema.Object, mlflow.types.schema.AnyType, str])[source]

Specification used to represent a json-convertible map with string type keys.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain type and values keys. Example: {“type”: “map”, “values”: “string”}

to_dict()[source]: Dictionary representation of the object.

property value_type

class mlflow.types.schema.Object(properties: list[mlflow.types.schema.Property])[source]

Specification used to represent a json-convertible object.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain type and properties keys. Example: {“type”: “object”, “properties”: {“property_name”: {“type”: “string”}}}

property properties: list[mlflow.types.schema.Property]: The list of object properties

to_dict()[source]: Dictionary representation of the object.

class mlflow.types.schema.Property(name: str, dtype: Union[mlflow.types.schema.Array, mlflow.types.schema.DataType, mlflow.types.schema.Map, mlflow.types.schema.Object, mlflow.types.schema.AnyType, str], required: bool = True)[source]

Specification used to represent a json-convertible object property.

property dtype: DataType | 'Array' | 'Object' | 'Map': The property data type.

classmethod from_json_dict(**kwargs)[source]: Deserialize from a json loaded dictionary. The dictionary is expected to contain only one key as name, and the value should be a dictionary containing type and optional required keys. Example: {“property_name”: {“type”: “string”, “required”: True}}

property name: str: The property name.

property required: bool: Whether this property is required

to_dict()[source]: Dictionary representation of the object.