mlflow.types

The mlflow.types module defines data types and utilities that other MLflow components use to describe interfaces independently of other frameworks or languages.

class mlflow.types.ColSpec(type: mlflow.types.schema.DataType, name: Optional[str] = None)

Bases: object

Specification of name and type of a single column in a dataset.

property name

The column name, or None if the column is unnamed.

property type

The column data type.

class mlflow.types.DataType(value)

Bases: enum.Enum

MLflow data types.

binary = 7

Sequence of raw bytes.

boolean = 1

Logical data (True, False).

double = 5

64-bit floating-point numbers.

float = 4

32-bit floating-point numbers.

integer = 2

32-bit signed integer numbers.

long = 3

64-bit signed integer numbers.

string = 6

Text data.

to_numpy() → numpy.dtype

Get equivalent numpy data type.

to_pandas() → numpy.dtype

Get equivalent pandas data type.

class mlflow.types.Schema(inputs: List[Union[mlflow.types.schema.ColSpec, mlflow.types.schema.TensorSpec]])

Bases: object

Specification of a dataset.

Schema is represented as a list of ColSpec or TensorSpec. A combination of ColSpec and TensorSpec is not allowed.

The dataset represented by a schema can be named, with a unique, non-empty name for every input. In the case of ColSpec, the dataset columns can instead be unnamed, in which case each column is identified implicitly by its integer index in the list. A combination of named and unnamed inputs is not allowed.

as_spark_schema()

Convert to a Spark schema. If this schema is a single unnamed column, it is converted directly to the corresponding Spark data type; otherwise it is returned as a struct (missing column names are filled with an integer sequence). Unsupported by TensorSpec.

column_names() → List[Union[str, int]]

Warning

mlflow.types.Schema.column_names is deprecated since 1.14 and will be removed in a future release. Use mlflow.types.Schema.input_names() instead.

Get the list of column names, or a range of indices if the schema has no column names.

column_types() → List[mlflow.types.schema.DataType]

Warning

mlflow.types.Schema.column_types is deprecated since 1.14 and will be removed in a future release. Use mlflow.types.Schema.input_types() instead.

Get the types of the represented dataset. Unsupported by TensorSpec.

property columns

Warning

mlflow.types.Schema.columns is deprecated since 1.14 and will be removed in a future release. Use the mlflow.types.Schema.inputs property instead.

The list of columns that defines this schema.

classmethod from_json(json_str: str)

Deserialize from a JSON string.

has_column_names() → bool

Warning

mlflow.types.Schema.has_column_names is deprecated since 1.14 and will be removed in a future release. Use mlflow.types.Schema.has_input_names() instead.

Return true iff this schema declares column names, false otherwise.

has_input_names() → bool

Return true iff this schema declares names, false otherwise.

input_names() → List[Union[str, int]]

Get the list of input names, or a range of indices if the schema has no names.

input_types() → List[Union[mlflow.types.schema.DataType, numpy.dtype]]

Get types of the represented dataset.

property inputs

Representation of a dataset that defines this schema.

is_tensor_spec() → bool

Return true iff this schema is specified using TensorSpec.

numpy_types() → List[numpy.dtype]

Convenience shortcut to get the data types as numpy types.

pandas_types() → List[numpy.dtype]

Convenience shortcut to get the data types as pandas types. Unsupported by TensorSpec.

to_dict() → List[Dict[str, Any]]

Serialize into a JSON-serializable list of dictionaries, one per input.

to_json() → str

Serialize into a JSON string.

class mlflow.types.TensorSpec(type: numpy.dtype, shape: Union[tuple, list], name: Optional[str] = None)

Bases: object

Specification used to represent a dataset stored as a Tensor.

classmethod from_json_dict(**kwargs)

Deserialize from a JSON-loaded dictionary. The dictionary is expected to contain type and tensor-spec keys.

property name

The tensor name or None if the tensor is unnamed.

property shape

The tensor shape.

property type

The numpy data type of the tensor elements, as a numpy.dtype. See https://numpy.org/devdocs/reference/generated/numpy.dtype.html#numpy.dtype for details.