MLflow Models

An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools—for example, real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different “flavors” that can be understood by different downstream tools.

Table of Contents

Storage Format
Model Signature And Input Example
Model API
Built-In Model Flavors
Model Customization
Built-In Deployment Tools
Deployment to Custom Targets
Community Model Flavors

Storage Format

Each MLflow Model is a directory containing arbitrary files, together with an MLmodel file in the root of the directory that can define multiple flavors that the model can be viewed in.

Flavors are the key concept that makes MLflow Models powerful: they are a convention that deployment tools can use to understand the model, which makes it possible to write tools that work with models from any ML library without having to integrate each tool with each library. MLflow defines several “standard” flavors that all of its built-in deployment tools support, such as a “Python function” flavor that describes how to run the model as a Python function. However, libraries can also define and use other flavors. For example, MLflow’s mlflow.sklearn library allows loading models back as a scikit-learn Pipeline object for use in code that is aware of scikit-learn, or as a generic Python function for use in tools that just need to apply the model (for example, the mlflow sagemaker tool for deploying models to Amazon SageMaker).

All of the flavors that a particular model supports are defined in its MLmodel file in YAML format. For example, mlflow.sklearn outputs models as follows:

# Directory written by mlflow.sklearn.save_model(model, "my_model")
my_model/
├── MLmodel
├── model.pkl
├── conda.yaml
└── requirements.txt

And its MLmodel file describes two flavors:

time_created: 2018-05-25T17:28:53.35

flavors:
  sklearn:
    sklearn_version: 0.19.1
    pickled_model: model.pkl
  python_function:
    loader_module: mlflow.sklearn

This model can then be used with any tool that supports either the sklearn or python_function model flavor. For example, the mlflow models serve command can serve a model with the python_function or the crate (R Function) flavor:

mlflow models serve -m my_model

In addition, the mlflow sagemaker command-line tool can package and deploy models to AWS SageMaker as long as they support the python_function flavor:

mlflow sagemaker deploy -m my_model [other options]

Fields in the MLmodel Format

Apart from a flavors field listing the model flavors, the MLmodel YAML format can contain the following fields:

time_created: Date and time when the model was created, in UTC ISO 8601 format.
run_id: ID of the run that created the model, if the model was saved using MLflow Tracking.
signature: model signature in JSON format.
input_example: reference to an artifact with input example.
databricks_runtime: Databricks runtime version and type, if the model was trained in a Databricks notebook or job.

Additional Logged Files

For environment recreation, we automatically log conda.yaml and requirements.txt files whenever a model is logged. These files can then be used to reinstall dependencies using either conda or pip.

conda.yaml: When saving a model, MLflow provides the option to pass in a conda environment parameter that can contain dependencies used by the model. If no conda environment is provided, a default environment is created based on the flavor of the model. This conda environment is then saved in conda.yaml.
requirements.txt: The requirements file is created from the pip portion of the conda.yaml environment specification. Additional pip dependencies can be added to requirements.txt by including them as a pip dependency in a conda environment and logging the model with the environment.

The following shows an example of saving a model with a manually specified conda environment and the corresponding content of the generated conda.yaml and requirements.txt files.

conda_env = {
    'channels': ['conda-forge'],
    'dependencies': [
        'python=3.8.8',
        'pip'],
    'pip': [
        'mlflow',
        'scikit-learn==0.23.2',
        'cloudpickle==1.6.0'
    ],
    'name': 'mlflow-env'
}
mlflow.sklearn.log_model(model, "my_model", conda_env=conda_env)

The written conda.yaml file:

channels:
  - conda-forge
  dependencies:
  - python=3.8.8
  - pip
  - pip:
    - mlflow
    - scikit-learn==0.23.2
    - cloudpickle==1.6.0
name: mlflow-env

The written requirements.txt file:

mlflow
scikit-learn==0.23.2
cloudpickle==1.6.0

Model Signature And Input Example

When working with ML models you often need to know some basic functional properties of the model at hand, such as “What inputs does it expect?” and “What output does it produce?”. MLflow models can include the following additional metadata about model inputs and outputs that can be used by downstream tooling:

Model Signature - description of a model’s inputs and outputs.
Model Input Example - example of a valid model input.

Model Signature

The Model signature defines the schema of a model’s inputs and outputs. Model inputs and outputs can be either column-based or tensor-based. Column-based inputs and outputs can be described as a sequence of (optionally) named columns with type specified as one of the MLflow data types. Tensor-based inputs and outputs can be described as a sequence of (optionally) named tensors with type specified as one of the numpy data types. The signature is stored in JSON format in the MLmodel file, together with other model metadata.

Model signatures are recognized and enforced by standard MLflow model deployment tools. For example, the mlflow models serve tool, which deploys a model as a REST API, validates inputs based on the model’s signature.

Column-based Signature Example

All flavors support column-based signatures.

Each column-based input and output is represented by a type corresponding to one of MLflow data types and an optional name. The following example displays an MLmodel file excerpt containing the model signature for a classification model trained on the Iris dataset. The input has 4 named, numeric columns. The output is an unnamed integer specifying the predicted class.

signature:
    inputs: '[{"name": "sepal length (cm)", "type": "double"}, {"name": "sepal width
      (cm)", "type": "double"}, {"name": "petal length (cm)", "type": "double"}, {"name":
      "petal width (cm)", "type": "double"}]'
    outputs: '[{"type": "integer"}]'

Tensor-based Signature Example

Only DL flavors support tensor-based signatures (i.e TensorFlow, Keras, PyTorch, Onnx, and Gluon).

Each tensor-based input and output is represented by a dtype corresponding to one of numpy data types, shape and an optional name. When specifying the shape, -1 is used for axes that may be variable in size. The following example displays an MLmodel file excerpt containing the model signature for a classification model trained on the MNIST dataset. The input has one named tensor where input sample is an image represented by a 28 × 28 × 1 array of float32 numbers. The output is an unnamed tensor that has 10 units specifying the likelihood corresponding to each of the 10 classes. Note that the first dimension of the input and the output is the batch size and is thus set to -1 to allow for variable batch sizes.

signature:
    inputs: '[{"name": "images", "dtype": "uint8", "shape": [-1, 28, 28, 1]}]'
    outputs: '[{"shape": [-1, 10], "dtype": "float32"}]'

Signature Enforcement

Schema enforcement checks the provided input against the model’s signature and raises an exception if the input is not compatible. This enforcement is applied in MLflow before calling the underlying model implementation. Note that this enforcement only applies when using MLflow model deployment tools or when loading models as python_function. In particular, it is not applied to models that are loaded in their native format (e.g. by calling mlflow.sklearn.load_model()).

Name Ordering Enforcement

The input names are checked against the model signature. If there are any missing inputs, MLflow will raise an exception. Extra inputs that were not declared in the signature will be ignored. If the input schema in the signature defines input names, input matching is done by name and the inputs are reordered to match the signature. If the input schema does not have input names, matching is done by position (i.e. MLflow will only check the number of inputs).

Input Type Enforcement

The input types are checked against the signature.

For models with column-based signatures (i.e DataFrame inputs), MLflow will perform safe type conversions if necessary. Generally, only conversions that are guaranteed to be lossless are allowed. For example, int -> long or int -> double conversions are ok, long -> double is not. If the types cannot be made compatible, MLflow will raise an error.

For models with tensor-based signatures, type checking is strict (i.e an exception will be thrown if the input type does not match the type specified by the schema).

Handling Integers With Missing Values

Integer data with missing values is typically represented as floats in Python. Therefore, data types of integer columns in Python can vary depending on the data sample. This type variance can cause schema enforcement errors at runtime since integer and float are not compatible types. For example, if your training data did not have any missing values for integer column c, its type will be integer. However, when you attempt to score a sample of the data that does include a missing value in column c, its type will be float. If your model signature specified c to have integer type, MLflow will raise an error since it can not convert float to int. Note that MLflow uses python to serve models and to deploy models to Spark, so this can affect most model deployments. The best way to avoid this problem is to declare integer columns as doubles (float64) whenever there can be missing values.

Handling Date and Timestamp

For datetime values, Python has precision built into the type. For example, datetime values with day precision have NumPy type datetime64[D], while values with nanosecond precision have type datetime64[ns]. Datetime precision is ignored for column-based model signature but is enforced for tensor-based signatures.

How To Log Models With Signatures

To include a signature with your model, pass signature object as an argument to the appropriate log_model call, e.g. sklearn.log_model(). The model signature object can be created by hand or inferred from datasets with valid model inputs (e.g. the training dataset with target column omitted) and valid model outputs (e.g. model predictions generated on the training dataset).

Column-based Signature Example

The following example demonstrates how to store a model signature for a simple classifier trained on the Iris dataset:

import pandas as pd
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
import mlflow
import mlflow.sklearn
from mlflow.models.signature import infer_signature

iris = datasets.load_iris()
iris_train = pd.DataFrame(iris.data, columns=iris.feature_names)
clf = RandomForestClassifier(max_depth=7, random_state=0)
clf.fit(iris_train, iris.target)
signature = infer_signature(iris_train, clf.predict(iris_train))
mlflow.sklearn.log_model(clf, "iris_rf", signature=signature)

The same signature can be created explicitly as follows:

from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, ColSpec

input_schema = Schema([
  ColSpec("double", "sepal length (cm)"),
  ColSpec("double", "sepal width (cm)"),
  ColSpec("double", "petal length (cm)"),
  ColSpec("double", "petal width (cm)"),
])
output_schema = Schema([ColSpec("long")])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

Tensor-based Signature Example

The following example demonstrates how to store a model signature for a simple classifier trained on the MNIST dataset:

from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from keras.optimizers import SGD
import mlflow
import mlflow.keras
from mlflow.models.signature import infer_signature

(train_X, train_Y), (test_X, test_Y) = mnist.load_data()
trainX = train_X.reshape((train_X.shape[0], 28, 28, 1))
testX = test_X.reshape((test_X.shape[0], 28, 28, 1))
trainY = to_categorical(train_Y)
testY = to_categorical(test_Y)

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(10, activation='softmax'))
opt = SGD(lr=0.01, momentum=0.9)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY))

signature = infer_signature(testX, model.predict(testX))
mlflow.keras.log_model(model, "mnist_cnn", signature=signature)

The same signature can be created explicitly as follows:

import numpy as np
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import Schema, TensorSpec

input_schema = Schema([
  TensorSpec(np.dtype(np.uint8), (-1, 28, 28, 1)),
])
output_schema = Schema([TensorSpec(np.dtype(np.float32), (-1, 10))])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

Model Input Example

Similar to model signatures, model inputs can be column-based (i.e DataFrames) or tensor-based (i.e numpy.ndarrays). A model input example provides an instance of a valid model input. Input examples are stored with the model as separate artifacts and are referenced in the the MLmodel file.

To include an input example with your model, add it to the appropriate log_model call, e.g. sklearn.log_model().

How To Log Model With Column-based Example

For models accepting column-based inputs, an example can be a single record or a batch of records. The sample input can be passed in as a Pandas DataFrame, list or dictionary. The given example will be converted to a Pandas DataFrame and then serialized to json using the Pandas split-oriented format. Bytes are base64-encoded. The following example demonstrates how you can log a column-based input example with your model:

input_example = {
  "sepal length (cm)": 5.1,
  "sepal width (cm)": 3.5,
  "petal length (cm)": 1.4,
  "petal width (cm)": 0.2
}
mlflow.sklearn.log_model(..., input_example=input_example)

How To Log Model With Tensor-based Example

For models accepting tensor-based inputs, an example must be a batch of inputs. By default, the axis 0 is the batch axis unless specified otherwise in the model signature. The sample input can be passed in as a numpy ndarray or a dictionary mapping a string to a numpy array. The following example demonstrates how you can log a tensor-based input example with your model:

# each input has shape (4, 4)
input_example = np.array([
   [[  0,   0,   0,   0],
    [  0, 134,  25,  56],
    [253, 242, 195,   6],
    [  0,  93,  82,  82]],
   [[  0,  23,  46,   0],
    [ 33,  13,  36, 166],
    [ 76,  75,   0, 255],
    [ 33,  44,  11,  82]]
], dtype=np.uint8)
mlflow.keras.log_model(..., input_example=input_example)

Model API

You can save and load MLflow Models in multiple ways. First, MLflow includes integrations with several common libraries. For example, mlflow.sklearn contains save_model, log_model, and load_model functions for scikit-learn models. Second, you can use the mlflow.models.Model class to create and write models. This class has four key functions:

add_flavor to add a flavor to the model. Each flavor has a string name and a dictionary of key-value attributes, where the values can be any object that can be serialized to YAML.
save to save the model to a local directory.
log to log the model as an artifact in the current run using MLflow Tracking.
load to load a model from a local directory or from an artifact in a previous run.

Built-In Model Flavors

MLflow provides several standard flavors that might be useful in your applications. Specifically, many of its deployment tools support these flavors, so you can export your own model in one of these flavors to benefit from all these tools:

Python Function (python_function)
R Function (crate)
H₂O (h2o)
Keras (keras)
MLeap (mleap)
PyTorch (pytorch)
Scikit-learn (sklearn)
Spark MLlib (spark)
TensorFlow (tensorflow)
ONNX (onnx)
MXNet Gluon (gluon)
XGBoost (xgboost)
LightGBM (lightgbm)
CatBoost (catboost)
Spacy(spaCy)
Fastai(fastai)
Statsmodels (statsmodels)
Prophet (prophet)

Python Function (`python_function`)

The python_function model flavor serves as a default model interface for MLflow Python models. Any MLflow Python model is expected to be loadable as a python_function model. This enables other MLflow tools to work with any python model regardless of which persistence module or framework was used to produce the model. This interoperability is very powerful because it allows any Python model to be productionized in a variety of environments.

In addition, the python_function model flavor defines a generic filesystem model format for Python models and provides utilities for saving and loading models to and from this format. The format is self-contained in the sense that it includes all the information necessary to load and use a model. Dependencies are stored either directly with the model or referenced via conda environment. This model format allows other tools to integrate their models with MLflow.

How To Save Model As Python Function

Most python_function models are saved as part of other model flavors - for example, all mlflow built-in flavors include the python_function flavor in the exported models. In addition, the mlflow.pyfunc module defines functions for creating python_function models explicitly. This module also includes utilities for creating custom Python models, which is a convenient way of adding custom python code to ML models. For more information, see the custom Python models documentation.

How To Load And Score Python Function Models

You can load python_function models in Python by calling the mlflow.pyfunc.load_model() function. Note that the load_model function assumes that all dependencies are already available and will not check nor install any dependencies ( see model deployment section for tools to deploy models with automatic dependency management).

Once loaded, you can score the model by calling the predict method, which has the following signature:

predict(model_input: [pandas.DataFrame, numpy.ndarray, Dict[str, np.ndarray]]) -> [numpy.ndarray | pandas.(Series | DataFrame)]

All PyFunc models will support pandas.DataFrame as an input. In addition to pandas.DataFrame, DL PyFunc models will also support tensor inputs in the form of numpy.ndarrays. To verify whether a model flavor supports tensor inputs, please check the flavor’s documentation.

For models with a column-based schema, inputs are typically provided in the form of a pandas.DataFrame. If a dictionary mapping column name to values is provided as input for schemas with named columns or if a python List or a numpy.ndarray is provided as input for schemas with unnamed columns, MLflow will cast the input to a DataFrame. Schema enforcement and casting with respect to the expected data types is performed against the DataFrame.

For models with a tensor-based schema, inputs are typically provided in the form of a numpy.ndarray or a dictionary mapping the tensor name to its np.ndarray value. Schema enforcement will check the provided input’s shape and type against the shape and type specified in the model’s schema and throw an error if they do not match.

For models where no schema is defined, no changes to the model inputs and outputs are made. MLflow will propogate any errors raised by the model if the model does not accept the provided input type.

R Function (`crate`)

The crate model flavor defines a generic model format for representing an arbitrary R prediction function as an MLflow model using the crate function from the carrier package. The prediction function is expected to take a dataframe as input and produce a dataframe, a vector or a list with the predictions as output.

This flavor requires R to be installed in order to be used.

H₂O (`h2o`)

The h2o model flavor enables logging and loading H2O models.

The mlflow.h2o module defines save_model() and log_model() methods in python, and mlflow_save_model and mlflow_log_model in R for saving H2O models in MLflow Model format. These methods produce MLflow Models with the python_function flavor, allowing you to load them as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can be scored with only DataFrame input. When you load MLflow Models with the h2o flavor using mlflow.pyfunc.load_model(), the h2o.init() method is called. Therefore, the correct version of h2o(-py) must be installed in the loader’s environment. You can customize the arguments given to h2o.init() by modifying the init entry of the persisted H2O model’s YAML configuration file: model.h2o/h2o.yaml.

Finally, you can use the mlflow.h2o.load_model() method to load MLflow Models with the h2o flavor as H2O model objects.

For more information, see mlflow.h2o.

Keras (`keras`)

The keras model flavor enables logging and loading Keras models. It is available in both Python and R clients. The mlflow.keras module defines save_model() and log_model() functions that you can use to save Keras models in MLflow Model format in Python. Similarly, in R, you can save or log the model using mlflow_save_model and mlflow_log_model. These functions serialize Keras models as HDF5 files using the Keras library’s built-in model persistence functions. MLflow Models produced by these functions also contain the python_function flavor, allowing them to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can be scored with both DataFrame input and numpy array input. Finally, you can use the mlflow.keras.load_model() function in Python or mlflow_load_model function in R to load MLflow Models with the keras flavor as Keras Model objects.

For more information, see mlflow.keras.

MLeap (`mleap`)

The mleap model flavor supports saving Spark models in MLflow format using the MLeap persistence mechanism. MLeap is an inference-optimized format and execution engine for Spark models that does not depend on SparkContext to evaluate inputs.

You can save Spark models in MLflow format with the mleap flavor by specifying the sample_input argument of the mlflow.spark.save_model() or mlflow.spark.log_model() method (recommended). The mlflow.mleap module also defines save_model() and log_model() methods for saving MLeap models in MLflow format, but these methods do not include the python_function flavor in the models they produce. Similarly, mleap models can be saved in R with mlflow_save_model and loaded with mlflow_load_model, with mlflow_save_model requiring sample_input to be specified as a sample Spark dataframe containing input data to the model is required by MLeap for data schema inference.

A companion module for loading MLflow Models with the MLeap flavor is available in the mlflow/java package.

For more information, see mlflow.spark, mlflow.mleap, and the MLeap documentation.

PyTorch (`pytorch`)

The pytorch model flavor enables logging and loading PyTorch models.

The mlflow.pytorch module defines utilities for saving and loading MLflow Models with the pytorch flavor. You can use the mlflow.pytorch.save_model() and mlflow.pytorch.log_model() methods to save PyTorch models in MLflow format; both of these functions use the torch.save() method to serialize PyTorch models. Additionally, you can use the mlflow.pytorch.load_model() method to load MLflow Models with the pytorch flavor as PyTorch model objects. This loaded PyFunc model can be scored with both DataFrame input and numpy array input. Finally, models produced by mlflow.pytorch.save_model() and mlflow.pytorch.log_model() contain the python_function flavor, allowing you to load them as generic Python functions for inference via mlflow.pyfunc.load_model().

For more information, see mlflow.pytorch.

Scikit-learn (`sklearn`)

The sklearn model flavor provides an easy-to-use interface for saving and loading scikit-learn models. The mlflow.sklearn module defines save_model() and log_model() functions that save scikit-learn models in MLflow format, using either Python’s pickle module (Pickle) or CloudPickle for model serialization. These functions produce MLflow Models with the python_function flavor, allowing them to be loaded as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. Finally, you can use the mlflow.sklearn.load_model() method to load MLflow Models with the sklearn flavor as scikit-learn model objects.

For more information, see mlflow.sklearn.

Spark MLlib (`spark`)

The spark model flavor enables exporting Spark MLlib models as MLflow Models.

The mlflow.spark module defines save_model() and log_model() methods that save Spark MLlib pipelines in MLflow model format. MLflow Models produced by these functions contain the python_function flavor, allowing you to load them as generic Python functions via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. When a model with the spark flavor is loaded as a Python function via mlflow.pyfunc.load_model(), a new SparkContext is created for model inference; additionally, the function converts all Pandas DataFrame inputs to Spark DataFrames before scoring. While this initialization overhead and format translation latency is not ideal for high-performance use cases, it enables you to easily deploy any MLlib PipelineModel to any production environment supported by MLflow (SageMaker, AzureML, etc).

Finally, the mlflow.spark.load_model() method is used to load MLflow Models with the spark flavor as Spark MLlib pipelines.

For more information, see mlflow.spark.

TensorFlow (`tensorflow`)

The tensorflow model flavor allows serialized TensorFlow models in SavedModel format to be logged in MLflow format via the mlflow.tensorflow.save_model() and mlflow.tensorflow.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can be scored with both DataFrame input and numpy array input. Finally, you can use the mlflow.tensorflow.load_model() method to load MLflow Models with the tensorflow flavor as TensorFlow graphs.

For more information, see mlflow.tensorflow.

ONNX (`onnx`)

The onnx model flavor enables logging of ONNX models in MLflow format via the mlflow.onnx.save_model() and mlflow.onnx.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can be scored with both DataFrame input and numpy array input. The python_function representation of an MLflow ONNX model uses the ONNX Runtime execution engine for evaluation. Finally, you can use the mlflow.onnx.load_model() method to load MLflow Models with the onnx flavor in native ONNX format.

For more information, see mlflow.onnx and http://onnx.ai/.

MXNet Gluon (`gluon`)

The gluon model flavor enables logging of Gluon models in MLflow format via the mlflow.gluon.save_model() and mlflow.gluon.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can be scored with both DataFrame input and numpy array input. You can also use the mlflow.gluon.load_model() method to load MLflow Models with the gluon flavor in native Gluon format.

For more information, see mlflow.gluon.

XGBoost (`xgboost`)

The xgboost model flavor enables logging of XGBoost models in MLflow format via the mlflow.xgboost.save_model() and mlflow.xgboost.log_model() methods in python and mlflow_save_model and mlflow_log_model in R respectively. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.xgboost.load_model() method to load MLflow Models with the xgboost model flavor in native XGBoost format.

Note that the xgboost model flavor only supports an instance of xgboost.Booster, not models that implement the scikit-learn API.

For more information, see mlflow.xgboost.

LightGBM (`lightgbm`)

The lightgbm model flavor enables logging of LightGBM models in MLflow format via the mlflow.lightgbm.save_model() and mlflow.lightgbm.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.lightgbm.load_model() method to load MLflow Models with the lightgbm model flavor in native LightGBM format.

Note that the lightgbm model flavor only supports an instance of lightgbm.Booster, not models that implement the scikit-learn API.

For more information, see mlflow.lightgbm.

CatBoost (`catboost`)

The catboost model flavor enables logging of CatBoost models in MLflow format via the mlflow.catboost.save_model() and mlflow.catboost.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). You can also use the mlflow.catboost.load_model() method to load MLflow Models with the catboost model flavor in native CatBoost format.

For more information, see mlflow.catboost.

Spacy(`spaCy`)

The spaCy model flavor enables logging of spaCy models in MLflow format via the mlflow.spacy.save_model() and mlflow.spacy.log_model() methods. Additionally, these methods add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.spacy.load_model() method to load MLflow Models with the spacy model flavor in native spaCy format.

For more information, see mlflow.spacy.

Fastai(`fastai`)

The fastai model flavor enables logging of fastai Learner models in MLflow format via the mlflow.fastai.save_model() and mlflow.fastai.log_model() methods. Additionally, these methods add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.fastai.load_model() method to load MLflow Models with the fastai model flavor in native fastai format.

For more information, see mlflow.fastai.

Statsmodels (`statsmodels`)

The statsmodels model flavor enables logging of Statsmodels models in MLflow format via the mlflow.statsmodels.save_model() and mlflow.statsmodels.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.statsmodels.load_model() method to load MLflow Models with the statsmodels model flavor in native statsmodels format.

As for now, automatic logging is restricted to parameters, metrics and models generated by a call to fit on a statsmodels model.

For more information, see mlflow.statsmodels.

Prophet (`prophet`)

The prophet model flavor enables logging of Prophet models in MLflow format via the mlflow.prophet.save_model() and mlflow.prophet.log_model() methods. These methods also add the python_function flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic Python functions for inference via mlflow.pyfunc.load_model(). This loaded PyFunc model can only be scored with DataFrame input. You can also use the mlflow.prophet.load_model() method to load MLflow Models with the prophet model flavor in native prophet format.

For more information, see mlflow.prophet.

Model Customization

While MLflow’s built-in model persistence utilities are convenient for packaging models from various popular ML libraries in MLflow Model format, they do not cover every use case. For example, you may want to use a model from an ML library that is not explicitly supported by MLflow’s built-in flavors. Alternatively, you may want to package custom inference code and data to create an MLflow Model. Fortunately, MLflow provides two solutions that can be used to accomplish these tasks: Custom Python Models and Custom Flavors.

In this section:

Custom Python Models
- Example: Creating a custom “add n” model
- Example: Saving an XGBoost model in MLflow format
Custom Flavors

Custom Python Models

The mlflow.pyfunc module provides save_model() and log_model() utilities for creating MLflow Models with the python_function flavor that contain user-specified code and artifact (file) dependencies. These artifact dependencies may include serialized models produced by any Python ML library.

Because these custom models contain the python_function flavor, they can be deployed to any of MLflow’s supported production environments, such as SageMaker, AzureML, or local REST endpoints.

The following examples demonstrate how you can use the mlflow.pyfunc module to create custom Python models. For additional information about model customization with MLflow’s python_function utilities, see the python_function custom models documentation.

Example: Creating a custom “add n” model

This example defines a class for a custom model that adds a specified numeric value, n, to all columns of a Pandas DataFrame input. Then, it uses the mlflow.pyfunc APIs to save an instance of this model with n = 5 in MLflow Model format. Finally, it loads the model in python_function format and uses it to evaluate a sample input.

import mlflow.pyfunc

# Define the model class
class AddN(mlflow.pyfunc.PythonModel):

    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)

# Construct and save the model
model_path = "add_n_model"
add5_model = AddN(n=5)
mlflow.pyfunc.save_model(path=model_path, python_model=add5_model)

# Load the model in `python_function` format
loaded_model = mlflow.pyfunc.load_model(model_path)

# Evaluate the model
import pandas as pd
model_input = pd.DataFrame([range(10)])
model_output = loaded_model.predict(model_input)
assert model_output.equals(pd.DataFrame([range(5, 15)]))

Example: Saving an XGBoost model in MLflow format

This example begins by training and saving a gradient boosted tree model using the XGBoost library. Next, it defines a wrapper class around the XGBoost model that conforms to MLflow’s python_function inference API. Then, it uses the wrapper class and the saved XGBoost model to construct an MLflow Model that performs inference using the gradient boosted tree. Finally, it loads the MLflow Model in python_function format and uses it to evaluate test data.

# Load training and test datasets
from sys import version_info
import xgboost as xgb
from sklearn import datasets
from sklearn.model_selection import train_test_split

PYTHON_VERSION = "{major}.{minor}.{micro}".format(major=version_info.major,
                                                  minor=version_info.minor,
                                                  micro=version_info.micro)
iris = datasets.load_iris()
x = iris.data[:, 2:]
y = iris.target
x_train, x_test, y_train, _ = train_test_split(x, y, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(x_train, label=y_train)

# Train and save an XGBoost model
xgb_model = xgb.train(params={'max_depth': 10}, dtrain=dtrain, num_boost_round=10)
xgb_model_path = "xgb_model.pth"
xgb_model.save_model(xgb_model_path)

# Create an `artifacts` dictionary that assigns a unique name to the saved XGBoost model file.
# This dictionary will be passed to `mlflow.pyfunc.save_model`, which will copy the model file
# into the new MLflow Model's directory.
artifacts = {
    "xgb_model": xgb_model_path
}

# Define the model class
import mlflow.pyfunc
class XGBWrapper(mlflow.pyfunc.PythonModel):

    def load_context(self, context):
        import xgboost as xgb
        self.xgb_model = xgb.Booster()
        self.xgb_model.load_model(context.artifacts["xgb_model"])

    def predict(self, context, model_input):
        input_matrix = xgb.DMatrix(model_input.values)
        return self.xgb_model.predict(input_matrix)

# Create a Conda environment for the new MLflow Model that contains all necessary dependencies.
import cloudpickle
conda_env = {
    'channels': ['defaults'],
    'dependencies': [
      'python={}'.format(PYTHON_VERSION),
      'pip',
      {
        'pip': [
          'mlflow',
          'xgboost=={}'.format(xgb.__version__),
          'cloudpickle=={}'.format(cloudpickle.__version__),
        ],
      },
    ],
    'name': 'xgb_env'
}

# Save the MLflow Model
mlflow_pyfunc_model_path = "xgb_mlflow_pyfunc"
mlflow.pyfunc.save_model(
        path=mlflow_pyfunc_model_path, python_model=XGBWrapper(), artifacts=artifacts,
        conda_env=conda_env)

# Load the model in `python_function` format
loaded_model = mlflow.pyfunc.load_model(mlflow_pyfunc_model_path)

# Evaluate the model
import pandas as pd
test_predictions = loaded_model.predict(pd.DataFrame(x_test))
print(test_predictions)

Custom Flavors

You can also create custom MLflow Models by writing a custom flavor.

As discussed in the Model API and Storage Format sections, an MLflow Model is defined by a directory of files that contains an MLmodel configuration file. This MLmodel file describes various model attributes, including the flavors in which the model can be interpreted. The MLmodel file contains an entry for each flavor name; each entry is a YAML-formatted collection of flavor-specific attributes.

To create a new flavor to support a custom model, you define the set of flavor-specific attributes to include in the MLmodel configuration file, as well as the code that can interpret the contents of the model directory and the flavor’s attributes.

As an example, let’s examine the mlflow.pytorch module corresponding to MLflow’s pytorch flavor. In the mlflow.pytorch.save_model() method, a PyTorch model is saved to a specified output directory. Additionally, mlflow.pytorch.save_model() leverages the mlflow.models.Model.add_flavor() and mlflow.models.Model.save() functions to produce an MLmodel configuration containing the pytorch flavor. The resulting configuration has several flavor-specific attributes, such as pytorch_version, which denotes the version of the PyTorch library that was used to train the model. To interpret model directories produced by save_model(), the mlflow.pytorch module also defines a load_model() method. mlflow.pytorch.load_model() reads the MLmodel configuration from a specified model directory and uses the configuration attributes of the pytorch flavor to load and return a PyTorch model from its serialized representation.

Built-In Deployment Tools

MLflow provides tools for deploying MLflow models on a local machine and to several production environments. Not all deployment methods are available for all model flavors.

In this section:

Deploy MLflow models
Deploy a python_function model on Microsoft Azure ML
Deploy a python_function model on Amazon SageMaker
Export a python_function model as an Apache Spark UDF

Deploy MLflow models

MLflow can deploy models locally as local REST API endpoints or to directly score files. In addition, MLflow can package models as self-contained Docker images with the REST API endpoint. The image can be used to safely deploy the model to various environments such as Kubernetes.

You deploy MLflow model locally or generate a Docker image using the CLI interface to the mlflow.models module.

The REST API server accepts the following data formats as POST input to the /invocations path:

JSON-serialized pandas DataFrames in the split orientation. For example, data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type request header value of application/json or application/json; format=pandas-split.
JSON-serialized pandas DataFrames in the records orientation. We do not recommend using this format because it is not guaranteed to preserve column ordering. This format is specified using a Content-Type request header value of application/json; format=pandas-records.
CSV-serialized pandas DataFrames. For example, data = pandas_df.to_csv(). This format is specified using a Content-Type request header value of text/csv.
Tensor input formatted as described in TF Serving’s API docs where the provided inputs will be cast to Numpy arrays. This format is specified using a Content-Type request header value of application/json and the instances or inputs key in the request body dictionary.

If the Content-Type request header has a value of application/json, MLflow will infer whether the input format is a pandas DataFrame or TF serving (i.e tensor) input based on the data in the request body. For pandas DataFrame input, the orient can also be provided explicitly by specifying the format in the request header as shown in the record-oriented example below.

Note

Since JSON loses type information, MLflow will cast the JSON input to the input type specified in the model’s schema if available. If your model is sensitive to input types, it is recommended that a schema is provided for the model to ensure that type mismatch errors do not occur at inference time. In particular, DL models are typically strict about input types and will need model schema in order for the model to score correctly. For complex data types, see Encoding complex data below.

Example requests:

# split-oriented DataFrame input
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
    "columns": ["a", "b", "c"],
    "data": [[1, 2, 3], [4, 5, 6]]
}'

# record-oriented DataFrame input (fine for vector rows, loses ordering for JSON records)
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 1,"b": 2,"c": 3},
    {"a": 4,"b": 5,"c": 6}
]'

# numpy/tensor input using TF serving's "instances" format
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
    "instances": [
        {"a": "s1", "b": 1, "c": [1, 2, 3]},
        {"a": "s2", "b": 2, "c": [4, 5, 6]},
        {"a": "s3", "b": 3, "c": [7, 8, 9]}
    ]
}'

# numpy/tensor input using TF serving's "inputs" format
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json' -d '{
    "inputs": {"a": ["s1", "s2", "s3"], "b": [1, 2, 3], "c": [[1, 2, 3], [4, 5, 6], [7, 8, 9]]}
}'

For more information about serializing pandas DataFrames, see pandas.DataFrame.to_json.

For more information about serializing tensor inputs using the TF serving format, see TF serving’s request format docs.

Encoding complex data

Complex data types, such as dates or binary, do not have a native JSON representation. If you include a model signature, MLflow can automatically decode supported data types from JSON. The following data type conversions are supported:

binary: data is expected to be base64 encoded, MLflow will automatically base64 decode.
datetime: data is expected as string according to ISO 8601 specification. MLflow will parse this into the appropriate datetime representation on the given platform.

Example requests:

# record-oriented DataFrame input with binary column "b"
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 0, "b": "dGVzdCBiaW5hcnkgZGF0YSAw"},
    {"a": 1, "b": "dGVzdCBiaW5hcnkgZGF0YSAx"},
    {"a": 2, "b": "dGVzdCBiaW5hcnkgZGF0YSAy"}
]'

# record-oriented DataFrame input with datetime column "b"
curl http://127.0.0.1:5000/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[
    {"a": 0, "b": "2020-01-01T00:00:00Z"},
    {"a": 1, "b": "2020-02-01T12:34:56Z"},
    {"a": 2, "b": "2021-03-01T00:00:00Z"}
]'

Command Line Interface

MLflow also has a CLI that supports the following commands:

serve deploys the model as a local REST API server.
build_docker packages a REST API endpoint serving the model as a docker image.
predict uses the model to generate a prediction for a local CSV or JSON file. Note that this method only supports DataFrame input.

For more info, see:

mlflow models --help
mlflow models serve --help
mlflow models predict --help
mlflow models build-docker --help

Deploy a `python_function` model on Microsoft Azure ML

The mlflow.azureml module can package python_function models into Azure ML container images and deploy them as a webservice. Models can be deployed to Azure Kubernetes Service (AKS) and the Azure Container Instances (ACI) platform for real-time serving. The resulting Azure ML ContainerImage contains a web server that accepts the following data formats as input:

JSON-serialized pandas DataFrames in the split orientation. For example, data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type request header value of application/json.
mlflow.azureml.deploy() registers an MLflow Model with an existing Azure ML workspace, builds an Azure ML container image and deploys the model to AKS and ACI. The Azure ML SDK is required in order to use this function. The Azure ML SDK requires Python 3. It cannot be installed with earlier versions of Python.

Example workflow using the Python API

import mlflow.azureml

from azureml.core import Workspace
from azureml.core.webservice import AciWebservice, Webservice


# Create or load an existing Azure ML workspace. You can also load an existing workspace using
# Workspace.get(name="<workspace_name>")
workspace_name = "<Name of your Azure ML workspace>"
subscription_id = "<Your Azure subscription ID>"
resource_group = "<Name of the Azure resource group in which to create Azure ML resources>"
location = "<Name of the Azure location (region) in which to create Azure ML resources>"
azure_workspace = Workspace.create(name=workspace_name,
                                   subscription_id=subscription_id,
                                   resource_group=resource_group,
                                   location=location,
                                   create_resource_group=True,
                                   exist_okay=True)
# Create a deployment config
aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

# Register and deploy model to Azure Container Instance (ACI)
(webservice, model) = mlflow.azureml.deploy(model_uri='<your-model-uri>',
                                            workspace=azure_workspace,
                                            model_name='mymodelname',
                                            service_name='myservice',
                                            deployment_config=aci_config)

# After the model deployment completes, requests can be posted via HTTP to the new ACI
# webservice's scoring URI. The following example posts a sample input from the wine dataset
# used in the MLflow ElasticNet example:
# https://github.com/mlflow/mlflow/tree/master/examples/sklearn_elasticnet_wine
print("Scoring URI is: %s", webservice.scoring_uri)

import requests
import json
# `sample_input` is a JSON-serialized pandas DataFrame with the `split` orientation
sample_input = {
    "columns": [
        "alcohol",
        "chlorides",
        "citric acid",
        "density",
        "fixed acidity",
        "free sulfur dioxide",
        "pH",
        "residual sugar",
        "sulphates",
        "total sulfur dioxide",
        "volatile acidity"
    ],
    "data": [
        [8.8, 0.045, 0.36, 1.001, 7, 45, 3, 20.7, 0.45, 170, 0.27]
    ]
}
response = requests.post(
              url=webservice.scoring_uri, data=json.dumps(sample_input),
              headers={"Content-type": "application/json"})
response_json = json.loads(response.text)
print(response_json)

Example workflow using the MLflow CLI

# note mlflow azureml build-image is being deprecated, it will be replaced with a new command for model deployment soon
mlflow azureml build-image -w <workspace-name> -m <model-path> -d "Wine regression model 1"

az ml service create aci -n <deployment-name> --image-id <image-name>:<image-version>

# After the image deployment completes, requests can be posted via HTTP to the new ACI
# webservice's scoring URI. The following example posts a sample input from the wine dataset
# used in the MLflow ElasticNet example:
# https://github.com/mlflow/mlflow/tree/master/examples/sklearn_elasticnet_wine

scoring_uri=$(az ml service show --name <deployment-name> -v | jq -r ".scoringUri")

# `sample_input` is a JSON-serialized pandas DataFrame with the `split` orientation
sample_input='
{
    "columns": [
        "alcohol",
        "chlorides",
        "citric acid",
        "density",
        "fixed acidity",
        "free sulfur dioxide",
        "pH",
        "residual sugar",
        "sulphates",
        "total sulfur dioxide",
        "volatile acidity"
    ],
    "data": [
        [8.8, 0.045, 0.36, 1.001, 7, 45, 3, 20.7, 0.45, 170, 0.27]
    ]
}'

echo $sample_input | curl -s -X POST $scoring_uri\
-H 'Cache-Control: no-cache'\
-H 'Content-Type: application/json'\
-d @-

For more info, see:

mlflow azureml --help
mlflow azureml build-image --help

Deploy a `python_function` model on Amazon SageMaker

The mlflow.sagemaker module can deploy python_function models locally in a Docker container with SageMaker compatible environment and remotely on SageMaker. To deploy remotely to SageMaker you need to set up your environment and user accounts. To export a custom model to SageMaker, you need a MLflow-compatible Docker image to be available on Amazon ECR. MLflow provides a default Docker image definition; however, it is up to you to build the image and upload it to ECR. MLflow includes the utility function build_and_push_container to perform this step. Once built and uploaded, you can use the MLflow container for all MLflow Models. Model webservers deployed using the mlflow.sagemaker module accept the following data formats as input, depending on the deployment flavor:

python_function: For this deployment flavor, the endpoint accepts the same formats described in the local model deployment documentation.
mleap: For this deployment flavor, the endpoint accepts only JSON-serialized pandas DataFrames in the split orientation. For example, data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type request header value of application/json.

Commands

run-local deploys the model locally in a Docker container. The image and the environment should be identical to how the model would be run remotely and it is therefore useful for testing the model prior to deployment.
build-and-push-container builds an MLfLow Docker image and uploads it to ECR. The caller must have the correct permissions set up. The image is built locally and requires Docker to be present on the machine that performs this step.
deploy deploys the model on Amazon SageMaker. MLflow uploads the Python Function model into S3 and starts an Amazon SageMaker endpoint serving the model.

Example workflow using the MLflow CLI

mlflow sagemaker build-and-push-container  - build the container (only needs to be called once)
mlflow sagemaker run-local -m <path-to-model>  - test the model locally
mlflow sagemaker deploy <parameters> - deploy the model remotely

For more info, see:

mlflow sagemaker --help
mlflow sagemaker build-and-push-container --help
mlflow sagemaker run-local --help
mlflow sagemaker deploy --help

Export a `python_function` model as an Apache Spark UDF

You can output a python_function model as an Apache Spark UDF, which can be uploaded to a Spark cluster and used to score the model.

Example

from pyspark.sql.functions import struct

pyfunc_udf = mlflow.pyfunc.spark_udf(<path-to-model>)
df = spark_df.withColumn("prediction", pyfunc_udf(struct(<feature-names>)))

If a model contains a signature, the UDF can be called without specifying column name arguments. In this case, the UDF will be called with column names from signature, so the evaluation dataframe’s column names must match the model signature’s column names.

Example

pyfunc_udf = mlflow.pyfunc.spark_udf(<path-to-model-with-signature>)
df = spark_df.withColumn("prediction", pyfunc_udf())

The resulting UDF is based on Spark’s Pandas UDF and is currently limited to producing either a single value or an array of values of the same type per observation. By default, we return the first numeric column as a double. You can control what result is returned by supplying result_type argument. The following values are supported:

'int' or IntegerType: The leftmost integer that can fit in int32 result is returned or exception is raised if there is none.
'long' or LongType: The leftmost long integer that can fit in int64 result is returned or exception is raised if there is none.
ArrayType (IntegerType | LongType): Return all integer columns that can fit into the requested size.
'float' or FloatType: The leftmost numeric result cast to float32 is returned or exception is raised if there is no numeric column.
'double' or DoubleType: The leftmost numeric result cast to double is returned or exception is raised if there is no numeric column.
ArrayType ( FloatType | DoubleType ): Return all numeric columns cast to the requested. type. Exception is raised if there are numeric columns.
'string' or StringType: Result is the leftmost column converted to string.
ArrayType ( StringType ): Return all columns converted to string.

Example

from pyspark.sql.types import ArrayType, FloatType
from pyspark.sql.functions import struct

pyfunc_udf = mlflow.pyfunc.spark_udf("path/to/model", result_type=ArrayType(FloatType()))
# The prediction column will contain all the numeric columns returned by the model as floats
df = spark_df.withColumn("prediction", pyfunc_udf(struct("name", "age")))

Deployment to Custom Targets

In addition to the built-in deployment tools, MLflow provides a pluggable mlflow.deployments Python API and mlflow deployments CLI for deploying models to custom targets and environments. To deploy to a custom target, you must first install an appropriate third-party Python plugin. See the list of known community-maintained plugins here.

Note

APIs for deployment to custom targets are experimental, and may be altered in a future release.

Commands

The mlflow deployments CLI contains the following commands, which can also be invoked programmatically using the mlflow.deployments Python API:

Create: Deploy an MLflow model to a specified custom target
Delete: Delete a deployment
Update: Update an existing deployment, for example to deploy a new model version or change the deployment’s configuration (e.g. increase replica count)
List: List IDs of all deployments
Get: Print a detailed description of a particular deployment
Run Local: Deploy the model locally for testing
Help: Show the help string for the specified target

For more info, see:

mlflow deployments --help
mlflow deployments create --help
mlflow deployments delete --help
mlflow deployments update --help
mlflow deployments list --help
mlflow deployments get --help
mlflow deployments run-local --help
mlflow deployments help --help

Community Model Flavors

MLflow VizMod

The mlflow-vizmod project allows data scientists to be more productive with their visualizations. We treat visualizations as models - just like ML models - thus being able to use the same infrastructure as MLflow to track, create projects, register, and deploy visualizations.

Installation:

pip install mlflow-vizmod

Example:

from sklearn.datasets import load_iris
import altair as alt
import mlflow_vismod

df_iris = load_iris(as_frame=True)

viz_iris = (
    alt.Chart(df_iris)
      .mark_circle(size=60)
      .encode(x="x", y="y", color="z:N")
      .properties(height=375, width=575)
      .interactive()
)

mlflow_vismod.log_model(
    model=viz_iris,
    artifact_path="viz",
    style="vegalite",
    input_example=df_iris.head(5),
)