Managing Dependencies in MLflow Models

An MLflow Model is a standard format that packages a machine learning model with its dependencies and other metadata. Packaging a model together with its dependencies provides reproducibility and portability across a variety of platforms and tools.

When you create an MLflow Model using the MLflow Tracking APIs, for instance, mlflow.pytorch.log_model(), MLflow automatically infers the dependencies required by the model flavor you're using and records them as part of the model metadata. Then, when you serve the model for prediction, MLflow automatically installs those dependencies into the serving environment. Therefore, you normally don't need to worry about managing dependencies in an MLflow Model.

However, in some cases, you may need to add or modify some dependencies. This page provides a high-level description of how MLflow manages dependencies and guidance for how to customize dependencies for your use case.

Tip

One tip for improving MLflow’s dependency inference accuracy is to add an input_example when saving your model. This enables MLflow to perform a model prediction before saving the model, thereby capturing the dependencies used during the prediction. Please refer to Model Input Example for additional, detailed usage of this parameter.
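
For example, a minimal sketch of logging a scikit-learn model with an input_example (the model and data below are illustrative placeholders, not part of the official docs) might look like this:

import mlflow
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Illustrative training data and model; replace them with your own.
X = pd.DataFrame({"x1": [1.0, 2.0, 3.0], "x2": [0.1, 0.2, 0.3]})
y = [0, 1, 0]
model = LogisticRegression().fit(X, y)

with mlflow.start_run():
    # Passing input_example lets MLflow run a test prediction before saving,
    # which helps it capture the dependencies actually used during prediction.
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        input_example=X.head(2),
    )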

How MLflow Records Model Dependencies

An MLflow Model is saved within a specified directory with the following structure:

my_model/
├── MLmodel
├── model.pkl
├── conda.yaml
├── python_env.yaml
└── requirements.txt

Model dependencies are defined by the following files (for the other files, please refer to the guidance in the Storage Format section):

  • python_env.yaml - This file contains the information required to restore the model environment using virtualenv: (1) the Python version, (2) build tools such as pip, setuptools, and wheel, and (3) the pip requirements of the model (a reference to requirements.txt).

  • requirements.txt - Defines the set of pip dependencies required to run the model.

  • conda.yaml - Defines the conda environment required to run the model. This is used when you specify conda as the environment manager for restoring the model environment.

Please note that it is not recommended to edit these files manually to add or remove dependencies. They are automatically generated by MLflow and any change you make manually will be overwritten when you save the model again. Instead, you should use one of the recommended methods described in the following sections.

Example

The following shows an example of environment files generated by MLflow when logging a model with mlflow.sklearn.log_model:

  • python_env.yaml

    python: 3.9.8
    build_dependencies:
    - pip==23.3.2
    - setuptools==69.0.3
    - wheel==0.42.0
    dependencies:
    - -r requirements.txt
    
  • requirements.txt

    mlflow==2.9.2
    scikit-learn==1.3.2
    cloudpickle==3.0.0
    
  • conda.yaml

    name: mlflow-env
    channels:
      - conda-forge
    dependencies:
    - python=3.9.8
    - pip
    - pip:
      - mlflow==2.9.2
      - scikit-learn==1.3.2
      - cloudpickle==3.0.0
    

Adding Extra Dependencies to an MLflow Model

MLflow infers the dependencies required by the model flavor library, but your model may depend on other libraries, for example for data preprocessing. In that case, you can add extra dependencies to the model by specifying the extra_pip_requirements parameter when logging the model. For example:

import mlflow


class CustomModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # your model depends on pandas
        import pandas as pd

        ...
        return prediction


# Log the model
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        python_model=CustomModel(),
        artifact_path="model",
        extra_pip_requirements=["pandas==2.0.3"],
        input_example=input_data,
    )

The extra dependencies will be added to requirements.txt as follows (and similarly to conda.yaml):

mlflow==2.9.2
cloudpickle==3.0.0
pandas==2.0.3  # added

In this case, MLflow will install Pandas 2.0.3 in addition to the inferred dependencies when serving the model for prediction.

Note

Once you log the model with dependencies, it is advisable to test it in a sandbox environment to avoid any dependency issues when deploying the model to production. Since MLflow 2.10.0, you can use the mlflow.models.predict() API to quickly test your model in a virtual environment. Please refer to Validating Environment for Prediction for more details.

Defining All Dependencies by Yourself

Alternatively, you can define all dependencies from scratch rather than adding extra ones. To do so, specify pip_requirements when logging the model. For example:

import mlflow

# Log the model
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        pip_requirements=[
            "mlflow-skinny==2.9.2",
            "cloudpickle==2.5.8",
            "scikit-learn==1.3.1",
        ],
    )

The manually defined dependencies will override the default ones MLflow detects from the model flavor library:

mlflow-skinny==2.9.2
cloudpickle==2.5.8
scikit-learn==1.3.1

Warning

Please be careful when declaring dependencies that differ from those used during training, as doing so can lead to unexpected behavior. The safest way to ensure consistency is to rely on the default dependencies inferred by MLflow.

Note

Once you log the model with dependencies, it is advisable to test it in a sandbox environment to avoid any dependency issues when deploying the model to production. Since MLflow 2.10.0, you can use the mlflow.models.predict() API to quickly test your model in a virtual environment. Please refer to Validating Environment for Prediction for more details.

Saving Extra Code with an MLflow Model

MLflow also supports saving your custom Python code as dependencies of the model. This is particularly useful when you want to deploy custom modules that are required for prediction together with the model. To do so, specify the code_paths parameter when logging the model. For example, if you have the following file structure in your project:

my_project/
├── utils.py
└── train.py

train.py:
import mlflow


class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        from utils import my_func

        x = my_func(model_input)
        # .. your prediction logic
        return prediction


# Log the model
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        python_model=MyModel(),
        artifact_path="model",
        input_example=input_data,
        code_paths=["utils.py"],
    )

MLflow will then save utils.py under the code/ directory inside the model directory:

model/
├── MLmodel
├── ...
└── code/
    └── utils.py

When MLflow loads the model for serving, the code/ directory is added to the system path, so you can use the module in your model code like from utils import my_func. You can also pass a directory path in code_paths to save multiple files under that directory, as shown in the next section.
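
For instance, a quick way to confirm this behavior locally (a sketch that assumes the run and input_data from the example above) is to load the logged model back and call predict:

import mlflow

# Load the logged model; MLflow adds the saved code/ directory to the system
# path, so `from utils import my_func` inside MyModel.predict resolves.
loaded_model = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded_model.predict(input_data))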

Caveats of the code_paths Option

When using the code_paths option, be aware of a limitation: the specified file or directory must be in the same directory as your model script. If the specified file or directory is in a parent or child directory, such as my_project/src/utils.py, model serving will fail with ModuleNotFoundError. For example, let's say you have the following file structure in your project:

my_project/
├── train.py
└── src/
    └── utils.py

Then the following model code does not work:

class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        from src.utils import my_func

        # .. your prediction logic
        return prediction


with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        python_model=MyModel(),
        artifact_path="model",
        input_example=input_data,
        code_paths=[
            "src/utils.py"
        ],  # the file will be saved at code/utils.py not code/src/utils.py
    )

# => Model serving will fail with ModuleNotFoundError: No module named 'src'

This limitation is due to how MLflow saves and loads the specified files and directories. When MLflow copies the specified files or directories into the code/ destination, it does not preserve the relative paths under which they originally resided. For instance, in the above example, MLflow copies utils.py to code/utils.py, not code/src/utils.py. As a result, the module has to be imported as from utils import my_func instead of from src.utils import my_func, which is inconvenient because the import path differs from the one used in the original training script.

To work around this issue, code_paths should specify the parent directory, which is code_paths=["src"] in this example. This way, MLflow copies the entire src/ directory under code/, and your model code can import src.utils:

class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        from src.utils import my_func

        # .. your prediction logic
        return prediction


with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        python_model=MyModel(),
        artifact_path="model",
        input_example=input_data,
        code_paths=["src"],  # the whole src/ directory will be saved at code/src
    )

# => This will work

Warning

For the same reason, the code_paths option does not handle relative paths to parent directories, such as code_paths=["../src"].

Validating Environment for Prediction

Validating your model before deployment is a critical step to ensure production readiness. MLflow provides a few ways to test your model locally, either in a virtual environment or a Docker container. If you find any dependency issues during validation, please follow the guidance in How to fix dependency errors when serving my model?

Testing offline prediction with a virtual environment

You can use the MLflow Models predict API, via Python or the CLI, to make test predictions with your model. This loads your model from the model URI, creates a virtual environment with the model dependencies (defined in the MLflow Model), and runs offline predictions with the model. Please refer to mlflow.models.predict() or the CLI reference for more detailed usage of the predict API.

Note

The Python API is available since MLflow 2.10.0. If you are using an older version, please use the CLI option.

import mlflow

mlflow.models.predict(
    model_uri="runs:/<run_id>/model",
    input_data=<input_data>,
)
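
If you want the dependencies restored with conda (using the conda.yaml described earlier) rather than virtualenv, you can pass the env_manager parameter; a minimal sketch with the same placeholders as above:

import mlflow

# Restore the model environment with conda instead of the default virtualenv.
mlflow.models.predict(
    model_uri="runs:/<run_id>/model",
    input_data=<input_data>,
    env_manager="conda",
)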

Using the mlflow.models.predict() API is convenient for quickly testing your model and inference environment. However, it may not be a perfect simulation of serving because it does not start the online inference server.

Testing online inference endpoint with a virtual environment

If you want to test your model by actually running the online inference server, you can use the MLflow serve API. This creates a virtual environment with your model and dependencies, similarly to the predict API, but also starts the inference server and exposes REST endpoints. You can then send a test request and validate the response. Please refer to the CLI reference for more detailed usage of the serve API.

mlflow models serve -m runs:/<run_id>/model -p <port>
# In another terminal
curl -X POST -H "Content-Type: application/json" \
    --data '{"inputs": [[1, 2], [3, 4]]}' \
    http://localhost:<port>/invocations

While this is a reliable way to test your model before deployment, one caveat is that the virtual environment doesn't account for OS-level differences between your machine and the production environment. For example, if you are using macOS as your local dev machine but your deployment target runs on Linux, you may encounter issues that are not reproducible in the virtual environment.

In this case, you can use a Docker container to test your model. While it doesn't provide full OS-level isolation the way virtual machines do (for example, you can't run Windows containers on Linux machines), Docker covers popular test scenarios such as running different versions of Linux or simulating a Linux environment on macOS or Windows.

Testing online inference endpoint with a Docker container

The MLflow build-docker API, available for both the CLI and Python, builds an Ubuntu-based Docker image for serving your model. The image contains your model and its dependencies, as well as an entrypoint that starts the inference server. As with the serve API, you can send a test request and validate the response. Please refer to the CLI reference for more detailed usage of the build-docker API.

mlflow models build-docker -m runs:/<run_id>/model -n <image_name>
docker run -p <port>:8080 <image_name>
# In another terminal
curl -X POST -H "Content-Type: application/json" \
    --data '{"inputs": [[1, 2], [3, 4]]}' \
    http://localhost:<port>/invocations

Troubleshooting

How to fix dependency errors when serving my model?

One of the most common issues experienced during model deployment is dependency errors. When logging or saving your model, MLflow tries to infer the model dependencies and save them as part of the MLflow Model metadata. However, the inferred set might not always be complete and can miss some dependencies, e.g., [extras] dependencies for certain libraries. This can cause errors when serving your model, such as ModuleNotFoundError or ImportError. Below are some steps that can help you diagnose and fix missing dependency errors.

Hint

To reduce the possibility of dependency errors, you can add input_example when saving your model. This enables MLflow to perform a model prediction before saving the model, thereby capturing the dependencies used during the prediction. Please refer to Model Input Example for additional, detailed usage of this parameter.

1. Check the missing dependencies

The missing dependencies are listed in the error message. For example, if you see the following error message:

ModuleNotFoundError: No module named 'cv2'

This indicates that the cv2 module, which is provided by the opencv-python package, is missing from the model dependencies.

2. Try adding the dependencies using the predict API

Now that you know the missing dependencies, you can create a new model version with the correct dependencies. However, re-logging a new model each time you try a new set of dependencies can be tedious, particularly because you may need to iterate multiple times to find the correct solution. Instead, you can use the mlflow.models.predict() API to test your changes without re-logging the model while troubleshooting installation errors.

To do so, use the pip_requirements_override option to specify additional pip dependencies, such as opencv-python==4.8.0.

import mlflow

mlflow.models.predict(
    model_uri="runs:/<run_id>/model",
    input_data=<input_data>,
    pip_requirements_override=["opencv-python==4.8.0"],
)

The specified dependencies will be installed to the virtual environment in addition to (or instead of) the dependencies defined in the model metadata. Since this doesn’t mutate the model, you can iterate quickly and safely to find the correct dependencies.

Note

The pip_requirements_override option is available since MLflow 2.10.0.

3. Update the model metadata

Once you have found the correct dependencies, you can log a new model with them. To do so, specify the extra_pip_requirements option when logging the model.

import mlflow

mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=python_model,
    extra_pip_requirements=["opencv-python==4.8.0"],
    input_example=input_data,
)

How to migrate Anaconda dependencies due to the license change

Anaconda Inc. updated their terms of service for anaconda.org channels. Based on the new terms of service you may require a commercial license if you rely on Anaconda’s packaging and distribution. See Anaconda Commercial Edition FAQ for more information. Your use of any Anaconda channels is governed by their terms of service.

MLflow models logged before v1.18 were by default logged with the conda defaults channel (https://repo.anaconda.com/pkgs/) as a dependency. Because of this license change, MLflow has stopped the use of the defaults channel for models logged using MLflow v1.18 and above. The default channel logged is now conda-forge, which points at the community managed https://conda-forge.org/.

If you logged a model before MLflow v1.18 without excluding the defaults channel from the conda environment for the model, that model may have a dependency on the defaults channel that you may not have intended. To manually confirm whether a model has this dependency, you can examine the channel value in the conda.yaml file that is packaged with the logged model. For example, a model’s conda.yaml with a defaults channel dependency may look like this:

name: mlflow-env
channels:
- defaults
dependencies:
- python=3.8.8
- pip
- pip:
    - mlflow==2.3
    - scikit-learn==0.23.2
    - cloudpickle==1.6.0

If you would like to change the channel used in a model’s environment, you can re-register the model to the model registry with a new conda.yaml. You can do this by specifying the channel in the conda_env parameter of log_model().
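
For illustration, a minimal sketch of re-logging a scikit-learn model with a conda-forge-based environment could look like the following (the model object, package versions, and registered model name are placeholders, not prescribed values):

import mlflow

# A conda environment that pulls from conda-forge instead of the defaults channel.
conda_env = {
    "name": "mlflow-env",
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.8.8",
        "pip",
        {
            "pip": [
                "mlflow==2.3",
                "scikit-learn==0.23.2",
                "cloudpickle==1.6.0",
            ]
        },
    ],
}

with mlflow.start_run():
    mlflow.sklearn.log_model(
        sk_model=model,  # the previously trained model you want to re-log
        artifact_path="model",
        conda_env=conda_env,
        registered_model_name="my-model",  # registers a new version in the Model Registry
    )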

For more information on the log_model() API, see the MLflow documentation for the model flavor you are working with, for example, mlflow.sklearn.log_model().