mlflow.models

The mlflow.models module provides an API for saving machine learning models in “flavors” that can be understood by different downstream tools.

For a list of the built-in flavors and additional details, see MLflow Models.

class mlflow.models.EvaluationArtifact(uri, content=None)[source]

Bases: object

A model evaluation artifact containing an artifact URI and content.

property content

The content of the artifact (representation varies)

property uri

The URI of the artifact

class mlflow.models.EvaluationResult(metrics, artifacts)[source]

Bases: object

Represents the model evaluation outputs of an mlflow.evaluate() API call, containing both scalar metrics and output artifacts such as performance plots.

property artifacts

A dictionary mapping standardized artifact names (e.g. “roc_data”) to artifact content and location information

classmethod load(path)[source]

Load the evaluation results from the specified local filesystem path

property metrics

A dictionary mapping scalar metric names to scalar metric values

save(path)[source]

Write the evaluation results to the specified local filesystem path
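A minimal sketch of working with an EvaluationResult is shown below; it assumes result was produced by an mlflow.evaluate() call such as the one illustrated later on this page, and the local path is a placeholder.

import mlflow.models

print(result.metrics)                        # scalar metrics keyed by name
for name, artifact in result.artifacts.items():
    print(name, artifact.uri)                # each value is an EvaluationArtifact

result.save("/tmp/eval_results")             # persist to a local path
reloaded = mlflow.models.EvaluationResult.load("/tmp/eval_results")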

class mlflow.models.FlavorBackend(config, **kwargs)[source]

Bases: object

Abstract class for Flavor Backend. This class defines the API interface for local model deployment of MLflow model flavors.

can_build_image()[source]
Returns

True if this flavor has a build_image method defined for building a docker container capable of serving the model, False otherwise.

abstract can_score_model()[source]

Check whether this flavor backend can be deployed in the current environment.

Returns

True if this flavor backend can be applied in the current environment.

abstract predict(model_uri, input_path, output_path, content_type, json_format)[source]

Generate predictions using a saved MLflow model referenced by the given URI. Input and output are read from and written to a file or stdin / stdout.

Parameters
  • model_uri – URI pointing to the MLflow model to be used for scoring.

  • input_path – Path to the file with input data. If not specified, data is read from stdin.

  • output_path – Path to the file with output predictions. If not specified, data is written to stdout.

  • content_type – Specifies the input format. Can be one of {json, csv}.

  • json_format – Only applies if content_type == json. Specifies how the input data is encoded in JSON. Can be one of {split, records}, mirroring the behavior of the Pandas orient attribute. The default is split, which expects dict-like data: {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}, where index is optional. For more information, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_json.html

prepare_env(model_uri)[source]

Performs any preparation necessary to predict or serve the model, for example downloading dependencies or initializing a conda environment. After preparation, calling predict or serve should be fast.

abstract serve(model_uri, port, host, enable_mlserver)[source]

Serve the specified MLflow model locally.

Parameters
  • model_uri – URI pointing to the MLflow model to be used for scoring.

  • port – Port to use for the model deployment.

  • host – Host to use for the model deployment. Defaults to localhost.

  • enable_mlserver – Whether to use MLServer or the local scoring server.
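As a rough illustration of the interface, the following is a hypothetical FlavorBackend subclass; the class name and method bodies are illustrative only, not an MLflow-provided backend.

from mlflow.models import FlavorBackend

class MyFlavorBackend(FlavorBackend):
    def can_score_model(self):
        # Report whether this backend can be used in the current environment.
        return True

    def predict(self, model_uri, input_path, output_path, content_type, json_format):
        # A real backend would load the model referenced by model_uri, read data
        # from input_path (or stdin), score it, and write predictions to
        # output_path (or stdout) in the requested content_type / json_format.
        raise NotImplementedError

    def serve(self, model_uri, port, host, enable_mlserver):
        # A real backend would start a local scoring server for the model
        # referenced by model_uri on the given host and port.
        raise NotImplementedError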

class mlflow.models.Model(artifact_path=None, run_id=None, utc_time_created=None, flavors=None, signature=None, saved_input_example_info: Optional[Dict[str, Any]] = None, model_uuid: Optional[Union[str, Callable]] = <function Model.<lambda>>, **kwargs)[source]

Bases: object

An MLflow Model that can support multiple model flavors. Provides APIs for implementing new Model flavors.

add_flavor(name, **params)[source]

Add an entry for how to serve the model in a given format.

classmethod from_dict(model_dict)[source]

Load a model from its YAML representation.

get_input_schema()[source]
get_model_info()[source]

Create a ModelInfo instance that contains the model metadata.

get_output_schema()[source]
classmethod load(path)[source]

Load a model from its YAML representation.

load_input_example(path: str)[source]

Load the input example saved along with a model. Returns None if there is no example metadata (i.e. the model was saved without an example). Raises FileNotFoundError if there is model metadata but the example file is missing.

Parameters

path – Path to the model directory.

Returns

Input example (NumPy ndarray, SciPy csc_matrix, SciPy csr_matrix, pandas DataFrame, dict) or None if the model has no example.
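For example, a minimal sketch of retrieving the saved input example from a local model directory ("/path/to/model" is a placeholder):

from mlflow.models import Model

model_path = "/path/to/model"                # local directory containing the MLmodel file
mlflow_model = Model.load(model_path)
example = mlflow_model.load_input_example(model_path)
if example is None:
    print("The model was saved without an input example")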

classmethod log(artifact_path, flavor, registered_model_name=None, await_registration_for=300, **kwargs)[source]

Log the model using the supplied flavor module. If no run is active, this method creates a new active run.

Parameters
  • artifact_path – Run relative path identifying the model.

  • flavor – Flavor module to save the model with. The module must have the save_model function that will persist the model as a valid MLflow model.

  • registered_model_name – If given, create a model version under registered_model_name, also creating a registered model if one with the given name does not exist.

  • signature

    ModelSignature describes model input and output Schema. The model signature can be inferred from datasets representing valid model input (e.g. the training dataset) and valid model output (e.g. model predictions generated on the training dataset), for example:

    from mlflow.models.signature import infer_signature
    train = df.drop(columns=["target_label"])
    signature = infer_signature(train, model.predict(train))
    

  • input_example – Input example provides one or several examples of valid model input. The example can be used as a hint of what data to feed the model. The given example will be converted to a Pandas DataFrame and then serialized to json using the Pandas split-oriented format. Bytes are base64-encoded.

  • await_registration_for – Number of seconds to wait for the model version to finish being created and reach the READY status. By default, the function waits for five minutes. Specify 0 or None to skip waiting.

  • kwargs – Extra args passed to the model flavor.

Returns

A ModelInfo instance that contains the metadata of the logged model.
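A minimal sketch of logging a scikit-learn model through Model.log is shown below; in practice the flavor-specific helpers such as mlflow.sklearn.log_model wrap this call, and the dataset and artifact path are illustrative.

import mlflow
import mlflow.sklearn
from mlflow.models import Model
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

with mlflow.start_run():
    # Extra kwargs (here sk_model) are forwarded to mlflow.sklearn.save_model.
    model_info = Model.log(artifact_path="model", flavor=mlflow.sklearn, sk_model=clf)

print(model_info.model_uri)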

save(path)[source]

Write the model as a local YAML file.

property saved_input_example_info

A dictionary that contains the metadata of the saved input example, e.g., {"artifact_path": "input_example.json", "type": "dataframe", "pandas_orient": "split"}.

property signature
to_dict()[source]

Serialize the model to a dictionary.

to_json()[source]

Write the model as a JSON string.

to_yaml(stream=None)[source]

Write the model as a YAML string.
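A minimal sketch of the serialization helpers, assuming a local model directory ("/path/to/model" is a placeholder):

from mlflow.models import Model

m = Model.load("/path/to/model")     # parse the MLmodel YAML file
print(m.to_yaml())                   # YAML string with the same content
m_copy = Model.from_dict(m.to_dict())
m_copy.save("/tmp/MLmodel")          # write the model back out as a local YAML file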

class mlflow.models.ModelSignature(inputs: mlflow.types.schema.Schema, outputs: Optional[mlflow.types.schema.Schema] = None)[source]

Bases: object

ModelSignature specifies the schema of a model’s inputs and outputs.

A ModelSignature can be inferred from a training dataset and model predictions using mlflow.models.infer_signature(), or constructed by hand by passing an input and output Schema.
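For example, a minimal sketch of constructing a signature by hand (the column names and types are illustrative):

from mlflow.models import ModelSignature
from mlflow.types.schema import Schema, ColSpec

input_schema = Schema(
    [
        ColSpec("double", "sepal length (cm)"),
        ColSpec("double", "sepal width (cm)"),
    ]
)
output_schema = Schema([ColSpec("long")])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)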

classmethod from_dict(signature_dict: Dict[str, Any])[source]

Deserialize from dictionary representation.

Parameters

signature_dict – Dictionary representation of model signature. Expected dictionary format: {'inputs': <json string>, 'outputs': <json string>}

Returns

ModelSignature populated with the data from the dictionary.

to_dict() → Dict[str, Any][source]

Serialize into a ‘jsonable’ dictionary.

Input and output schema are represented as JSON strings. This is so that the representation is compact when embedded in an MLmodel YAML file.

Returns

Dictionary representation with input and output schema represented as JSON strings.

mlflow.models.evaluate(model: Union[str, mlflow.pyfunc.PyFuncModel], data, *, targets, model_type: str, dataset_name=None, dataset_path=None, feature_names: Optional[list] = None, evaluators=None, evaluator_config=None, custom_metrics=None)[source]

Note

Experimental: This method may change or be removed in a future release without warning.

Evaluate a PyFunc model on the specified dataset using one or more specified evaluators, and log resulting metrics & artifacts to MLflow Tracking. For additional overview information, see the Model Evaluation documentation.

Default Evaluator behavior:
  • The default evaluator, which can be invoked with evaluators="default" or evaluators=None, supports the "regressor" and "classifier" model types. It generates a variety of model performance metrics, model performance plots, and model explanations.

  • For both the "regressor" and "classifier" model types, the default evaluator generates model summary plots and feature importance plots using SHAP.

  • For regressor models, the default evaluator additionally logs:
    • metrics: example_count, mean_absolute_error, mean_squared_error, root_mean_squared_error, sum_on_label, mean_on_label, r2_score, max_error, mean_absolute_percentage_error.

  • For binary classifiers, the default evaluator additionally logs:
    • metrics: true_negatives, false_positives, false_negatives, true_positives, recall, precision, f1_score, accuracy, example_count, log_loss, roc_auc, precision_recall_auc.

    • artifacts: lift curve plot, precision-recall plot, ROC plot.

  • For multiclass classifiers, the default evaluator additionally logs:
    • metrics: accuracy, example_count, f1_score_micro, f1_score_macro, log_loss

    • artifacts: A CSV file for “per_class_metrics” (per-class metrics include true_negatives/false_positives/false_negatives/true_positives/recall/precision/roc_auc/precision_recall_auc), precision-recall merged curves plot, ROC merged curves plot.

  • The logged MLflow metric keys are constructed using the format: {metric_name}_on_{dataset_name}. Any preexisting metrics with the same name are overwritten.

  • The metrics/artifacts listed above are logged to the active MLflow run. If no active run exists, a new MLflow run is created for logging these metrics and artifacts.

  • Additionally, information about the specified dataset - hash, name (if specified), path (if specified), and the UUID of the model that evaluated it - is logged to the mlflow.datasets tag.

  • The available evaluator_config options for the default evaluator include:
    • log_model_explainability: A boolean value specifying whether or not to log model explainability insights, default value is True.

    • explainability_algorithm: A string to specify the SHAP Explainer algorithm for model explainability. Supported algorithms include: ‘exact’, ‘permutation’, ‘partition’. If not set, shap.Explainer is used with the “auto” algorithm, which chooses the best Explainer based on the model.

    • explainability_nsamples: The number of sample rows to use for computing model explainability insights. Default value is 2000.

    • max_classes_for_multiclass_roc_pr: For multiclass classification tasks, the maximum number of classes for which to log the per-class ROC curve and Precision-Recall curve. If the number of classes is larger than the configured maximum, these curves are not logged.

  • Limitations of evaluation dataset:
    • For classification tasks, dataset labels are used to infer the total number of classes.

    • For binary classification tasks, the negative label value must be 0 or -1 or False, and the positive label value must be 1 or True.

  • Limitations of metrics/artifacts computation:
    • For classification tasks, some metric and artifact computations require the model to output class probabilities. Currently, for scikit-learn models, the default evaluator calls the predict_proba method on the underlying model to obtain probabilities. For other model types, the default evaluator does not compute metrics/artifacts that require probability outputs.

  • Limitations of default evaluator logging model explainability insights:
    • The shap.Explainer auto algorithm uses the Linear explainer for linear models and the Tree explainer for tree models. Because SHAP’s Linear and Tree explainers do not support multi-class classification, the default evaluator falls back to using the Exact or Permutation explainers for multi-class classification tasks.

    • Logging model explainability insights is not currently supported for PySpark models.

    • The evaluation dataset label values must be numeric or boolean, all feature values must be numeric, and each feature column must only contain scalar values.

Parameters
  • model – A pyfunc model instance, or a URI referring to such a model.

  • data

    One of the following:

    • A numpy array or list of evaluation features, excluding labels.

    • A Pandas DataFrame or Spark DataFrame, containing evaluation features and labels. If feature_names argument not specified, all columns are regarded as feature columns. Otherwise, only column names present in feature_names are regarded as feature columns.

  • targets – If data is a numpy array or list, a numpy array or list of evaluation labels. If data is a DataFrame, the string name of a column from data that contains evaluation labels.

  • model_type – A string describing the model type. The default evaluator supports "regressor" and "classifier" as model types.

  • dataset_name – (Optional) The name of the dataset, must not contain double quotes ("). The name is logged to the mlflow.datasets tag for lineage tracking purposes. If not specified, the dataset hash is used as the dataset name.

  • dataset_path – (Optional) The path where the data is stored. Must not contain double quotes ("). If specified, the path is logged to the mlflow.datasets tag for lineage tracking purposes.

  • feature_names – (Optional) If the data argument is a feature data numpy array or list, feature_names is a list of the feature names for each feature. If None, then the feature_names are generated using the format feature_{feature_index}. If the data argument is a Pandas DataFrame or a Spark DataFrame, feature_names is a list of the names of the feature columns in the DataFrame. If None, then all columns except the label column are regarded as feature columns.

  • evaluators – The name of the evaluator to use for model evaluation, or a list of evaluator names. If unspecified, all evaluators capable of evaluating the specified model on the specified dataset are used. The default evaluator can be referred to by the name "default". To see all available evaluators, call mlflow.models.list_evaluators().

  • evaluator_config – A dictionary of additional configurations to supply to the evaluator. If multiple evaluators are specified, each configuration should be supplied as a nested dictionary whose key is the evaluator name.

  • custom_metrics

    (Optional) A list of custom metric functions. A custom metric function is required to take in two parameters:

    • Union[pandas.DataFrame, pyspark.sql.DataFrame]: The first is a Pandas or Spark DataFrame containing prediction and target columns. The prediction column contains the predictions made by the model. The target column contains the corresponding labels for the predictions made on that row.

    • Dict: The second is a dictionary containing the metrics calculated by the default evaluator. The keys are the names of the metrics and the values are the scalar values of the metrics. Refer to the Default Evaluator behavior section above for the metrics returned based on the type of model (i.e. classifier or regressor).

    A custom metric function can return in the following format:

    • Dict[AnyStr, Union[int, float, np.number]]: a single dictionary of custom metrics, where the keys are the names of the metrics, and the values are the scalar values of the metrics.

    Custom Metric Function Boilerplate
    def custom_metrics_boilerplate(eval_df, builtin_metrics):
        # ...
        metrics: Dict[AnyStr, Union[int, float, np.number]] = some_dict
        # ...
        return metrics
    
    Example usage of custom metrics
    def squared_diff_plus_one(eval_df, builtin_metrics):
        return {
            "squared_diff_plus_one": (
                np.sum(
                    np.abs(
                        eval_df["prediction"] - eval_df["target"] + 1
                    ) ** 2
                )
            )
        }
    
    with mlflow.start_run():
        mlflow.evaluate(
            model,
            X,
            targets,
            custom_metrics=[squared_diff_plus_one, ...],
        )
    

Returns

An mlflow.models.EvaluationResult instance containing evaluation results.
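Below is a minimal, hedged sketch of evaluating a logged scikit-learn binary classifier with the default evaluator; the dataset, dataset name, and artifact path are illustrative, model explainability logging is disabled to avoid the SHAP dependency, and it assumes an MLflow version where log_model returns a ModelInfo.

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = load_breast_cancer(as_frame=True).frame
train, eval_data = train_test_split(data, random_state=42)
clf = LogisticRegression(max_iter=5000).fit(
    train.drop(columns=["target"]), train["target"]
)

with mlflow.start_run():
    model_info = mlflow.sklearn.log_model(clf, artifact_path="model")
    result = mlflow.evaluate(
        model=model_info.model_uri,
        data=eval_data,
        targets="target",
        model_type="classifier",
        dataset_name="breast_cancer_eval",
        evaluators="default",
        evaluator_config={"log_model_explainability": False},
    )

print(result.metrics)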

mlflow.models.infer_pip_requirements(model_uri, flavor, fallback=None)[source]

Infers the pip requirements of the specified model by creating a subprocess and loading the model in it to determine which packages are imported.

Parameters
  • model_uri – The URI of the model.

  • flavor – The flavor name of the model.

  • fallback – If provided, an unexpected error during the inference procedure is swallowed and the value of fallback is returned. Otherwise, the error is raised.

Returns

A list of inferred pip requirements (e.g. ["scikit-learn==0.24.2", ...]).
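A minimal sketch, assuming a previously logged scikit-learn model (the runs:/ URI is a placeholder and the fallback list is illustrative):

import mlflow.models

reqs = mlflow.models.infer_pip_requirements(
    model_uri="runs:/<run_id>/model",
    flavor="sklearn",
    fallback=["scikit-learn"],
)
print(reqs)  # e.g. ["scikit-learn==0.24.2", ...]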

mlflow.models.infer_signature(model_input: Any, model_output: MlflowInferableDataset = None) → mlflow.models.signature.ModelSignature[source]

Infer an MLflow model signature from the training data (input) and model predictions (output).

The signature represents model input and output as data frames with (optionally) named columns and a data type specified as one of the types defined in mlflow.types.DataType. This method will raise an exception if the user data contains incompatible types or is not passed in one of the supported formats listed below.

The input should be one of these:
  • pandas.DataFrame

  • dictionary of { name -> numpy.ndarray}

  • numpy.ndarray

  • pyspark.sql.DataFrame

The element types should be mappable to one of mlflow.types.DataType.

For pyspark.sql.DataFrame inputs, columns of type DateType and TimestampType are both inferred as type datetime, which is coerced to TimestampType at inference.

NOTE: Multidimensional (>2d) arrays (aka tensors) are not supported at this time.

Parameters
  • model_input – Valid input to the model. E.g. (a subset of) the training dataset.

  • model_output – Valid model output. E.g. Model predictions for the (subset of) training dataset.

Returns

ModelSignature
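For example, a minimal sketch of inferring a signature from a pandas DataFrame and the corresponding model predictions (the dataset and model are illustrative):

from mlflow.models import infer_signature
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris(as_frame=True)
model = RandomForestClassifier().fit(iris.data, iris.target)

signature = infer_signature(iris.data, model.predict(iris.data))
print(signature)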

mlflow.models.list_evaluators()[source]

Return a list of the names of all available Evaluators.

class mlflow.models.model.ModelInfo(artifact_path: str, flavors: Dict[str, Any], model_uri: str, model_uuid: str, run_id: str, saved_input_example_info: Optional[Dict[str, Any]], signature_dict: Optional[Dict[str, Any]], utc_time_created: str)[source]

The metadata of a logged MLflow Model.

property artifact_path

Run relative path identifying the logged model.

property flavors

A dictionary mapping the flavor name to how to serve the model as that flavor. For example:

{
    "python_function": {
        "model_path": "model.pkl",
        "loader_module": "mlflow.sklearn",
        "python_version": "3.8.10",
        "env": "conda.yaml",
    },
    "sklearn": {
        "pickled_model": "model.pkl",
        "sklearn_version": "0.24.1",
        "serialization_format": "cloudpickle",
    },
}

property model_uri

The model_uri of the logged model in the format 'runs:/<run_id>/<artifact_path>'.

property model_uuid

The model_uuid of the logged model, e.g., '39ca11813cfc46b09ab83972740b80ca'.

property run_id

The run_id associated with the logged model, e.g., '8ede7df408dd42ed9fc39019ef7df309'

property saved_input_example_info

A dictionary that contains the metadata of the saved input example, e.g., {"artifact_path": "input_example.json", "type": "dataframe", "pandas_orient": "split"}.

property signature_dict

A dictionary that describes the model input and output generated by ModelSignature.to_dict().

property utc_time_created

The UTC time that the logged model is created, e.g., '2022-01-12 05:17:31.634689'.
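A minimal sketch of obtaining and inspecting a ModelInfo instance (the model and artifact path are illustrative, assuming an MLflow version where log_model returns a ModelInfo):

import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression().fit(np.array([[1.0], [2.0]]), np.array([2.0, 4.0]))

with mlflow.start_run():
    model_info = mlflow.sklearn.log_model(model, artifact_path="model")

print(model_info.model_uri)        # 'runs:/<run_id>/model'
print(list(model_info.flavors))    # e.g. ['python_function', 'sklearn']
print(model_info.utc_time_created)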