mlflow.deployments
Exposes functionality for deploying MLflow models to custom serving tools.
Note: model deployment to AWS Sagemaker can currently be performed via the
mlflow.sagemaker
module. Model deployment to Azure can be performed by using the
azureml library.
MLflow does not currently provide built-in support for any other deployment targets, but support for custom targets can be installed via third-party plugins. See a list of known plugins here.
This page largely focuses on the user-facing deployment APIs. For instructions on implementing your own plugin for deployment to a custom serving tool, see plugin docs.
-
class
mlflow.deployments.
BaseDeploymentClient
(target_uri)[source] Base class exposing Python model deployment APIs.
Plugin implementors should define target-specific deployment logic via a subclass of
BaseDeploymentClient
within the plugin module, and customize method docstrings with target-specific information.
Note
Subclasses should raise
mlflow.exceptions.MlflowException
in error cases (e.g. on failure to deploy a model).
-
abstract
create_deployment
(name, model_uri, flavor=None, config=None, endpoint=None)[source] Deploy a model to the specified target. By default, this method should block until deployment completes (i.e. until it’s possible to perform inference with the deployment). In the case of conflicts (e.g. if it’s not possible to create the specified deployment due to a conflict with an existing deployment), raises a
mlflow.exceptions.MlflowException
. See target-specific plugin documentation for additional detail on support for asynchronous deployment and other configuration.
- Parameters
name – Unique name to use for deployment. If another deployment exists with the same name, raises a
mlflow.exceptions.MlflowException
model_uri – URI of model to deploy
flavor – (optional) Model flavor to deploy. If unspecified, a default flavor will be chosen.
config – (optional) Dict containing updated target-specific configuration for the deployment
endpoint – (optional) Endpoint to create the deployment under. May not be supported by all targets
- Returns
Dict corresponding to created deployment, which must contain the ‘name’ key.
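The conflict and return-value rules above can be sketched with a toy in-memory target. This is a minimal illustration, not MLflow code: `InMemoryClient`, `DeploymentError` (a stand-in for `mlflow.exceptions.MlflowException`), and the `"python_function"` default flavor are assumptions chosen so the sketch runs without MLflow installed. A real plugin would subclass `BaseDeploymentClient` instead.

```python
class DeploymentError(Exception):
    """Stand-in for mlflow.exceptions.MlflowException (assumption)."""


class InMemoryClient:
    """Hypothetical toy target that keeps deployments in a dict."""

    def __init__(self, target_uri):
        self.target_uri = target_uri
        self._deployments = {}

    def create_deployment(self, name, model_uri, flavor=None, config=None, endpoint=None):
        # Conflict: a deployment with this name already exists.
        if name in self._deployments:
            raise DeploymentError(f"Deployment {name!r} already exists")
        record = {
            "name": name,  # the only key the contract requires
            "model_uri": model_uri,
            "flavor": flavor or "python_function",  # hypothetical default flavor
            "config": dict(config or {}),
        }
        self._deployments[name] = record
        # Return a copy so callers cannot mutate the stored record.
        return dict(record)


client = InMemoryClient("in-memory")
created = client.create_deployment("spamDetector", "runs:/someRunId/myModel")

# A second create with the same name must raise rather than overwrite.
try:
    client.create_deployment("spamDetector", "runs:/anotherRunId/myModel")
    conflict_raised = False
except DeploymentError:
    conflict_raised = True
```

Only the `name` key is required in the returned dict. The other fields here are illustrative and would vary by target.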
-
create_endpoint
(name, config=None)[source] Create an endpoint with the specified target. By default, this method should block until creation completes (i.e. until it’s possible to create a deployment within the endpoint). In the case of conflicts (e.g. if it’s not possible to create the specified endpoint due to conflict with an existing endpoint), raises a
mlflow.exceptions.MlflowException
. See target-specific plugin documentation for additional detail on support for asynchronous creation and other configuration.
- Parameters
name – Unique name to use for endpoint. If another endpoint exists with the same name, raises a
mlflow.exceptions.MlflowException
.
config – (optional) Dict containing target-specific configuration for the endpoint.
- Returns
Dict corresponding to created endpoint, which must contain the ‘name’ key.
-
abstract
delete_deployment
(name, config=None, endpoint=None)[source] Delete the deployment with name
name
from the specified target. Deletion should be idempotent (i.e. deletion should not fail if retried on a non-existent deployment).
- Parameters
name – Name of deployment to delete
config – (optional) dict containing updated target-specific configuration for the deployment
endpoint – (optional) Endpoint containing the deployment to delete. May not be supported by all targets
- Returns
None
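Idempotent deletion, as described above, means a retried delete is a no-op rather than an error. One way a plugin might satisfy this is sketched below. The `InMemoryClient` class and its dict-backed store are hypothetical and only mirror the method signature.

```python
class InMemoryClient:
    """Hypothetical toy target that keeps deployments in a dict."""

    def __init__(self):
        self._deployments = {"spamDetector": {"name": "spamDetector"}}

    def delete_deployment(self, name, config=None, endpoint=None):
        # dict.pop with a default never raises, so deleting a name that is
        # already gone succeeds silently.
        self._deployments.pop(name, None)
        return None


client = InMemoryClient()
first = client.delete_deployment("spamDetector")   # removes the deployment
second = client.delete_deployment("spamDetector")  # retry: nothing to remove, no error
remaining = list(client._deployments)
```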
-
delete_endpoint
(endpoint)[source] Delete the endpoint from the specified target. Deletion should be idempotent (i.e. deletion should not fail if retried on a non-existent endpoint).
- Parameters
endpoint – Name of endpoint to delete
- Returns
None
-
explain
(deployment_name=None, df=None, endpoint=None)[source] Generate explanations of model predictions on the specified input pandas DataFrame
df
for the deployed model. Explanation output formats vary by deployment target, and can include details like feature importance for understanding/debugging predictions.
- Parameters
deployment_name – Name of deployment to predict against
df – Pandas DataFrame to use for explaining feature importance in model prediction
endpoint – Endpoint to predict against. May not be supported by all targets
- Returns
A JSON-able object (pandas DataFrame, NumPy array, or dictionary). Raises an exception if the deployment target’s client class does not implement this method.
-
abstract
get_deployment
(name, endpoint=None)[source] Returns a dictionary describing the specified deployment, throwing a
mlflow.exceptions.MlflowException
if no deployment exists with the provided ID. The dict is guaranteed to contain a ‘name’ key containing the deployment name. The other fields of the returned dictionary and their types may vary across deployment targets.
- Parameters
name – ID of deployment to fetch
endpoint – (optional) Endpoint containing the deployment to get. May not be supported by all targets
- Returns
A dict corresponding to the retrieved deployment. The dict is guaranteed to contain a ‘name’ key corresponding to the deployment name. The other fields of the returned dictionary and their types may vary across targets.
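Because `get_deployment` signals a missing deployment by raising rather than returning `None`, an existence check has to catch the exception. The sketch below shows that pattern against a toy client. `DeploymentError` stands in for `mlflow.exceptions.MlflowException`, and `InMemoryClient`, `deployment_exists`, and the `replicas` field are hypothetical names used only for illustration.

```python
class DeploymentError(Exception):
    """Stand-in for mlflow.exceptions.MlflowException (assumption)."""


class InMemoryClient:
    """Hypothetical toy target that keeps deployments in a dict keyed by name."""

    def __init__(self, deployments):
        self._deployments = {d["name"]: d for d in deployments}

    def get_deployment(self, name, endpoint=None):
        try:
            # Copy so callers cannot mutate the stored record.
            return dict(self._deployments[name])
        except KeyError:
            raise DeploymentError(f"No deployment named {name!r}") from None


def deployment_exists(client, name):
    """Return True if the deployment can be fetched, False if the lookup raises."""
    try:
        client.get_deployment(name)
    except DeploymentError:
        return False
    return True


client = InMemoryClient([{"name": "spamDetector", "replicas": 2}])
found = client.get_deployment("spamDetector")
```

Callers should rely only on the `name` key. Any other field, such as `replicas` here, is target-specific.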
-
get_endpoint
(endpoint)[source] Returns a dictionary describing the specified endpoint, throwing a mlflow.exceptions.MlflowException if no endpoint exists with the provided name. The dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.
- Parameters
endpoint – Name of endpoint to fetch
- Returns
A dict corresponding to the retrieved endpoint. The dict is guaranteed to contain a ‘name’ key corresponding to the endpoint name. The other fields of the returned dictionary and their types may vary across targets.
-
abstract
list_deployments
(endpoint=None)[source] List deployments. This method is expected to return an unpaginated list of all deployments (an alternative would be to return a dict with a ‘deployments’ field containing the actual deployments, with plugins able to specify other fields, e.g. a next_page_token field, in the returned dictionary for pagination, and to accept a pagination_args argument to this method for passing pagination-related args).
- Parameters
endpoint – (optional) List deployments in the specified endpoint. May not be supported by all targets
- Returns
A list of dicts corresponding to deployments. Each dict is guaranteed to contain a ‘name’ key containing the deployment name. The other fields of the returned dictionary and their types may vary across deployment targets.
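The two return shapes discussed above, the expected unpaginated list and the alternative paginated dict, can be compared side by side. This is only a sketch of the idea: the function names and the `page_token` and `page_size` keys inside `pagination_args` are assumptions, since the text names only `deployments`, `next_page_token`, and `pagination_args`.

```python
def list_deployments_unpaginated(store):
    """Expected shape: a plain list of dicts, each with a 'name' key."""
    return [dict(d) for d in store]


def list_deployments_paginated(store, pagination_args=None):
    """Alternative shape: a dict with a 'deployments' field plus paging metadata."""
    args = pagination_args or {}
    start = int(args.get("page_token", 0))  # hypothetical key: offset into the store
    size = int(args.get("page_size", 2))    # hypothetical key: items per page
    result = {"deployments": [dict(d) for d in store[start:start + size]]}
    # Only advertise a next page when items remain.
    if start + size < len(store):
        result["next_page_token"] = str(start + size)
    return result


store = [{"name": f"model-{i}"} for i in range(3)]

all_names = [d["name"] for d in list_deployments_unpaginated(store)]
first_page = list_deployments_paginated(store)
second_page = list_deployments_paginated(
    store, {"page_token": first_page["next_page_token"]}
)
```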
-
list_endpoints
()[source] List endpoints in the specified target. This method is expected to return an unpaginated list of all endpoints (an alternative would be to return a dict with an ‘endpoints’ field containing the actual endpoints, with plugins able to specify other fields, e.g. a next_page_token field, in the returned dictionary for pagination, and to accept a pagination_args argument to this method for passing pagination-related args).
- Returns
A list of dicts corresponding to endpoints. Each dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.
-
abstract
predict
(deployment_name=None, inputs=None, endpoint=None)[source] Compute predictions on inputs using the specified deployment or model endpoint. Note that the input/output types of this method match those of mlflow pyfunc predict.
- Parameters
deployment_name – Name of deployment to predict against
inputs – Input data (or arguments) to pass to the deployment or model endpoint for inference
endpoint – Endpoint to predict against. May not be supported by all targets
- Returns
A
mlflow.deployments.PredictionsResponse
instance representing the predictions and associated Model Server response metadata.
-
abstract
update_deployment
(name, model_uri=None, flavor=None, config=None, endpoint=None)[source] Update the deployment with the specified name. You can update the URI of the model, the flavor of the deployed model (in which case the model URI must also be specified), and/or any target-specific attributes of the deployment (via config). By default, this method should block until deployment completes (i.e. until it’s possible to perform inference with the updated deployment). See target-specific plugin documentation for additional detail on support for asynchronous deployment and other configuration.
- Parameters
name – Unique name of deployment to update
model_uri – URI of a new model to deploy.
flavor – (optional) new model flavor to use for deployment. If provided,
model_uri
must also be specified. If
flavor
is unspecified but
model_uri
is specified, a default flavor will be chosen and the deployment will be updated using that flavor.
config – (optional) dict containing updated target-specific configuration for the deployment
endpoint – (optional) Endpoint containing the deployment to update. May not be supported by all targets
- Returns
None
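The argument rules for `update_deployment` (a flavor requires a model URI, and a model URI alone selects a default flavor) can be expressed as a small validation step. The sketch below is one possible reading of those rules. `resolve_update`, `DeploymentError` (standing in for `mlflow.exceptions.MlflowException`), the `"python_function"` default, and the merge-style handling of `config` are all assumptions, not MLflow behavior.

```python
class DeploymentError(Exception):
    """Stand-in for mlflow.exceptions.MlflowException (assumption)."""


def resolve_update(current, model_uri=None, flavor=None, config=None):
    """Return the deployment record that an update would produce."""
    # A new flavor only makes sense together with a new model.
    if flavor is not None and model_uri is None:
        raise DeploymentError("model_uri must be specified when flavor is provided")
    updated = dict(current)
    if model_uri is not None:
        updated["model_uri"] = model_uri
        # No flavor given: fall back to a default (hypothetical choice).
        updated["flavor"] = flavor or "python_function"
    if config:
        # Assumption: new config keys are merged over the existing ones.
        updated["config"] = {**current.get("config", {}), **config}
    return updated


current = {
    "name": "spamDetector",
    "model_uri": "runs:/someRunId/myModel",
    "flavor": "sklearn",
    "config": {"replicas": 1},
}

swapped = resolve_update(current, model_uri="runs:/anotherRunId/myModel")
scaled = resolve_update(current, config={"replicas": 3})

try:
    resolve_update(current, flavor="sklearn")
    flavor_only_rejected = False
except DeploymentError:
    flavor_only_rejected = True
```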
-
update_endpoint
(endpoint, config=None)[source] Update the endpoint with the specified name. You can update any target-specific attributes of the endpoint (via config). By default, this method should block until the update completes (i.e. until it’s possible to create a deployment within the endpoint). See target-specific plugin documentation for additional detail on support for asynchronous update and other configuration.
- Parameters
endpoint – Unique name of endpoint to update
config – (optional) dict containing target-specific configuration for the endpoint
- Returns
None
-
mlflow.deployments.
get_deploy_client
(target_uri)[source] Returns a subclass of
mlflow.deployments.BaseDeploymentClient
exposing standard APIs for deploying models to the specified target. See available deployment APIs by calling
help()
on the returned object or viewing docs for
mlflow.deployments.BaseDeploymentClient
. You can also run
mlflow deployments help -t <target-uri>
via the CLI for more details on target-specific configuration options.
- Parameters
target_uri – URI of target to deploy to.
from mlflow.deployments import get_deploy_client
import pandas as pd

client = get_deploy_client("redisai")

# Deploy the model stored at artifact path 'myModel' under run with ID 'someRunId'. The
# model artifacts are fetched from the current tracking server and then used for deployment.
client.create_deployment("spamDetector", "runs:/someRunId/myModel")

# Load a CSV of emails and score it against our deployment.
# predict() returns a PredictionsResponse; get_predictions() defaults to a DataFrame.
emails_df = pd.read_csv("...")
response = client.predict("spamDetector", emails_df)
prediction_df = response.get_predictions()

# List all deployments, get details of our particular deployment
print(client.list_deployments())
print(client.get_deployment("spamDetector"))

# Update our deployment to serve a different model
client.update_deployment("spamDetector", "runs:/anotherRunId/myModel")

# Delete our deployment
client.delete_deployment("spamDetector")
-
mlflow.deployments.
run_local
(target, name, model_uri, flavor=None, config=None)[source] Deploys the specified model locally, for testing. Note that models deployed locally cannot be managed by other deployment APIs (e.g.
update_deployment
,
delete_deployment
, etc).
- Parameters
target – Target to deploy to.
name – Name to use for deployment
model_uri – URI of model to deploy
flavor – (optional) Model flavor to deploy. If unspecified, a default flavor will be chosen.
config – (optional) Dict containing updated target-specific configuration for the deployment
- Returns
None
-
class
mlflow.deployments.
PredictionsResponse
[source] Represents the predictions and metadata returned in response to a scoring request, such as a REST API request sent to the
/invocations
endpoint of an MLflow Model Server.
-
get_predictions
(predictions_format='dataframe', dtype=None)[source] Get the predictions returned from the MLflow Model Server in the specified format.
- Parameters
predictions_format – The format in which to return the predictions. Either
"dataframe"
or
"ndarray"
.
dtype – The NumPy datatype to which to coerce the predictions. Only used when the
"ndarray"
predictions_format
is specified.
- Throws
Exception if the predictions cannot be represented in the specified format.
- Returns
The predictions, represented in the specified format.
-
to_json
(path=None)[source] Get the JSON representation of the MLflow Predictions Response.
- Parameters
path – If specified, the JSON representation is written to this file path.
- Returns
If
path
is unspecified, the JSON representation of the MLflow Predictions Response. Else, None.
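The two return modes of `to_json` (a JSON string when `path` is omitted, `None` after writing to a file otherwise) are easy to mix up, so a stand-in class may help. `ResponseSketch`, its dict-based storage, and the `predictions` key are assumptions for illustration. The real `PredictionsResponse` may be structured differently.

```python
import json
import os
import tempfile


class ResponseSketch(dict):
    """Hypothetical stand-in that mimics only the to_json(path=None) contract."""

    def to_json(self, path=None):
        if path is None:
            # No path: hand the JSON text back to the caller.
            return json.dumps(dict(self))
        # Path given: write the JSON to the file and return nothing.
        with open(path, "w") as f:
            json.dump(dict(self), f)
        return None


response = ResponseSketch(predictions=[0, 1, 1])
as_text = response.to_json()

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "predictions.json")
    write_result = response.to_json(path=out_path)
    # Read the file back to confirm it holds the same payload.
    with open(out_path) as f:
        round_trip = json.load(f)
```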
-