mlflow.tracking

The mlflow.tracking module provides a Python CRUD interface to MLflow experiments and runs. This is a lower-level API that translates directly to MLflow REST API calls. For a higher-level API for managing an “active run”, use the mlflow module.

class mlflow.tracking.MlflowClient(tracking_uri=None, registry_uri=None)[source]

Bases: object

Client of an MLflow Tracking Server that creates and manages experiments and runs, and of an MLflow Registry Server that creates and manages registered models and model versions. It is a thin wrapper around TrackingServiceClient and RegistryClient that exposes a unified API while keeping the implementations of the tracking and registry clients independent of each other.

create_experiment(name, artifact_location=None)[source]

Create an experiment.

Parameters
  • name – The experiment name. Must be unique.

  • artifact_location – The location to store run artifacts. If not provided, the server picks an appropriate default.

Returns

The ID of the created experiment, an integer represented as a string.

Example
from mlflow.tracking import MlflowClient

# Create an experiment with a name that is unique and case sensitive.
client = MlflowClient()
experiment_id = client.create_experiment("Social NLP Experiments")
client.set_experiment_tag(experiment_id, "nlp.framework", "Spark NLP")

# Fetch experiment metadata information
experiment = client.get_experiment(experiment_id)
print("Name: {}".format(experiment.name))
print("Experiment_id: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Tags: {}".format(experiment.tags))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
Output
Name: Social NLP Experiments
Experiment_id: 1
Artifact Location: file:///.../mlruns/1
Tags: {'nlp.framework': 'Spark NLP'}
Lifecycle_stage: active
create_model_version(name, source, run_id=None, tags=None, run_link=None, description=None, await_creation_for=300)[source]

Create a new model version from the given source (artifact URI).

Parameters
  • name – Name for the containing registered model.

  • source – Source path where the MLflow model is stored.

  • run_id – Run ID from the MLflow tracking server that generated the model.

  • tags – A dictionary of key-value pairs that are converted into mlflow.entities.model_registry.ModelVersionTag objects.

  • run_link – Link to the run from an MLflow tracking server that generated this model.

  • description – Description of the version.

  • await_creation_for – Number of seconds to wait for the model version to finish being created and reach READY status. By default, the function waits for five minutes. Specify 0 or None to skip waiting.

Returns

Single mlflow.entities.model_registry.ModelVersion object created by backend.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
desc = "A new version of the model"
model_uri = "runs:/{}/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id, description=desc)
print("Name: {}".format(mv.name))
print("Version: {}".format(mv.version))
print("Description: {}".format(mv.description))
print("Status: {}".format(mv.status))
print("Stage: {}".format(mv.current_stage))
Output
Name: RandomForestRegression
Version: 1
Description: A new version of the model
Status: READY
Stage: None
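The await_creation_for behavior described in the parameters above can be pictured as a poll-until-ready loop. The sketch below is not MLflow's implementation: wait_until_ready, get_status, and poll_interval are hypothetical names, and the PENDING_REGISTRATION strings are stand-in values. Only the READY status and the "0 or None skips waiting" convention come from the parameter description.

```python
import time


def wait_until_ready(get_status, await_creation_for=300, poll_interval=1.0):
    """Poll get_status() until it returns "READY" or the timeout elapses.

    get_status is a hypothetical zero-argument callable returning a status string.
    Returns the last observed status, or None when waiting is skipped.
    """
    # Mirror the documented convention: 0 or None skips waiting entirely.
    if not await_creation_for:
        return None
    deadline = time.monotonic() + await_creation_for
    status = get_status()
    while status != "READY" and time.monotonic() < deadline:
        time.sleep(poll_interval)
        status = get_status()
    return status


# Usage with a stand-in status source that becomes READY on the third poll.
statuses = iter(["PENDING_REGISTRATION", "PENDING_REGISTRATION", "READY"])
final_status = wait_until_ready(lambda: next(statuses), await_creation_for=5, poll_interval=0.01)
print(final_status)
```

A caller that passes 0 or None would get control back immediately and would need to check the version's status itself before using the model.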
create_registered_model(name, tags=None, description=None)[source]

Create a new registered model in the backend store.

Parameters
  • name – Name of the new model. This is expected to be unique in the backend store.

  • tags – A dictionary of key-value pairs that are converted into mlflow.entities.model_registry.RegisteredModelTag objects.

  • description – Description of the model.

Returns

A single object of mlflow.entities.model_registry.RegisteredModel created by backend.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_registered_model_info(rm):
    print("name: {}".format(rm.name))
    print("tags: {}".format(rm.tags))
    print("description: {}".format(rm.description))

name = "SocialMediaTextAnalyzer"
tags = {"nlp.framework": "Spark NLP"}
desc = "This sentiment analysis model classifies the tone-happy, sad, angry."

mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()
client.create_registered_model(name, tags, desc)
print_registered_model_info(client.get_registered_model(name))
Output
name: SocialMediaTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies the tone-happy, sad, angry.
create_run(experiment_id, start_time=None, tags=None)[source]

Create a mlflow.entities.Run object that can be associated with metrics, parameters, artifacts, etc. Unlike mlflow.projects.run(), this method creates objects but does not run code. Unlike mlflow.start_run(), it does not change the “active run” used by mlflow.log_param().

Parameters
  • experiment_id – The string ID of the experiment to create a run in.

  • start_time – If not provided, use the current timestamp.

  • tags – A dictionary of key-value pairs that are converted into mlflow.entities.RunTag objects.

Returns

mlflow.entities.Run that was created.

Example
from mlflow.tracking import MlflowClient

# Create a run with a tag under the default experiment (whose id is '0').
tags = {"engineering": "ML Platform"}
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id, tags=tags)

# Show newly created run metadata info
print("Run tags: {}".format(run.data.tags))
print("Experiment id: {}".format(run.info.experiment_id))
print("Run id: {}".format(run.info.run_id))
print("lifecycle_stage: {}".format(run.info.lifecycle_stage))
print("status: {}".format(run.info.status))
Output
Run tags: {'engineering': 'ML Platform'}
Experiment id: 0
Run id: 65fb9e2198764354bab398105f2e70c1
lifecycle_stage: active
status: RUNNING
delete_experiment(experiment_id)[source]

Delete an experiment from the backend store.

Parameters

experiment_id – The experiment ID returned from create_experiment.

Example
from mlflow.tracking import MlflowClient

# Create an experiment with a name that is unique and case sensitive
client = MlflowClient()
experiment_id = client.create_experiment("New Experiment")
client.delete_experiment(experiment_id)

# Examine the deleted experiment details.
experiment = client.get_experiment(experiment_id)
print("Name: {}".format(experiment.name))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
Output
Name: New Experiment
Artifact Location: file:///.../mlruns/1
Lifecycle_stage: deleted
delete_model_version(name, version)[source]

Delete a model version from the backend.

Parameters
  • name – Name of the containing registered model.

  • version – Version number of the model version.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_models_info(mv):
    for m in mv:
        print("name: {}".format(m.name))
        print("latest version: {}".format(m.version))
        print("run_id: {}".format(m.run_id))
        print("current_stage: {}".format(m.current_stage))

mlflow.set_tracking_uri("sqlite:///mlruns.db")

# Create two runs and log MLflow entities
with mlflow.start_run() as run1:
    params = {"n_estimators": 3, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

with mlflow.start_run() as run2:
    params = {"n_estimators": 6, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
name = "RandomForestRegression"
client = MlflowClient()
client.create_registered_model(name)

# Create two versions of the rfr model under the registered model name
for run_id in [run1.info.run_id, run2.info.run_id]:
    model_uri = "runs:/{}/sklearn-model".format(run_id)
    mv = client.create_model_version(name, model_uri, run_id)
    print("model version {} created".format(mv.version))

print("--")

# Fetch latest version; this will be version 2
models = client.get_latest_versions(name, stages=["None"])
print_models_info(models)
print("--")

# Delete the latest model version 2
print("Deleting model version {}".format(mv.version))
client.delete_model_version(name, mv.version)
models = client.get_latest_versions(name, stages=["None"])
print_models_info(models)
Output
model version 1 created
model version 2 created
--
name: RandomForestRegression
latest version: 2
run_id: 9881172ef10f4cb08df3ed452c0c362b
current_stage: None
--
Deleting model version 2
name: RandomForestRegression
latest version: 1
run_id: 9165d4f8aa0a4d069550824bdc55caaf
current_stage: None
delete_model_version_tag(name, version, key)[source]

Delete a tag associated with the model version.

Parameters
  • name – Registered model name.

  • version – Registered model version.

  • key – Tag key.

Returns

None

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_model_version_info(mv):
    print("Name: {}".format(mv.name))
    print("Version: {}".format(mv.version))
    print("Tags: {}".format(mv.tags))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
# and delete a tag
model_uri = "runs:/{}/sklearn-model".format(run.info.run_id)
tags = {'t': "t1"}
mv = client.create_model_version(name, model_uri, run.info.run_id, tags=tags)
print_model_version_info(mv)
print("--")
client.delete_model_version_tag(name, mv.version, "t")
mv = client.get_model_version(name, mv.version)
print_model_version_info(mv)
Output
Name: RandomForestRegression
Version: 1
Tags: {'t': 't1'}
--
Name: RandomForestRegression
Version: 1
Tags: {}
delete_registered_model(name)[source]

Delete a registered model. The backend raises an exception if a registered model with the given name does not exist.

Parameters

name – Name of the registered model to delete.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_registered_models_info(r_models):
    print("--")
    for rm in r_models:
        print("name: {}".format(rm.name))
        print("tags: {}".format(rm.tags))
        print("description: {}".format(rm.description))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()

# Register a couple of models with respective names, tags, and descriptions
for name, tags, desc in [("name1", {"t1": "t1"}, 'description1'),
                         ("name2", {"t2": "t2"}, 'description2')]:
    client.create_registered_model(name, tags, desc)

# Fetch all registered models
print_registered_models_info(client.list_registered_models())

# Delete one registered model and fetch again
client.delete_registered_model("name1")
print_registered_models_info(client.list_registered_models())
Output
--
name: name1
tags: {'t1': 't1'}
description: description1
name: name2
tags: {'t2': 't2'}
description: description2
--
name: name2
tags: {'t2': 't2'}
description: description2
delete_registered_model_tag(name, key)[source]

Delete a tag associated with the registered model.

Parameters
  • name – Registered model name.

  • key – Registered model tag key.

Returns

None

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_registered_models_info(r_models):
    print("--")
    for rm in r_models:
        print("name: {}".format(rm.name))
        print("tags: {}".format(rm.tags))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()

# Register a couple of models with respective names and tags
for name, tags in [("name1", {"t1": "t1"}),("name2", {"t2": "t2"})]:
    client.create_registered_model(name, tags)

# Fetch all registered models
print_registered_models_info(client.list_registered_models())

# Delete a tag from model `name2`
client.delete_registered_model_tag("name2", 't2')
print_registered_models_info(client.list_registered_models())
Output
--
name: name1
tags: {'t1': 't1'}
name: name2
tags: {'t2': 't2'}
--
name: name1
tags: {'t1': 't1'}
name: name2
tags: {}
delete_run(run_id)[source]

Deletes a run with the given ID.

Parameters

run_id – The unique run id to delete.

Example
from mlflow.tracking import MlflowClient

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
run_id = run.info.run_id
print("run_id: {}; lifecycle_stage: {}".format(run_id, run.info.lifecycle_stage))
print("--")
client.delete_run(run_id)
del_run = client.get_run(run_id)
print("run_id: {}; lifecycle_stage: {}".format(run_id, del_run.info.lifecycle_stage))
Output
run_id: a61c7a1851324f7094e8d5014c58c8c8; lifecycle_stage: active
run_id: a61c7a1851324f7094e8d5014c58c8c8; lifecycle_stage: deleted
delete_tag(run_id, key)[source]

Delete a tag from a run. This is irreversible.

Parameters
  • run_id – String ID of the run

  • key – Name of the tag

Example
from mlflow.tracking import MlflowClient

def print_run_info(run):
    print("run_id: {}".format(run.info.run_id))
    print("Tags: {}".format(run.data.tags))

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
tags = {"t1": 1, "t2": 2}
experiment_id = "0"
run = client.create_run(experiment_id, tags=tags)
print_run_info(run)
print("--")

# Delete tag and fetch updated info
client.delete_tag(run.info.run_id, "t1")
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: b7077267a59a45d78cd9be0de4bc41f5
Tags: {'t2': '2', 't1': '1'}
--
run_id: b7077267a59a45d78cd9be0de4bc41f5
Tags: {'t2': '2'}
download_artifacts(run_id, path, dst_path=None)[source]

Download an artifact file or directory from a run to a local directory if applicable, and return a local path for it.

Parameters
  • run_id – The run to download artifacts from.

  • path – Relative source path to the desired artifact.

  • dst_path – Absolute path of the local filesystem destination directory to which to download the specified artifacts. This directory must already exist. If unspecified, the artifacts will either be downloaded to a new uniquely-named directory on the local filesystem or will be returned directly in the case of the LocalArtifactRepository.

Returns

Local path of desired artifact.

Example
import os
import mlflow
from mlflow.tracking import MlflowClient

features = "rooms, zipcode, median_price, school_rating, transport"
with open("features.txt", 'w') as f:
    f.write(features)

# Log artifacts
with mlflow.start_run() as run:
    mlflow.log_artifact("features.txt", artifact_path="features")

# Download artifacts
client = MlflowClient()
local_dir = "/tmp/artifact_downloads"
if not os.path.exists(local_dir):
    os.mkdir(local_dir)
local_path = client.download_artifacts(run.info.run_id, "features", local_dir)
print("Artifacts downloaded in: {}".format(local_path))
print("Artifacts: {}".format(os.listdir(local_path)))
Output
Artifacts downloaded in: /tmp/artifact_downloads/features
Artifacts: ['features.txt']
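Because dst_path must be an absolute path to a directory that already exists, it can help to normalize and create the destination before calling download_artifacts. The helper below is a minimal sketch using only the standard library; prepare_dst_path is a hypothetical name, not part of MLflow.

```python
import os
import tempfile


def prepare_dst_path(path):
    """Return an absolute path to an existing directory, creating it if needed."""
    abs_path = os.path.abspath(path)
    # exist_ok=True makes the call safe when the directory is already there.
    os.makedirs(abs_path, exist_ok=True)
    return abs_path


# Usage: build a destination under a fresh temporary directory.
base = tempfile.mkdtemp()
dst_path = prepare_dst_path(os.path.join(base, "artifact_downloads"))
print(os.path.isabs(dst_path), os.path.isdir(dst_path))
```

The returned value could then be passed as the dst_path argument of download_artifacts.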
get_experiment(experiment_id)[source]

Retrieve an experiment by experiment_id from the backend store.

Parameters

experiment_id – The experiment ID returned from create_experiment.

Returns

mlflow.entities.Experiment

Example
from mlflow.tracking import MlflowClient

client = MlflowClient()
exp_id = client.create_experiment("Experiment")
experiment = client.get_experiment(exp_id)

# Show experiment info
print("Name: {}".format(experiment.name))
print("Experiment ID: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
Output
Name: Experiment
Experiment ID: 1
Artifact Location: file:///.../mlruns/1
Lifecycle_stage: active
get_experiment_by_name(name)[source]

Retrieve an experiment by experiment name from the backend store.

Parameters

name – The experiment name, which is case sensitive.

Returns

mlflow.entities.Experiment

Example
from mlflow.tracking import MlflowClient

# Case-sensitive name
client = MlflowClient()
experiment = client.get_experiment_by_name("Default")

# Show experiment info
print("Name: {}".format(experiment.name))
print("Experiment ID: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
Output
Name: Default
Experiment ID: 0
Artifact Location: file:///.../mlruns/0
Lifecycle_stage: active
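Since experiment names must be unique, a common pattern is to look an experiment up by name and create it only when it is missing. The sketch below relies on the documented get_experiment_by_name and create_experiment methods, and assumes that get_experiment_by_name returns None when no experiment matches. get_or_create_experiment and the stub classes are hypothetical and exist only so the example runs without a tracking server.

```python
def get_or_create_experiment(client, name):
    """Return the ID of the experiment called name, creating it if absent.

    client is expected to expose get_experiment_by_name and create_experiment.
    Assumption: get_experiment_by_name returns None when nothing matches.
    """
    experiment = client.get_experiment_by_name(name)
    if experiment is not None:
        return experiment.experiment_id
    return client.create_experiment(name)


class _StubExperiment:
    """Stand-in for mlflow.entities.Experiment with only the fields used here."""

    def __init__(self, experiment_id, name):
        self.experiment_id = experiment_id
        self.name = name


class _StubClient:
    """In-memory stand-in for MlflowClient, for illustration only."""

    def __init__(self):
        self._by_name = {}

    def get_experiment_by_name(self, name):
        return self._by_name.get(name)

    def create_experiment(self, name):
        experiment_id = str(len(self._by_name) + 1)
        self._by_name[name] = _StubExperiment(experiment_id, name)
        return experiment_id


stub = _StubClient()
first_id = get_or_create_experiment(stub, "Social NLP Experiments")
second_id = get_or_create_experiment(stub, "Social NLP Experiments")
print(first_id, second_id)
```

With a real client, the same function would be called as get_or_create_experiment(MlflowClient(), "Social NLP Experiments").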
get_latest_versions(name, stages=None)[source]

Return the latest model version for each requested stage. If no stages are provided, return the latest version for each stage.

Parameters
  • name – Name of the registered model.

  • stages – List of desired stages. If the input list is None, return the latest versions for ALL_STAGES.

Returns

List of mlflow.entities.model_registry.ModelVersion objects.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_models_info(mv):
    for m in mv:
        print("name: {}".format(m.name))
        print("latest version: {}".format(m.version))
        print("run_id: {}".format(m.run_id))
        print("current_stage: {}".format(m.current_stage))

mlflow.set_tracking_uri("sqlite:///mlruns.db")

# Create two runs and log MLflow entities
with mlflow.start_run() as run1:
    params = {"n_estimators": 3, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

with mlflow.start_run() as run2:
    params = {"n_estimators": 6, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
name = "RandomForestRegression"
client = MlflowClient()
client.create_registered_model(name)

# Create two versions of the rfr model under the registered model name
for run_id in [run1.info.run_id, run2.info.run_id]:
    model_uri = "runs:/{}/sklearn-model".format(run_id)
    mv = client.create_model_version(name, model_uri, run_id)
    print("model version {} created".format(mv.version))

# Fetch latest version; this will be version 2
print("--")
print_models_info(client.get_latest_versions(name, stages=["None"]))
Output
model version 1 created
model version 2 created
--
name: RandomForestRegression
latest version: 2
run_id: 31165664be034dc698c52a4bdeb71663
current_stage: None
get_metric_history(run_id, key)[source]

Return a list of metric objects corresponding to all values logged for a given metric.

Parameters
  • run_id – Unique identifier for run

  • key – Metric name within the run

Returns

A list of mlflow.entities.Metric entities if logged, else an empty list.

Example
from mlflow.tracking import MlflowClient

def print_metric_info(history):
    for m in history:
        print("name: {}".format(m.key))
        print("value: {}".format(m.value))
        print("step: {}".format(m.step))
        print("timestamp: {}".format(m.timestamp))
        print("--")

# Create a run under the default experiment (whose id is "0"). Since this is a low-level
# CRUD operation, create_run only creates the run; you have to end it explicitly
# (see set_terminated at the end of this example).
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
print("run_id: {}".format(run.info.run_id))
print("--")

# Log a couple of metrics, log a second value for each at the next step, and fetch
# each metric's history.
for k, v in [("m1", 1.5), ("m2", 2.5)]:
    client.log_metric(run.info.run_id, k, v, step=0)
    client.log_metric(run.info.run_id, k, v + 1, step=1)
    print_metric_info(client.get_metric_history(run.info.run_id, k))
client.set_terminated(run.info.run_id)
Output
run_id: c360d15714994c388b504fe09ea3c234
--
name: m1
value: 1.5
step: 0
timestamp: 1603423788607
--
name: m1
value: 2.5
step: 1
timestamp: 1603423788608
--
name: m2
value: 2.5
step: 0
timestamp: 1603423788609
--
name: m2
value: 3.5
step: 1
timestamp: 1603423788610
--
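The timestamp values in the output above are 13-digit integers, which suggests milliseconds since the Unix epoch; that reading is an assumption drawn from their magnitude, not something this section states. Under that assumption, a timestamp can be turned into a readable UTC datetime as sketched below. metric_timestamp_to_datetime is a hypothetical helper name.

```python
from datetime import datetime, timezone


def metric_timestamp_to_datetime(timestamp_ms):
    """Convert a millisecond epoch timestamp to a timezone-aware UTC datetime."""
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(timestamp_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=millis * 1000)


# Timestamp copied from the first metric in the example output above.
logged_at = metric_timestamp_to_datetime(1603423788607)
print(logged_at.isoformat())
```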
get_model_version(name, version)[source]
Parameters
  • name – Name of the containing registered model.

  • version – Version number as an integer of the model version.

Returns

A single mlflow.entities.model_registry.ModelVersion object.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
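from sklearn.ensemble import RandomForestRegressor

def print_model_version_info(mv):
    print("Name: {}".format(mv.name))
    print("Version: {}".format(mv.version))

mlflow.set_tracking_uri("sqlite:///mlruns.db")

# Create two runs and log MLflow entities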
with mlflow.start_run() as run1:
    params = {"n_estimators": 3, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

with mlflow.start_run() as run2:
    params = {"n_estimators": 6, "random_state": 42}
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
name = "RandomForestRegression"
client = MlflowClient()
client.create_registered_model(name)

# Create two versions of the rfr model under the registered model name
for run_id in [run1.info.run_id, run2.info.run_id]:
    model_uri = "runs:/{}/sklearn-model".format(run_id)
    mv = client.create_model_version(name, model_uri, run_id)
    print("model version {} created".format(mv.version))
print("--")

# Fetch the last version; this will be version 2
mv = client.get_model_version(name, mv.version)
print_model_version_info(mv)
Output
model version 1 created
model version 2 created
--
Name: RandomForestRegression
Version: 2
get_model_version_download_uri(name, version)[source]

Get the download location in Model Registry for this model version.

Parameters
  • name – Name of the containing registered model.

  • version – Version number as an integer of the model version.

Returns

A single URI location that allows reads for downloading.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="models/sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
model_uri = "runs:/{}/models/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id)
artifact_uri = client.get_model_version_download_uri(name, mv.version)
print("Download URI: {}".format(artifact_uri))
Output
Download URI: runs:/44e04097ac364cd895f2039eaccca9ac/models/sklearn-model
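The examples in this section build model URIs of the form runs:/<run_id>/<artifact_path>. The helpers below are a small sketch for composing and splitting that form with plain string operations. build_runs_uri and split_runs_uri are hypothetical names, not MLflow functions, and they handle only the runs:/ form shown here; a download URI returned by a real registry may use a different scheme.

```python
def build_runs_uri(run_id, artifact_path):
    """Compose a runs:/ URI from a run ID and a relative artifact path."""
    return "runs:/{}/{}".format(run_id, artifact_path)


def split_runs_uri(uri):
    """Split a runs:/ URI into (run_id, artifact_path)."""
    prefix = "runs:/"
    if not uri.startswith(prefix):
        raise ValueError("Not a runs:/ URI: {!r}".format(uri))
    # Everything up to the first slash is the run ID; the rest is the path.
    run_id, _, artifact_path = uri[len(prefix):].partition("/")
    return run_id, artifact_path


# Run ID copied from the example output above.
uri = build_runs_uri("44e04097ac364cd895f2039eaccca9ac", "models/sklearn-model")
print(uri)
print(split_runs_uri(uri))
```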
get_model_version_stages(name, version)[source]
Returns

A list of valid stages.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="models/sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
# fetch valid stages
model_uri = "runs:/{}/models/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id)
stages = client.get_model_version_stages(name, mv.version)
print("Model list of valid stages: {}".format(stages))
Output
Model list of valid stages: ['None', 'Staging', 'Production', 'Archived']
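Note that the first valid stage is the string 'None', not Python's None, which is why other examples in this section pass stages=["None"]. A stage name can be checked against the returned list before it is used elsewhere. The sketch below hard-codes the list from the example output for illustration; check_stage is a hypothetical helper.

```python
# Copied from the example output above; a real caller would use the list
# returned by get_model_version_stages instead.
VALID_STAGES = ["None", "Staging", "Production", "Archived"]


def check_stage(stage, valid_stages=VALID_STAGES):
    """Return stage unchanged if it is valid, otherwise raise ValueError."""
    if stage not in valid_stages:
        raise ValueError(
            "Invalid stage {!r}; expected one of {}".format(stage, valid_stages)
        )
    return stage


print(check_stage("Staging"))
print(check_stage("None"))  # the string "None" is a valid stage name
```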
get_registered_model(name)[source]
Parameters

name – Name of the registered model to retrieve.

Returns

A single mlflow.entities.model_registry.RegisteredModel object.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_model_info(rm):
    print("--")
    print("name: {}".format(rm.name))
    print("tags: {}".format(rm.tags))
    print("description: {}".format(rm.description))

name = "SocialMediaTextAnalyzer"
tags = {"nlp.framework": "Spark NLP"}
desc = "This sentiment analysis model classifies the tone-happy, sad, angry."
mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()

# Create and fetch the registered model
client.create_registered_model(name, tags, desc)
model = client.get_registered_model(name)
print_model_info(model)
Output
--
name: SocialMediaTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies the tone-happy, sad, angry.
get_run(run_id)[source]

Fetch the run from backend store. The resulting Run contains a collection of run metadata – RunInfo, as well as a collection of run parameters, tags, and metrics – RunData. In the case where multiple metrics with the same key are logged for the run, the RunData contains the most recently logged value at the largest step for each metric.

Parameters

run_id – Unique identifier for the run.

Returns

A single mlflow.entities.Run object, if the run exists. Otherwise, raises an exception.

Example
import mlflow
from mlflow.tracking import MlflowClient

with mlflow.start_run() as run:
    mlflow.log_param("p", 0)

# The run has finished since we have exited the with block
# Fetch the run
client = MlflowClient()
run = client.get_run(run.info.run_id)
print("run_id: {}".format(run.info.run_id))
print("params: {}".format(run.data.params))
print("status: {}".format(run.info.status))
Output
run_id: e36b42c587a1413ead7c3b6764120618
params: {'p': '0'}
status: FINISHED
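The rule that RunData keeps "the most recently logged value at the largest step" can be illustrated with a small selection over a metric history. This is a sketch of the idea rather than MLflow's code: the Metric namedtuple is a stand-in for mlflow.entities.Metric, latest_metric is a hypothetical helper, and breaking ties within a step by timestamp is an assumption.

```python
from collections import namedtuple

# Stand-in for mlflow.entities.Metric with the fields printed in this section.
Metric = namedtuple("Metric", ["key", "value", "timestamp", "step"])


def latest_metric(history):
    """Pick the entry at the largest step, breaking ties by timestamp.

    history must be a non-empty iterable of Metric-like objects.
    """
    return max(history, key=lambda m: (m.step, m.timestamp))


# Values copied from the get_metric_history example output for metric m1.
history = [
    Metric("m1", 1.5, 1603423788607, 0),
    Metric("m1", 2.5, 1603423788608, 1),
]
print(latest_metric(history).value)
```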
list_artifacts(run_id, path=None)[source]

List the artifacts for a run.

Parameters
  • run_id – The run to list artifacts from.

  • path – The run’s relative artifact path to list from. Defaults to None, which lists from the root artifact path.

Returns

List of mlflow.entities.FileInfo

Example
from mlflow.tracking import MlflowClient

def print_artifact_info(artifact):
    print("artifact: {}".format(artifact.path))
    print("is_dir: {}".format(artifact.is_dir))
    print("size: {}".format(artifact.file_size))

features = "rooms zipcode, median_price, school_rating, transport"
labels = "price"

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)

# Create some artifacts and log under the above run
for file, content in [("features", features), ("labels", labels)]:
    with open("{}.txt".format(file), 'w') as f:
        f.write(content)
    client.log_artifact(run.info.run_id, "{}.txt".format(file))

# Fetch the logged artifacts
artifacts = client.list_artifacts(run.info.run_id)
for artifact in artifacts:
    print_artifact_info(artifact)
client.set_terminated(run.info.run_id)
Output
artifact: features.txt
is_dir: False
size: 53
artifact: labels.txt
is_dir: False
size: 5
list_experiments(view_type=None)[source]
Returns

List of mlflow.entities.Experiment

Example
from mlflow.tracking import MlflowClient
from mlflow.entities import ViewType

def print_experiment_info(experiments):
    for e in experiments:
        print("- experiment_id: {}, name: {}, lifecycle_stage: {}"
              .format(e.experiment_id, e.name, e.lifecycle_stage))

client = MlflowClient()
for name in ["Experiment 1", "Experiment 2"]:
    exp_id = client.create_experiment(name)

# Delete the last experiment
client.delete_experiment(exp_id)

# Fetch experiments by view type
print("Active experiments:")
print_experiment_info(client.list_experiments(view_type=ViewType.ACTIVE_ONLY))
print("Deleted experiments:")
print_experiment_info(client.list_experiments(view_type=ViewType.DELETED_ONLY))
print("All experiments:")
print_experiment_info(client.list_experiments(view_type=ViewType.ALL))
Output
Active experiments:
- experiment_id: 0, name: Default, lifecycle_stage: active
- experiment_id: 1, name: Experiment 1, lifecycle_stage: active
Deleted experiments:
- experiment_id: 2, name: Experiment 2, lifecycle_stage: deleted
All experiments:
- experiment_id: 0, name: Default, lifecycle_stage: active
- experiment_id: 1, name: Experiment 1, lifecycle_stage: active
- experiment_id: 2, name: Experiment 2, lifecycle_stage: deleted
list_registered_models(max_results=100, page_token=None)[source]

List all registered models.

Parameters
  • max_results – Maximum number of registered models desired.

  • page_token – Token specifying the next page of results. It should be obtained from a list_registered_models call.

Returns

A PagedList of mlflow.entities.model_registry.RegisteredModel objects. The pagination token for the next page can be obtained via the token attribute of the object.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_model_info(models):
    for m in models:
        print("--")
        print("name: {}".format(m.name))
        print("tags: {}".format(m.tags))
        print("description: {}".format(m.description))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()

# Register a couple of models with respective names, tags, and descriptions
for name, tags, desc in [("name1", {"t1": "t1"}, 'description1'),
                         ("name2", {"t2": "t2"}, 'description2')]:
    client.create_registered_model(name, tags, desc)

# Fetch all registered models
print_model_info(client.list_registered_models())
Output
--
name: name1
tags: {'t1': 't1'}
description: description1
--
name: name2
tags: {'t2': 't2'}
description: description2
list_run_infos(experiment_id, run_view_type=1, max_results=1000, order_by=None, page_token=None)[source]
Returns

List of mlflow.entities.RunInfo

Example
import mlflow
from mlflow.tracking import MlflowClient
from mlflow.entities import ViewType

def print_run_infos(run_infos):
    for r in run_infos:
        print("- run_id: {}, lifecycle_stage: {}".format(r.run_id, r.lifecycle_stage))

# Create two runs
with mlflow.start_run() as run1:
    mlflow.log_metric("click_rate", 1.55)

with mlflow.start_run() as run2:
    mlflow.log_metric("click_rate", 2.50)

# Delete the last run
client = MlflowClient()
client.delete_run(run2.info.run_id)

# Get all runs under the default experiment (whose id is 0)
print("Active runs:")
print_run_infos(client.list_run_infos("0", run_view_type=ViewType.ACTIVE_ONLY))

print("Deleted runs:")
print_run_infos(client.list_run_infos("0", run_view_type=ViewType.DELETED_ONLY))

print("All runs:")
print_run_infos(client.list_run_infos("0", run_view_type=ViewType.ALL,
                order_by=["metric.click_rate DESC"]))
Output
Active runs:
- run_id: 47b11b33f9364ee2b148c41375a30a68, lifecycle_stage: active
Deleted runs:
- run_id: bc4803439bdd4a059103811267b6b2f4, lifecycle_stage: deleted
All runs:
- run_id: bc4803439bdd4a059103811267b6b2f4, lifecycle_stage: deleted
- run_id: 47b11b33f9364ee2b148c41375a30a68, lifecycle_stage: active
log_artifact(run_id, local_path, artifact_path=None)[source]

Write a local file or directory to the remote artifact_uri.

Parameters
  • local_path – Path to the file or directory to write.

  • artifact_path – If provided, the directory in artifact_uri to write to.

Example
from mlflow.tracking import MlflowClient

features = "rooms, zipcode, median_price, school_rating, transport"
with open("features.txt", 'w') as f:
    f.write(features)

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)

# log and fetch the artifact
client.log_artifact(run.info.run_id, "features.txt")
artifacts = client.list_artifacts(run.info.run_id)
for artifact in artifacts:
    print("artifact: {}".format(artifact.path))
    print("is_dir: {}".format(artifact.is_dir))
client.set_terminated(run.info.run_id)
Output
artifact: features.txt
is_dir: False
log_artifacts(run_id, local_dir, artifact_path=None)[source]

Write a directory of files to the remote artifact_uri.

Parameters
  • local_dir – Path to the directory of files to write.

  • artifact_path – If provided, the directory in artifact_uri to write to.

Example
import os
import json

# Create some artifacts data to preserve
features = "rooms, zipcode, median_price, school_rating, transport"
data = {"state": "TX", "Available": 25, "Type": "Detached"}

# Create a couple of artifact files under the local directory "data"
os.makedirs("data", exist_ok=True)
with open("data/data.json", 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2)
with open("data/features.txt", 'w') as f:
    f.write(features)

# Create a run under the default experiment (whose id is '0'), and log
# all files in "data" to root artifact_uri/states
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
client.log_artifacts(run.info.run_id, "data", artifact_path="states")
artifacts = client.list_artifacts(run.info.run_id)
for artifact in artifacts:
    print("artifact: {}".format(artifact.path))
    print("is_dir: {}".format(artifact.is_dir))
client.set_terminated(run.info.run_id)
Output
artifact: states
is_dir: True
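To make the effect of artifact_path concrete, the sketch below computes the run-relative destinations that files in a local directory would be expected to receive. It is an illustration only: planned_artifact_paths is a hypothetical helper, not an MLflow function, and it assumes that the directory layout is preserved beneath artifact_path, which is consistent with the states directory reported in the output above.

```python
import os
import posixpath
import tempfile


def planned_artifact_paths(local_dir, artifact_path=None):
    """Return sorted run-relative destinations for files under local_dir."""
    destinations = []
    for root, _dirs, files in os.walk(local_dir):
        rel_root = os.path.relpath(root, local_dir)
        # Artifact paths use POSIX separators regardless of the local OS.
        parts = [] if rel_root == "." else rel_root.split(os.sep)
        for name in files:
            rel_file = posixpath.join(*parts, name)
            if artifact_path:
                rel_file = posixpath.join(artifact_path, rel_file)
            destinations.append(rel_file)
    return sorted(destinations)


# Build a throwaway "data" directory that mirrors the example above.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = os.path.join(tmp, "data")
    os.makedirs(data_dir)
    for filename in ("data.json", "features.txt"):
        with open(os.path.join(data_dir, filename), "w") as f:
            f.write("placeholder")
    planned = planned_artifact_paths(data_dir, artifact_path="states")

print(planned)
```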
log_batch(run_id, metrics=(), params=(), tags=())[source]

Log multiple metrics, params, and/or tags.

Parameters
  • run_id – String ID of the run

  • metrics – If provided, List of Metric(key, value, timestamp) instances.

  • params – If provided, List of Param(key, value) instances.

  • tags – If provided, List of RunTag(key, value) instances.

Raises an MlflowException if any errors occur.

Returns

None

Example
import time

from mlflow.tracking import MlflowClient
from mlflow.entities import Metric, Param, RunTag

def print_run_info(r):
    print("run_id: {}".format(r.info.run_id))
    print("params: {}".format(r.data.params))
    print("metrics: {}".format(r.data.metrics))
    print("tags: {}".format(r.data.tags))
    print("status: {}".format(r.info.status))

# Create MLflow entities and a run under the default experiment (whose id is '0').
timestamp = int(time.time() * 1000)
metrics = [Metric('m', 1.5, timestamp, 1)]
params = [Param("p", 'p')]
tags = [RunTag("t", "t")]
experiment_id = "0"
client = MlflowClient()
run = client.create_run(experiment_id)

# Log entities, terminate the run, and fetch run status
client.log_batch(run.info.run_id, metrics=metrics, params=params, tags=tags)
client.set_terminated(run.info.run_id)
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: ef0247fa3205410595acc0f30f620871
params: {'p': 'p'}
metrics: {'m': 1.5}
tags: {'t': 't'}
status: FINISHED
log_dict(run_id, dictionary, artifact_file)[source]

Note

Experimental: This method may change or be removed in a future release without warning.

Log a dictionary as an artifact. The serialization format (JSON or YAML) is automatically inferred from the extension of artifact_file. If the file extension doesn’t exist or match any of [“.json”, “.yml”, “.yaml”], JSON format is used.

Parameters
  • run_id – String ID of the run.

  • dictionary – Dictionary to log.

  • artifact_file – The run-relative artifact file path in posixpath format to which the dictionary is saved (e.g. “dir/data.json”).

Example
from mlflow.tracking import MlflowClient

client = MlflowClient()
run = client.create_run(experiment_id="0")
run_id = run.info.run_id

dictionary = {"k": "v"}

# Log a dictionary as a JSON file under the run's root artifact directory
client.log_dict(run_id, dictionary, "data.json")

# Log a dictionary as a YAML file in a subdirectory of the run's root artifact directory
client.log_dict(run_id, dictionary, "dir/data.yml")

# If the file extension doesn't exist or match any of [".json", ".yaml", ".yml"],
# JSON format is used.
client.log_dict(run_id, dictionary, "data")
client.log_dict(run_id, dictionary, "data.txt")
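The extension rule described above can be summarized in a few lines. This is a sketch of the documented behavior, not MLflow's internal code: infer_format is a hypothetical helper, and the case-sensitive comparison is an assumption.

```python
import posixpath


def infer_format(artifact_file):
    """Return "yaml" for .yml/.yaml paths and "json" for everything else."""
    extension = posixpath.splitext(artifact_file)[1]
    return "yaml" if extension in (".yml", ".yaml") else "json"


for path in ["data.json", "dir/data.yml", "data", "data.txt"]:
    print(path, "->", infer_format(path))
```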
log_figure(run_id, figure, artifact_file)[source]

Note

Experimental: This method may change or be removed in a future release without warning.

Log a figure as an artifact. The following figure objects are supported:

  • matplotlib.figure.Figure

  • plotly.graph_objects.Figure

Parameters
  • run_id – String ID of the run.

  • figure – Figure to log.

  • artifact_file – The run-relative artifact file path in posixpath format to which the figure is saved (e.g. “dir/file.png”).

Matplotlib Example
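import matplotlib.pyplot as plt
from mlflow.tracking import MlflowClient

fig, ax = plt.subplots()
ax.plot([0, 1], [2, 3])

# Create the client and a run under the default experiment (whose id is '0')
client = MlflowClient()
run = client.create_run(experiment_id="0")
client.log_figure(run.info.run_id, fig, "figure.png")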
Plotly Example
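from mlflow.tracking import MlflowClient
from plotly import graph_objects as go

fig = go.Figure(go.Scatter(x=[0, 1], y=[2, 3]))

# Create the client and a run under the default experiment (whose id is '0')
client = MlflowClient()
run = client.create_run(experiment_id="0")
client.log_figure(run.info.run_id, fig, "figure.html")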
log_image(run_id, image, artifact_file)[source]

Note

Experimental: This method may change or be removed in a future release without warning.

Log an image as an artifact. The following image objects are supported:

  • numpy.ndarray

  • PIL.Image.Image

Numpy array support
  • data type (( ) represents a valid value range):

    • bool

    • integer (0 ~ 255)

    • unsigned integer (0 ~ 255)

    • float (0.0 ~ 1.0)

    Warning

    • Out-of-range integer values will be clipped to [0, 255].

    • Out-of-range float values will be clipped to [0, 1].

  • shape (H: height, W: width):

    • H x W (Grayscale)

    • H x W x 1 (Grayscale)

    • H x W x 3 (an RGB channel order is assumed)

    • H x W x 4 (an RGBA channel order is assumed)

Parameters
  • run_id – String ID of the run.

  • image – Image to log.

  • artifact_file – The run-relative artifact file path in posixpath format to which the image is saved (e.g. “dir/image.png”).

Numpy Example
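import numpy as np
from mlflow.tracking import MlflowClient

image = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)

# Create the client and a run under the default experiment (whose id is '0')
client = MlflowClient()
run = client.create_run(experiment_id="0")
client.log_image(run.info.run_id, image, "image.png")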
Pillow Example
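from mlflow.tracking import MlflowClient
from PIL import Image

image = Image.new("RGB", (100, 100))

# Create the client and a run under the default experiment (whose id is '0')
client = MlflowClient()
run = client.create_run(experiment_id="0")
client.log_image(run.info.run_id, image, "image.png")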
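The value ranges and shapes listed under "Numpy array support" can be restated as a small pure-Python sketch. It is illustrative only and does not reproduce MLflow's implementation: clip_value and channel_layout are hypothetical helpers that operate on plain Python values and shape tuples rather than on arrays.

```python
def clip_value(value):
    """Clip one pixel value to the documented valid range for its type."""
    if isinstance(value, bool):  # bool is checked first: it is a subclass of int
        return value
    if isinstance(value, int):
        return max(0, min(255, value))
    if isinstance(value, float):
        return max(0.0, min(1.0, value))
    raise TypeError("unsupported pixel type: {}".format(type(value).__name__))


def channel_layout(shape):
    """Map a documented array shape to the channel order assumed for it."""
    if len(shape) == 2:
        return "grayscale"
    if len(shape) == 3 and shape[2] in (1, 3, 4):
        return {1: "grayscale", 3: "RGB", 4: "RGBA"}[shape[2]]
    raise ValueError("unsupported image shape: {}".format(shape))


print(clip_value(300), clip_value(-5), clip_value(1.5), clip_value(-0.2))
print(channel_layout((100, 100)), channel_layout((100, 100, 3)))
```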
log_metric(run_id, key, value, timestamp=None, step=None)[source]

Log a metric against the run ID.

Parameters
  • run_id – The run id to which the metric should be logged.

  • key – Metric name.

  • value – Metric value (float). Note that some special values such as +/- Infinity may be replaced by other values depending on the store. For example, the SQLAlchemy store replaces +/- Inf with max / min float values.

  • timestamp – Time when this metric was calculated. Defaults to the current system time.

  • step – Integer training step (iteration) at which the metric was calculated. Defaults to 0.

Example
from mlflow.tracking import MlflowClient

def print_run_info(r):
    print("run_id: {}".format(r.info.run_id))
    print("metrics: {}".format(r.data.metrics))
    print("status: {}".format(r.info.status))

# Create a run under the default experiment (whose id is '0').
# Since these are low-level CRUD operations, this method will create a run.
# To end the run, you'll have to explicitly end it.
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
print_run_info(run)
print("--")

# Log the metric. Unlike mlflow.log_metric this method
# does not start a run if one does not exist. It will log
# the metric for the run id in the backend store.
client.log_metric(run.info.run_id, "m", 1.5)
client.set_terminated(run.info.run_id)
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: 95e79843cb2c463187043d9065185e24
metrics: {}
status: RUNNING
--
run_id: 95e79843cb2c463187043d9065185e24
metrics: {'m': 1.5}
status: FINISHED
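When a metric is logged repeatedly, passing step and timestamp explicitly keeps the series ordered. The sketch below shows one way to do this. log_metric_series is a hypothetical helper, and RecordingClient is a stand-in that only records calls so that the sketch runs without a tracking server; an MlflowClient instance could be passed in its place. The timestamp is assumed to be in milliseconds since the epoch, as in the log_batch example.

```python
import time


def log_metric_series(client, run_id, key, values):
    """Log each value under `key`, using its list index as the step."""
    for step, value in enumerate(values):
        timestamp_ms = int(time.time() * 1000)
        client.log_metric(run_id, key, value, timestamp=timestamp_ms, step=step)


class RecordingClient:
    """Stand-in that records log_metric calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def log_metric(self, run_id, key, value, timestamp=None, step=None):
        self.calls.append((run_id, key, value, timestamp, step))


stub = RecordingClient()
log_metric_series(stub, "example-run-id", "loss", [0.9, 0.5, 0.3])
for _run_id, key, value, _timestamp, step in stub.calls:
    print("step {}: {} = {}".format(step, key, value))
```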
log_param(run_id, key, value)[source]

Log a parameter against the run ID.

Parameters
  • run_id – The run id to which the param should be logged.

  • value – Value is converted to a string.

Example
from mlflow.tracking import MlflowClient

def print_run_info(r):
    print("run_id: {}".format(r.info.run_id))
    print("params: {}".format(r.data.params))
    print("status: {}".format(r.info.status))

# Create a run under the default experiment (whose id is '0').
# Since these are low-level CRUD operations, this method will create a run.
# To end the run, you'll have to explicitly end it.
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
print_run_info(run)
print("--")

# Log the parameter. Unlike mlflow.log_param this method
# does not start a run if one does not exist. It will log
# the parameter in the backend store
client.log_param(run.info.run_id, "p", 1)
client.set_terminated(run.info.run_id)
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: e649e49c7b504be48ee3ae33c0e76c93
params: {}
status: RUNNING
--
run_id: e649e49c7b504be48ee3ae33c0e76c93
params: {'p': '1'}
status: FINISHED
log_text(run_id, text, artifact_file)[source]

Log text as an artifact.

Parameters
  • run_id – String ID of the run.

  • text – String containing text to log.

  • artifact_file – The run-relative artifact file path in posixpath format to which the text is saved (e.g. “dir/file.txt”).

Example
from mlflow.tracking import MlflowClient

client = MlflowClient()
run = client.create_run(experiment_id="0")

# Log text to a file under the run's root artifact directory
client.log_text(run.info.run_id, "text1", "file1.txt")

# Log text in a subdirectory of the run's root artifact directory
client.log_text(run.info.run_id, "text2", "dir/file2.txt")

# Log HTML text
client.log_text(run.info.run_id, "<h1>header</h1>", "index.html")
rename_experiment(experiment_id, new_name)[source]

Update an experiment’s name. The new name must be unique.

Parameters

experiment_id – The experiment ID returned from create_experiment.

Example
from mlflow.tracking import MlflowClient

def print_experiment_info(experiment):
    print("Name: {}".format(experiment.name))
    print("Experiment_id: {}".format(experiment.experiment_id))
    print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))

# Create an experiment with a name that is unique and case sensitive
client = MlflowClient()
experiment_id = client.create_experiment("Social NLP Experiments")

# Fetch experiment metadata information
experiment = client.get_experiment(experiment_id)
print_experiment_info(experiment)
print("--")

# Rename and fetch experiment metadata information
client.rename_experiment(experiment_id, "Social Media NLP Experiments")
experiment = client.get_experiment(experiment_id)
print_experiment_info(experiment)
Output
Name: Social NLP Experiments
Experiment_id: 1
Lifecycle_stage: active
--
Name: Social Media NLP Experiments
Experiment_id: 1
Lifecycle_stage: active
rename_registered_model(name, new_name)[source]

Update registered model name.

Parameters
  • name – Name of the registered model to update.

  • new_name – New proposed name for the registered model.

Returns

A single updated mlflow.entities.model_registry.RegisteredModel object.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_registered_model_info(rm):
    print("name: {}".format(rm.name))
    print("tags: {}".format(rm.tags))
    print("description: {}".format(rm.description))

name = "SocialTextAnalyzer"
tags = {"nlp.framework": "Spark NLP"}
desc = "This sentiment analysis model classifies the tone-happy, sad, angry."

# create a new registered model name
mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()
client.create_registered_model(name, tags, desc)
print_registered_model_info(client.get_registered_model(name))
print("--")

# rename the model
new_name = "SocialMediaTextAnalyzer"
client.rename_registered_model(name, new_name)
print_registered_model_info(client.get_registered_model(new_name))
Output
name: SocialTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies the tone-happy, sad, angry.
--
name: SocialMediaTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies the tone-happy, sad, angry.
restore_experiment(experiment_id)[source]

Restore a deleted experiment unless permanently deleted.

Parameters

experiment_id – The experiment ID returned from create_experiment.

Example
from mlflow.tracking import MlflowClient

def print_experiment_info(experiment):
    print("Name: {}".format(experiment.name))
    print("Experiment Id: {}".format(experiment.experiment_id))
    print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))

# Create and delete an experiment
client = MlflowClient()
experiment_id = client.create_experiment("New Experiment")
client.delete_experiment(experiment_id)

# Examine the deleted experiment details.
experiment = client.get_experiment(experiment_id)
print_experiment_info(experiment)
print("--")

# Restore the experiment and fetch its info
client.restore_experiment(experiment_id)
experiment = client.get_experiment(experiment_id)
print_experiment_info(experiment)
Output
Name: New Experiment
Experiment Id: 1
Lifecycle_stage: deleted
--
Name: New Experiment
Experiment Id: 1
Lifecycle_stage: active
restore_run(run_id)[source]

Restores a deleted run with the given ID.

Parameters

run_id – The unique run id to restore.

Example
from mlflow.tracking import MlflowClient

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
run_id = run.info.run_id
print("run_id: {}; lifecycle_stage: {}".format(run_id, run.info.lifecycle_stage))
client.delete_run(run_id)
del_run = client.get_run(run_id)
print("run_id: {}; lifecycle_stage: {}".format(run_id, del_run.info.lifecycle_stage))
client.restore_run(run_id)
rest_run = client.get_run(run_id)
print("run_id: {}; lifecycle_stage: {}".format(run_id, rest_run.info.lifecycle_stage))
Output
run_id: 7bc59754d7e74534a7917d62f2873ac0; lifecycle_stage: active
run_id: 7bc59754d7e74534a7917d62f2873ac0; lifecycle_stage: deleted
run_id: 7bc59754d7e74534a7917d62f2873ac0; lifecycle_stage: active
search_model_versions(filter_string)[source]

Search for model versions in backend that satisfy the filter criteria.

Parameters

filter_string – A filter string expression. Currently, it supports a single filter condition: either the model name, for example name = 'model_name', or the run ID, for example run_id = '...'.

Returns

PagedList of mlflow.entities.model_registry.ModelVersion objects.

Example
import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Get all versions of the model filtered by name
model_name = "CordobaWeatherForecastModel"
filter_string = "name='{}'".format(model_name)
results = client.search_model_versions(filter_string)
print("-" * 80)
for res in results:
    print("name={}; run_id={}; version={}".format(res.name, res.run_id, res.version))

# Get the version of the model filtered by run_id
run_id = "e14afa2f47a040728060c1699968fd43"
filter_string = "run_id='{}'".format(run_id)
results = client.search_model_versions(filter_string)
print("-" * 80)
for res in results:
    print("name={}; run_id={}; version={}".format(res.name, res.run_id, res.version))
Output
------------------------------------------------------------------------------------
name=CordobaWeatherForecastModel; run_id=eaef868ee3d14d10b4299c4c81ba8814; version=1
name=CordobaWeatherForecastModel; run_id=e14afa2f47a040728060c1699968fd43; version=2
------------------------------------------------------------------------------------
name=CordobaWeatherForecastModel; run_id=e14afa2f47a040728060c1699968fd43; version=2
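A common follow-up to a name-filtered search is picking the newest version from the results. The sketch below assumes only that each result exposes name and version attributes, as the example above does. latest_version is a hypothetical helper, and SimpleNamespace objects stand in for ModelVersion entities. The version is converted with int() because it may be held as a string, in which case "10" would otherwise sort before "2".

```python
from types import SimpleNamespace


def latest_version(model_versions):
    """Return the entry with the highest version number, or None if empty."""
    versions = list(model_versions)
    if not versions:
        return None
    return max(versions, key=lambda mv: int(mv.version))


# Stand-ins for the objects returned by search_model_versions.
results = [
    SimpleNamespace(name="CordobaWeatherForecastModel", version="1"),
    SimpleNamespace(name="CordobaWeatherForecastModel", version="2"),
    SimpleNamespace(name="CordobaWeatherForecastModel", version="10"),
]
newest = latest_version(results)
print("name={}; version={}".format(newest.name, newest.version))
```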
search_registered_models(filter_string=None, max_results=100, order_by=None, page_token=None)[source]

Search for registered models in backend that satisfy the filter criteria.

Parameters
  • filter_string – Filter query string, defaults to searching all registered models. Currently, it supports only a single filter condition as the name of the model, for example, name = 'model_name' or a search expression to match a pattern in the registered model name. For example, name LIKE 'Boston%' (case sensitive) or name ILIKE '%boston%' (case insensitive).

  • max_results – Maximum number of registered models desired.

  • order_by – List of column names with ASC|DESC annotation, to be used for ordering matching search results.

  • page_token – Token specifying the next page of results. It should be obtained from a search_registered_models call.

Returns

A PagedList of mlflow.entities.model_registry.RegisteredModel objects that satisfy the search expressions. The pagination token for the next page can be obtained via the token attribute of the object.

Example
import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Get search results filtered by the registered model name
model_name = "CordobaWeatherForecastModel"
filter_string = "name='{}'".format(model_name)
results = client.search_registered_models(filter_string=filter_string)
print("-" * 80)
for res in results:
    for mv in res.latest_versions:
        print("name={}; run_id={}; version={}".format(mv.name, mv.run_id, mv.version))

# Get search results filtered by the registered model name that matches
# prefix pattern
filter_string = "name LIKE 'Boston%'"
results = client.search_registered_models(filter_string=filter_string)
print("-" * 80)
for res in results:
    for mv in res.latest_versions:
        print("name={}; run_id={}; version={}".format(mv.name, mv.run_id, mv.version))

# Get all registered models and order them by ascending order of the names
results = client.search_registered_models(order_by=["name ASC"])
print("-" * 80)
for res in results:
    for mv in res.latest_versions:
        print("name={}; run_id={}; version={}".format(mv.name, mv.run_id, mv.version))
Output
------------------------------------------------------------------------------------
name=CordobaWeatherForecastModel; run_id=eaef868ee3d14d10b4299c4c81ba8814; version=1
name=CordobaWeatherForecastModel; run_id=e14afa2f47a040728060c1699968fd43; version=2
------------------------------------------------------------------------------------
name=BostonWeatherForecastModel; run_id=ddc51b9407a54b2bb795c8d680e63ff6; version=1
name=BostonWeatherForecastModel; run_id=48ac94350fba40639a993e1b3d4c185d; version=2
-----------------------------------------------------------------------------------
name=AzureWeatherForecastModel; run_id=5fcec6c4f1c947fc9295fef3fa21e52d; version=1
name=AzureWeatherForecastModel; run_id=8198cb997692417abcdeb62e99052260; version=3
name=BostonWeatherForecastModel; run_id=ddc51b9407a54b2bb795c8d680e63ff6; version=1
name=BostonWeatherForecastModel; run_id=48ac94350fba40639a993e1b3d4c185d; version=2
name=CordobaWeatherForecastModel; run_id=eaef868ee3d14d10b4299c4c81ba8814; version=1
name=CordobaWeatherForecastModel; run_id=e14afa2f47a040728060c1699968fd43; version=2
search_runs(experiment_ids, filter_string='', run_view_type=1, max_results=1000, order_by=None, page_token=None)[source]

Search the given experiments for runs that fit the search criteria.

Parameters
  • experiment_ids – List of experiment IDs, or a single int or string id.

  • filter_string – Filter query string, defaults to searching all runs.

  • run_view_type – One of the enum values ACTIVE_ONLY, DELETED_ONLY, or ALL defined in mlflow.entities.ViewType.

  • max_results – Maximum number of runs desired.

  • order_by – List of columns to order by (e.g., “metrics.rmse”). The order_by column can contain an optional DESC or ASC value. The default is ASC. The default ordering is to sort by start_time DESC, then run_id.

  • page_token – Token specifying the next page of results. It should be obtained from a search_runs call.

Returns

A list of mlflow.entities.Run objects that satisfy the search expressions. If the underlying tracking store supports pagination, the token for the next page may be obtained via the token attribute of the returned object.

Example
import mlflow
from mlflow.tracking import MlflowClient
from mlflow.entities import ViewType

def print_run_info(runs):
    for r in runs:
        print("run_id: {}".format(r.info.run_id))
        print("lifecycle_stage: {}".format(r.info.lifecycle_stage))
        print("metrics: {}".format(r.data.metrics))

        # Exclude mlflow system tags
        tags = {k: v for k, v in r.data.tags.items() if not k.startswith("mlflow.")}
        print("tags: {}".format(tags))

# Create an experiment and log two runs with metrics and tags under the experiment
experiment_id = mlflow.create_experiment("Social NLP Experiments")
with mlflow.start_run(experiment_id=experiment_id) as run:
    mlflow.log_metric("m", 1.55)
    mlflow.set_tag("s.release", "1.1.0-RC")
with mlflow.start_run(experiment_id=experiment_id):
    mlflow.log_metric("m", 2.50)
    mlflow.set_tag("s.release", "1.2.0-GA")

# Search all runs under experiment id and order them by
# descending value of the metric 'm'
client = MlflowClient()
runs = client.search_runs(experiment_id, order_by=["metrics.m DESC"])
print_run_info(runs)
print("--")

# Delete the first run
client.delete_run(run_id=run.info.run_id)

# Search only deleted runs under the experiment id and use a case insensitive pattern
# in the filter_string for the tag.
filter_string = "tags.s.release ILIKE '%rc%'"
runs = client.search_runs(experiment_id, run_view_type=ViewType.DELETED_ONLY,
                            filter_string=filter_string)
print_run_info(runs)
Output
run_id: 0efb2a68833d4ee7860a964fad31cb3f
lifecycle_stage: active
metrics: {'m': 2.5}
tags: {'s.release': '1.2.0-GA'}
run_id: 7ab027fd72ee4527a5ec5eafebb923b8
lifecycle_stage: active
metrics: {'m': 1.55}
tags: {'s.release': '1.1.0-RC'}
--
run_id: 7ab027fd72ee4527a5ec5eafebb923b8
lifecycle_stage: deleted
metrics: {'m': 1.55}
tags: {'s.release': '1.1.0-RC'}
set_experiment_tag(experiment_id, key, value)[source]

Set a tag on the experiment with the specified ID. Value is converted to a string.

Parameters
  • experiment_id – String ID of the experiment.

  • key – Name of the tag.

  • value – Tag value (converted to a string).

Example
from mlflow.tracking import MlflowClient

# Create an experiment and set its tag
client = MlflowClient()
experiment_id = client.create_experiment("Social Media NLP Experiments")
client.set_experiment_tag(experiment_id, "nlp.framework", "Spark NLP")

# Fetch experiment metadata information
experiment = client.get_experiment(experiment_id)
print("Name: {}".format(experiment.name))
print("Tags: {}".format(experiment.tags))
Output
Name: Social Media NLP Experiments
Tags: {'nlp.framework': 'Spark NLP'}
set_model_version_tag(name, version, key, value)[source]

Set a tag for the model version.

Parameters
  • name – Registered model name.

  • version – Registered model version.

  • key – Tag key to log.

  • value – Tag value to log.

Returns

None

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_model_version_info(mv):
    print("Name: {}".format(mv.name))
    print("Version: {}".format(mv.version))
    print("Tags: {}".format(mv.tags))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
# and set a tag
model_uri = "runs:/{}/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id)
print_model_version_info(mv)
print("--")
client.set_model_version_tag(name, mv.version, "t", "1")
mv = client.get_model_version(name, mv.version)
print_model_version_info(mv)
Output
Name: RandomForestRegression
Version: 1
Tags: {}
--
Name: RandomForestRegression
Version: 1
Tags: {'t': '1'}
set_registered_model_tag(name, key, value)[source]

Set a tag for the registered model.

Parameters
  • name – Registered model name.

  • key – Tag key to log.

  • value – Tag value to log.

Returns

None

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_model_info(rm):
    print("--")
    print("name: {}".format(rm.name))
    print("tags: {}".format(rm.tags))

name = "SocialMediaTextAnalyzer"
tags = {"nlp.framework1": "Spark NLP"}
mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()

# Create registered model, set an additional tag, and fetch
# updated model info
client.create_registered_model(name, tags)
model = client.get_registered_model(name)
print_model_info(model)

client.set_registered_model_tag(name, "nlp.framework2", "VADER")
model = client.get_registered_model(name)
print_model_info(model)
Output
--
name: SocialMediaTextAnalyzer
tags: {'nlp.framework1': 'Spark NLP'}
--
name: SocialMediaTextAnalyzer
tags: {'nlp.framework1': 'Spark NLP', 'nlp.framework2': 'VADER'}
set_tag(run_id, key, value)[source]

Set a tag on the run with the specified ID. Value is converted to a string.

Parameters
  • run_id – String ID of the run.

  • key – Name of the tag.

  • value – Tag value (converted to a string)

Example
from mlflow.tracking import MlflowClient

def print_run_info(run):
    print("run_id: {}".format(run.info.run_id))
    print("Tags: {}".format(run.data.tags))

# Create a run under the default experiment (whose id is '0').
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
print_run_info(run)
print("--")

# Set a tag and fetch updated run info
client.set_tag(run.info.run_id, "nlp.framework", "Spark NLP")
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: 4f226eb5758145e9b28f78514b59a03b
Tags: {}
--
run_id: 4f226eb5758145e9b28f78514b59a03b
Tags: {'nlp.framework': 'Spark NLP'}
set_terminated(run_id, status=None, end_time=None)[source]

Set a run’s status to terminated.

Parameters
  • status – A string value of mlflow.entities.RunStatus. Defaults to “FINISHED”.

  • end_time – If not provided, defaults to the current time.

Example
from mlflow.tracking import MlflowClient

def print_run_info(r):
    print("run_id: {}".format(r.info.run_id))
    print("status: {}".format(r.info.status))

# Create a run under the default experiment (whose id is '0').
# Since this is a low-level CRUD operation, this method will create a run.
# To end the run, you'll have to explicitly terminate it.
client = MlflowClient()
experiment_id = "0"
run = client.create_run(experiment_id)
print_run_info(run)
print("--")

# Terminate the run and fetch updated status. By default,
# the status is set to "FINISHED". Other values you can
# set are "KILLED", "FAILED", "RUNNING", or "SCHEDULED".
client.set_terminated(run.info.run_id, status="KILLED")
run = client.get_run(run.info.run_id)
print_run_info(run)
Output
run_id: 575fb62af83f469e84806aee24945973
status: RUNNING
--
run_id: 575fb62af83f469e84806aee24945973
status: KILLED
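Because runs created through the client must be terminated explicitly, it can help to wrap the create and terminate calls in a context manager. The sketch below is one possible pattern under stated assumptions, not an MLflow feature: managed_run is a hypothetical helper that marks the run "FINISHED" on success and "FAILED" if the body raises, and StubClient is a stand-in implementing only create_run and set_terminated so that the sketch runs without a tracking server.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def managed_run(client, experiment_id):
    """Create a run and always terminate it: FINISHED on success, FAILED on error."""
    run = client.create_run(experiment_id)
    try:
        yield run
    except Exception:
        client.set_terminated(run.info.run_id, status="FAILED")
        raise
    else:
        client.set_terminated(run.info.run_id, status="FINISHED")


class StubClient:
    """Stand-in for MlflowClient that tracks run statuses in memory."""

    def __init__(self):
        self.statuses = {}

    def create_run(self, experiment_id):
        run_id = "run-{}".format(len(self.statuses) + 1)
        self.statuses[run_id] = "RUNNING"
        return SimpleNamespace(info=SimpleNamespace(run_id=run_id))

    def set_terminated(self, run_id, status=None, end_time=None):
        self.statuses[run_id] = status or "FINISHED"


stub = StubClient()
with managed_run(stub, "0"):
    pass  # work that succeeds
try:
    with managed_run(stub, "0"):
        raise RuntimeError("training failed")
except RuntimeError:
    pass  # the error still propagates after the run is marked FAILED
print(stub.statuses)
```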
transition_model_version_stage(name, version, stage, archive_existing_versions=False)[source]

Update model version stage.

Parameters
  • name – Registered model name.

  • version – Registered model version.

  • stage – New desired stage for this model version.

  • archive_existing_versions – If this flag is set to True, all existing model versions in the stage will be automatically moved to the “archived” stage. Only valid when stage is "staging" or "production"; otherwise, an error will be raised.

Returns

A single mlflow.entities.model_registry.ModelVersion object.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_model_version_info(mv):
    print("Name: {}".format(mv.name))
    print("Version: {}".format(mv.version))
    print("Description: {}".format(mv.description))
    print("Stage: {}".format(mv.current_stage))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
desc = "A new version of the model using ensemble trees"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
model_uri = "runs:/{}/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id, description=desc)
print_model_version_info(mv)
print("--")

# transition model version from None -> staging
mv = client.transition_model_version_stage(name, mv.version, "staging")
print_model_version_info(mv)
Output
Name: RandomForestRegression
Version: 1
Description: A new version of the model using ensemble trees
Stage: None
--
Name: RandomForestRegression
Version: 1
Description: A new version of the model using ensemble trees
Stage: Staging
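The constraint on archive_existing_versions can be checked on the caller's side before a request is sent. The sketch below restates the documented rule only: check_archive_flag is a hypothetical helper, the case-insensitive comparison is an assumption, and it raises ValueError rather than whatever error the backend would return.

```python
def check_archive_flag(stage, archive_existing_versions=False):
    """Raise ValueError if the flag is set for a stage that does not allow it."""
    allowed = ("staging", "production")
    if archive_existing_versions and stage.lower() not in allowed:
        raise ValueError(
            "archive_existing_versions is only valid for stages {}, got {!r}".format(
                allowed, stage
            )
        )


check_archive_flag("staging", archive_existing_versions=True)    # accepted
check_archive_flag("archived", archive_existing_versions=False)  # accepted: flag not set
try:
    check_archive_flag("archived", archive_existing_versions=True)
    rejected = False
except ValueError:
    rejected = True
print("rejected:", rejected)
```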
update_model_version(name, version, description=None)[source]

Update metadata associated with a model version in backend.

Parameters
  • name – Name of the containing registered model.

  • version – Version number of the model version.

  • description – New description.

Returns

A single mlflow.entities.model_registry.ModelVersion object.

Example
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.ensemble import RandomForestRegressor

def print_model_version_info(mv):
    print("Name: {}".format(mv.name))
    print("Version: {}".format(mv.version))
    print("Description: {}".format(mv.description))

mlflow.set_tracking_uri("sqlite:///mlruns.db")
params = {"n_estimators": 3, "random_state": 42}
name = "RandomForestRegression"
rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])

# Log MLflow entities
with mlflow.start_run() as run:
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

# Register model name in the model registry
client = MlflowClient()
client.create_registered_model(name)

# Create a new version of the rfr model under the registered model name
model_uri = "runs:/{}/sklearn-model".format(run.info.run_id)
mv = client.create_model_version(name, model_uri, run.info.run_id)
print_model_version_info(mv)
print("--")

# Update model version's description
desc = "A new version of the model using ensemble trees"
mv = client.update_model_version(name, mv.version, desc)
print_model_version_info(mv)
Output
Name: RandomForestRegression
Version: 1
Description: None
--
Name: RandomForestRegression
Version: 1
Description: A new version of the model using ensemble trees
update_registered_model(name, description=None)[source]

Update metadata for a RegisteredModel entity. The description argument should not be None. The backend raises an exception if no registered model with the given name exists.

Parameters
  • name – Name of the registered model to update.

  • description – (Optional) New description.

Returns

A single updated mlflow.entities.model_registry.RegisteredModel object.

Example
import mlflow
from mlflow.tracking import MlflowClient

def print_registered_model_info(rm):
    print("name: {}".format(rm.name))
    print("tags: {}".format(rm.tags))
    print("description: {}".format(rm.description))

name = "SocialMediaTextAnalyzer"
tags = {"nlp.framework": "Spark NLP"}
desc = "This sentiment analysis model classifies the tone-happy, sad, angry."

mlflow.set_tracking_uri("sqlite:///mlruns.db")
client = MlflowClient()
client.create_registered_model(name, tags, desc)
print_registered_model_info(client.get_registered_model(name))
print("--")

# Update the model's description
desc = "This sentiment analysis model classifies tweets' tone: happy, sad, angry."
client.update_registered_model(name, desc)
print_registered_model_info(client.get_registered_model(name))
Output
name: SocialMediaTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies the tone-happy, sad, angry.
--
name: SocialMediaTextAnalyzer
tags: {'nlp.framework': 'Spark NLP'}
description: This sentiment analysis model classifies tweets' tone: happy, sad, angry.
mlflow.tracking.get_registry_uri()[source]

Get the current registry URI. If none has been specified, defaults to the tracking URI.

Returns

The registry URI.

Example
import mlflow

# Get the current model registry uri
mr_uri = mlflow.get_registry_uri()
print("Current model registry uri: {}".format(mr_uri))

# Get the current tracking uri
tracking_uri = mlflow.get_tracking_uri()
print("Current tracking uri: {}".format(tracking_uri))

# With no registry uri set explicitly, the two should be the same
assert mr_uri == tracking_uri
Output
Current model registry uri: file:///.../mlruns
Current tracking uri: file:///.../mlruns
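
The fallback described for get_registry_uri can be modeled in a few lines. This is a minimal sketch and not MLflow's implementation: resolve_registry_uri is a hypothetical helper, and the URIs passed to it are placeholder values.

```python
from typing import Optional


def resolve_registry_uri(registry_uri: Optional[str], tracking_uri: str) -> str:
    """Hypothetical helper: pick the registry URI, falling back to the tracking URI."""
    # An unset (None) registry URI means "use the tracking URI instead".
    return registry_uri if registry_uri is not None else tracking_uri


# No registry URI specified: the tracking URI is returned.
print(resolve_registry_uri(None, "file:///tmp/mlruns"))

# Registry URI specified: it takes precedence over the tracking URI.
print(resolve_registry_uri("sqlite:////tmp/registry.db", "file:///tmp/mlruns"))
```

The same rule explains why the two URIs compare equal in the example above and differ once set_registry_uri has been called.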
mlflow.tracking.get_tracking_uri()[source]

Get the current tracking URI. This may not correspond to the tracking URI of the currently active run, since the tracking URI can be updated via set_tracking_uri.

Returns

The tracking URI.

Example
import mlflow

# Get the current tracking uri
tracking_uri = mlflow.get_tracking_uri()
print("Current tracking uri: {}".format(tracking_uri))
Output
Current tracking uri: file:///.../mlruns
mlflow.tracking.is_tracking_uri_set()[source]

Returns True if the tracking URI has been set, False otherwise.
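
A simplified model of this check is sketched below. It assumes that a tracking URI counts as set either when one has been configured in code or when the MLFLOW_TRACKING_URI environment variable is non-empty. The function tracking_uri_is_set and its parameters are hypothetical stand-ins, not MLflow internals.

```python
import os
from typing import Mapping, Optional


def tracking_uri_is_set(
    explicit_uri: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Hypothetical stand-in for the check; not MLflow's own code."""
    # Assumption: an explicitly configured URI or a non-empty
    # MLFLOW_TRACKING_URI environment variable both count as "set".
    return bool(explicit_uri or env.get("MLFLOW_TRACKING_URI"))


# Nothing configured anywhere.
print(tracking_uri_is_set(env={}))

# A URI configured in code.
print(tracking_uri_is_set("file:///tmp/mlruns", env={}))

# A URI supplied only through the environment.
print(tracking_uri_is_set(env={"MLFLOW_TRACKING_URI": "http://localhost:5000"}))
```

Passing the environment as a mapping keeps the sketch free of side effects on the real process environment.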

mlflow.tracking.set_registry_uri(uri)[source]

Set the registry server URI. This method is especially useful if you have a registry server that’s different from the tracking server.

Parameters

uri

  • An empty string, or a local file path prefixed with file:/. Data is stored locally at the provided path (or in ./mlruns if the string is empty).

  • An HTTP URI like https://my-tracking-server:5000.

  • A Databricks workspace, provided as the string “databricks” or, to use a Databricks CLI profile, “databricks://<profileName>”.

Example
import mlflow

# Set model registry uri, fetch the set uri, and compare
# it with the tracking uri. They should be different
mlflow.set_registry_uri("sqlite:////tmp/registry.db")
mr_uri = mlflow.get_registry_uri()
print("Current registry uri: {}".format(mr_uri))
tracking_uri = mlflow.get_tracking_uri()
print("Current tracking uri: {}".format(tracking_uri))

# They should be different
assert tracking_uri != mr_uri
Output
Current registry uri: sqlite:////tmp/registry.db
Current tracking uri: file:///.../mlruns
mlflow.tracking.set_tracking_uri(uri)[source]

Set the tracking server URI. This does not affect the currently active run (if one exists), but takes effect for subsequent runs.

Parameters

uri

  • An empty string, or a local file path prefixed with file:/. Data is stored locally at the provided path (or in ./mlruns if the string is empty).

  • An HTTP URI like https://my-tracking-server:5000.

  • A Databricks workspace, provided as the string “databricks” or, to use a Databricks CLI profile, “databricks://<profileName>”.

Example
import mlflow

mlflow.set_tracking_uri("file:///tmp/my_tracking")
tracking_uri = mlflow.get_tracking_uri()
print("Current tracking uri: {}".format(tracking_uri))
Output
Current tracking uri: file:///tmp/my_tracking
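
The three URI forms listed under Parameters can be told apart by their scheme. The sketch below uses only the standard library to illustrate this. describe_tracking_uri is a hypothetical helper written for this illustration, "myprofile" is a made-up profile name, and the logic is not how MLflow itself resolves a URI.

```python
from urllib.parse import urlparse


def describe_tracking_uri(uri: str) -> str:
    """Hypothetical helper: name the kind of location a URI string refers to."""
    if uri == "":
        return "local directory ./mlruns"
    if uri == "databricks":
        return "Databricks workspace"
    parsed = urlparse(uri)
    if parsed.scheme == "databricks":
        # For "databricks://<profileName>", the profile is the netloc part.
        return "Databricks workspace (CLI profile {})".format(parsed.netloc)
    if parsed.scheme in ("http", "https"):
        return "remote tracking server"
    if parsed.scheme == "file":
        return "local path {}".format(parsed.path)
    # Anything else (for example a database URI) is outside the listed forms.
    return "not one of the listed forms"


for uri in ["", "file:///tmp/my_tracking", "https://my-tracking-server:5000",
            "databricks", "databricks://myprofile"]:
    print("{!r}: {}".format(uri, describe_tracking_uri(uri)))
```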