Query Traces

This page describes how to query traces logged to MLflow in both UI and API.

Filtering Traces on UI
Filtering Traces on API

Use the search box in the MLflow Trace UI to filter traces by various criteria.

The mlflow.search_traces() API allows you to programmatically search for traces across experiments:

import mlflow

# Search traces in the current experiment
traces = mlflow.search_traces(filter_string="status = 'ERROR'")

# Search traces in specific experiments
traces = mlflow.search_traces(
    experiment_ids=["experiment_id_1", "experiment_id_2"],
    filter_string="name = 'predict'",
)

See the Filter Syntax section below for more details on filtering options.

Filter Syntax

The filter_string argument allows you to create powerful search queries using a SQL-like syntax. Here are the most common filter patterns:

Filter by Name

Search for traces by their name:

mlflow.search_traces(filter_string="trace.name = 'predict'")

Filter by Timestamp

Search traces created within a specific time range:

# Get traces created after a specific timestamp (in milliseconds)
timestamp = int(time.time() * 1000)
mlflow.search_traces(
    filter_string=f"trace.timestamp > {timestamp - 3600000}"  # Last hour
)

Filter by Status

Search for traces by their execution status:

# Get successful traces
mlflow.search_traces(filter_string="trace.status = 'OK'")

# Get failed traces
mlflow.search_traces(filter_string="trace.status = 'ERROR'")

Trace status must be one of ["OK", "ERROR", "IN_PROGRESS"].

Filter by Tags

Filter traces that have specific tag values:

mlflow.search_traces(filter_string="tag.model_name = 'gpt-4'")

Combine Multiple Conditions

Combine multiple filters using logical operators:

mlflow.search_traces(filter_string="trace.status = 'OK' AND tag.importance = 'high'")

Find Traces associated with an MLflow Run

To view traces associated with a specific run:

Navigate to the run details page
Click on the "Traces" tab to see all traces associated with that run

Use the run_id parameter to search for traces associated with a specific run:

import mlflow

# Search traces associated with a specific run
traces = mlflow.search_traces(run_id="run_id_123456")

Searching Traces via API

This section provides a detailed guide on using the mlflow.search_traces() API for programmatic trace retrieval.

First, let's create some sample traces to demonstrate the search functionality:

import time
import mlflow
from mlflow.entities import SpanType


# Define methods to be traced
@mlflow.trace(span_type=SpanType.TOOL, attributes={"time": "morning"})
def morning_greeting(name: str):
    time.sleep(1)
    mlflow.update_current_trace(tags={"person": name})
    return f"Good morning {name}."


@mlflow.trace(span_type=SpanType.TOOL, attributes={"time": "evening"})
def evening_greeting(name: str):
    time.sleep(1)
    mlflow.update_current_trace(tags={"person": name})
    return f"Good evening {name}."


@mlflow.trace(span_type=SpanType.TOOL)
def goodbye():
    raise Exception("Cannot say goodbye")


# Execute the methods within different experiments
morning_experiment = mlflow.set_experiment("Morning Experiment")
morning_greeting("Tom")

# Get the timestamp in milliseconds
morning_time = int(time.time() * 1000)

evening_experiment = mlflow.set_experiment("Evening Experiment")
experiment_ids = [morning_experiment.experiment_id, evening_experiment.experiment_id]
evening_greeting("Mary")
goodbye()

Return Types

The mlflow.search_traces() API can return results in two formats, controlled by the return_type parameter:

# Default: Return as Pandas DataFrame
trace_df = mlflow.search_traces(experiment_ids=[morning_experiment.experiment_id])

# Return as list of Trace objects
trace_list = mlflow.search_traces(
    experiment_ids=[morning_experiment.experiment_id], return_type="list"
)

note

The return_type parameter is only available in MLflow version 2.21.1 and later. If you are using an older version, the mlflow.search_traces() API will always return a Pandas DataFrame. To get a list of Trace objects, you can use the MlflowClient.search_traces() API instead.

DataFrame Schema

When using return_type="pandas" (default), the returned DataFrame includes these columns:

request_id: A primary identifier of a trace
trace: A trace object
timestamp_ms: The start time of the trace in milliseconds
status: The status of the trace
execution_time_ms: The duration of the trace in milliseconds
request: The input to the traced logic
response: The output of the traced logic
request_metadata: Key-value pairs associated with the trace
spans: Spans in the trace
tags: Tags associated with the trace

Order Traces

You can order the search results using the order_by parameter:

mlflow.search_traces(
    experiment_ids=experiment_ids, order_by=["timestamp_ms DESC"]  # Most recent first
)

Pass multiple columns to the order_by parameter to sort by multiple fields:

mlflow.search_traces(
    experiment_ids=experiment_ids, order_by=["timestamp_ms DESC", "status ASC"]
)

Unpack Span Fields into DataFrame

When using the DataFrame return type, you can extract specific fields from spans into separate columns using the extract_fields parameter:

traces = mlflow.search_traces(
    extract_fields=[
        # Extract the "name" field in the "morning_greeting" span inputs
        "morning_greeting.inputs.name",
        # Extract all output fields in the "morning_greeting" span
        "morning_greeting.outputs",
    ],
    experiment_ids=[morning_experiment.experiment_id],
)

print(traces)

The output Pandas DataFrame will contain additional columns for the extracted span fields:

    request_id                              ...     morning_greeting.inputs.name   morning_greeting.outputs
0   053adf2f5f5e4ad68d432e06e254c8a4        ...     'Tom'                          'Good morning Tom.'

This feature is particularly useful for creating evaluation datasets:

eval_data = traces.rename(
    columns={
        "morning_greeting.inputs.name": "inputs",
        "morning_greeting.outputs": "ground_truth",
    }
)
results = mlflow.evaluate(
    model,
    eval_data,
    targets="ground_truth",
    model_type="question-answering",
)

note

The extract_fields parameter is only supported when using return_type="pandas".

Pagination

For large result sets, use the MlflowClient.search_traces() API for pagination:

from mlflow import MlflowClient

client = MlflowClient()
page_token = None

while True:
    results = client.search_traces(
        experiment_ids=experiment_ids,
        page_token=page_token,
        max_results=100,  # Number of results per page
    )

    # Process the current page of results
    for trace in results:
        # Do something with each trace
        print(trace.request_id)

    # Check if there are more pages
    if not results.token:
        break

    page_token = results.token

Filter Syntax​

Filter by Name​

Filter by Timestamp​

Filter by Status​

Filter by Tags​

Combine Multiple Conditions​

Find Traces associated with an MLflow Run​

Searching Traces via API​

Return Types​

DataFrame Schema​

Order Traces​

Unpack Span Fields into DataFrame​

Pagination​