Tired of Reviewing Traces? Meet Automatic Issue Detection for Your Agent

April 9, 2026 · 4 min read

Observability has become a norm for AI agents in production. But recording logs, metrics, and traces alone doesn't make the user experience better. You need to act on the data.

The problem is that finding actionable insights from massive traces is a needle-in-a-haystack problem. From thousands of production logs, how do you spot conversations where users got frustrated or dropped out? How do you catch an agent that silently returns wrong answers with high confidence and misleads users? Manually reviewing traces one by one doesn't scale.

Today we're announcing Automatic Issue Detection in MLflow, a new AI-driven Insights feature that replaces hours of manual check and triage to just 3 clicks.

Why Do You Need Automatic Issue Detection?

As LLM applications grow in production, maintaining agent quality becomes increasingly challenging. Four problems compound as traffic scales:

Manual review doesn't scale: Inspecting individual traces one by one is unsustainable as request volume grows.
Unclear criteria: It's hard to know which quality dimensions to measure without predefined baselines to start from.
Scattered failure patterns: Related failures spread across thousands of traces with no systematic grouping, making recurring issues easy to miss.
Unstructured tracking: Without formal issue management, identified problems disappear into notes and Slack threads—and regressions go undetected.

Automatic Issue Detection moves teams from reactive, manual debugging to proactive, systematic quality identification.

Getting Started

Issue Detection is built into the MLflow UI and runs directly against the traces you've already collected.

How It Works

The CLEARS Framework

MLflow organizes issue detection across six quality dimensions, forming the CLEARS framework (Correctness, Latency, Execution, Adherence, Relevance, Safety). Choose which categories to focus on based on your application's requirements:

The Detection Pipeline

Once you start an analysis run, MLflow:

Samples and analyzes traces using an LLM of your choice (via MLflow AI Gateway or a direct API connection)
Clusters related problems so you see patterns, not just a flat list of individual failures
Annotates the source traces with specific findings so you can drill straight to the evidence
Generates a summary of key findings, severity distribution, and recommended next steps

The analysis runs asynchronously with real-time progress tracking, so you can kick it off and come back to the results.

In-progress Job

Issue Triage

Detected issues aren't fire-and-forget alerts. Each one can be moved through a structured lifecycle:

Pending — newly surfaced, needs review
Resolved — fix has been deployed and verified
Rejected — investigated and determined not to be a real problem

Issue tracking

You can also edit issue descriptions and adjust severity ratings as your team learns more. This gives you a living record of your application's quality history, not just a snapshot.

Summary

Automatic Issue Detection is available today in MLflow 3.11.1+ as part of MLflow AI Insights. To get started:

Trace your LLM application with MLflow to collect the data Issue Detection needs.
Detect Issues by selecting your CLEARS categories and running analysis against your traces.
Triage and resolve findings through the structured issue lifecycle—Pending, Resolved, or Rejected.

If you find this useful, give us a star on GitHub: github.com/mlflow/mlflow ⭐️

Have questions or feedback? Open an issue or join the Slack channel.

LLMs & Agents

Model Training

LLMs & Agents

Model Training

Cookbook

Ambassador Program

Tired of Reviewing Traces? Meet Automatic Issue Detection for Your Agent

Why Do You Need Automatic Issue Detection?

Getting Started