MLflow 3.9.0rc0
We're excited to announce MLflow 3.9.0rc0, a pre-release including several notable updates:
Major New Features:
- MLflow Assistant: Figuring out the next steps to debug your apps and agents can be challenging. We're excited to introduce the MLflow Assistant, an in-product chatbot that can help you identify, diagnose, and fix issues. The assistant is backed by Claude Code, and directly passes context from the MLflow UI to Claude. Click on the floating "Assistant" button in the bottom right of the MLflow UI to get started!
- Trace Overview Dashboard: You can now get insights into your agent's performance at a glance with the new "Overview" tab in GenAI experiments. Many pre-built statistics are available out of the box, including performance metrics (e.g. latency, request count), quality metrics (based on assessments), and tool call summaries. If there are any additional charts you'd like to see, please feel free to raise an issue in the MLflow repository!
- AI Gateway: We're revamping our AI Gateway feature! AI Gateway provides a unified interface for your API requests, allowing you to route queries to your LLM provider(s) of choice. In MLflow 3.9.0rc0, the Gateway server is now located directly in the tracking server, so you don't need to spin up a new process. Additional features such as passthrough endpoints, traffic splits, and fallback models are also available, with more to come soon! For more detailed information, please take a look at the docs.
- Online Monitoring with LLM Judges: Configure LLM judges to automatically run on your traces, without having to write a line of code! You can either use one of our pre-defined judges, or provide your own prompt and instructions to create custom metrics. Head to the new "Judges" tab within the GenAI Experiment UI to get started.
- Judge Builder UI: Define and iterate on custom LLM judge prompts directly from the UI! Within the new "Judges" tab, you can create your own prompt for an LLM judge, and test-run it on your traces to see what the output would be. Once you're happy with it, you can either use it for online monitoring (as mentioned above), or use it via the Python SDK for your evals.
- Distributed Tracing: Trace context can now be propagated across different services and processes, allowing you to track request lifecycles from end to end. The related APIs are defined in the mlflow.tracing.distributed module (with more documentation to come soon).
- MemAlign - a new judge optimizer algorithm: We're excited to introduce MemAlignOptimizer, a new algorithm that makes your judges smarter over time. It learns general guidelines from past feedback while dynamically retrieving relevant examples at runtime, giving you more accurate evaluations.
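
For readers new to the idea behind the Distributed Tracing item above, here is a minimal, library-free sketch of what "propagating trace context" between two services generally involves. It does not use the mlflow.tracing.distributed module, whose function names and header format are not described in these notes. Instead, it assumes the W3C Trace Context `traceparent` header layout as a common convention, and every function name below is hypothetical.

```python
import secrets


def new_traceparent() -> str:
    """Build a traceparent-style value: version-trace_id-span_id-flags."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def parse_traceparent(value: str) -> dict:
    """Split a traceparent-style value into its parts, with basic validation."""
    version, trace_id, parent_span_id, flags = value.split("-")
    if len(trace_id) != 32 or len(parent_span_id) != 16:
        raise ValueError(f"malformed traceparent: {value!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "flags": flags,
    }


def client_headers() -> dict:
    """Caller side (hypothetical): attach the current context to outgoing headers."""
    return {"traceparent": new_traceparent()}


def handle_request(headers: dict) -> dict:
    """Callee side (hypothetical): continue the caller's trace with a child span."""
    ctx = parse_traceparent(headers["traceparent"])
    return {
        "trace_id": ctx["trace_id"],              # same trace as the caller
        "parent_span_id": ctx["parent_span_id"],  # links back to the caller's span
        "span_id": secrets.token_hex(8),          # new span for this service
    }


if __name__ == "__main__":
    outgoing = client_headers()
    span = handle_request(outgoing)
    print(outgoing["traceparent"])
    print(span)
```

The point of the sketch is that both services end up recording spans under one shared trace ID, which is what lets a request be followed end to end. For real usage, rely on the APIs in mlflow.tracing.distributed once their documentation is available.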
Stay tuned for the full release, which will be packed with even more features and bugfixes.
To try out this release candidate, please run:
pip install mlflow==3.9.0rc0
Please try it out and report any issues on the issue tracker.
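
Before filing an issue, it can help to confirm that the release candidate is actually the version installed in your environment. The following is a small sketch using only the Python standard library; the helper names `installed_version` and `describe` are illustrative and not part of MLflow.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "3.9.0rc0"  # the release candidate announced in these notes


def installed_version(package: str = "mlflow"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def describe(installed, expected: str = EXPECTED) -> str:
    """Summarize whether the installed version matches the expected one."""
    if installed is None:
        return "mlflow is not installed"
    if installed == expected:
        return f"OK: running {expected}"
    return f"found {installed}, expected {expected}"


if __name__ == "__main__":
    print(describe(installed_version()))
```

Including the reported version string in an issue makes it easier to tell whether a problem is specific to the release candidate.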



