MLflow

8 posts tagged with "agents"

Agent Optimization Pipeline

Build a tool-calling agent, evaluate it with domain-specific judges, align those judges to expert feedback, and optimize the system prompt with GEPA.

evaluation, optimization, agents, prompts

Tracing and Evaluating a LangGraph Agent

Build a tool-calling travel planning agent with LangGraph, trace every step with MLflow, and evaluate tool selection accuracy.

agents, tracing, evaluation, langgraph

Evaluating a Multi-Turn Conversational Agent

Evaluate multi-turn customer support chat quality with MLflow's conversational scorers.

agents, evaluation, multi-turn

Tracing and Evaluating OpenAI Agents

Build an e-commerce agent with OpenAI function calling, trace it with MLflow, and evaluate tool selection accuracy.

agents, tracing, evaluation, openai

Evaluating Databricks Genie Spaces

A complete pipeline for tracing, evaluating, and improving a Databricks Genie space using MLflow.

databricks, genie, evaluation, tracing, agents

Genie Evaluation with LLM Judges

Score Genie traces with built-in and custom judges to find quality issues in responses and SQL generation.

databricks, genie, evaluation, agents

Genie Space Improvement Generator

Take traces that failed evaluation, combine them with your Genie space config, and use an LLM to generate fixes you can copy and paste.

databricks, genie, evaluation, agents

Genie Conversation Tracing Pipeline

Pull conversations from a Genie space and log each one as an MLflow trace for inspection and evaluation.

databricks, genie, tracing, agents