8 posts tagged with "agents"
Agent Optimization Pipeline
Build a tool-calling agent, evaluate it with domain-specific judges, align those judges to expert feedback, and optimize the system prompt with GEPA.
Tags: evaluation, optimization, agents, prompts
Tracing and Evaluating a LangGraph Agent
Build a tool-calling travel planning agent with LangGraph, trace every step with MLflow, and evaluate tool selection accuracy.
Tags: agents, tracing, evaluation, langgraph
Evaluating a Multi-Turn Conversational Agent
Evaluate multi-turn customer support chat quality with MLflow's conversational scorers.
Tags: agents, evaluation, multi-turn
Tracing and Evaluating OpenAI Agents
Build an e-commerce agent with OpenAI function calling, trace it with MLflow, and evaluate tool selection accuracy.
Tags: agents, tracing, evaluation, openai
Evaluating Databricks Genie Spaces
A complete pipeline for tracing, evaluating, and improving a Databricks Genie space using MLflow.
Tags: databricks, genie, evaluation, tracing, agents
Genie Evaluation with LLM Judges
Score Genie traces with built-in and custom judges to find quality issues in responses and SQL generation.
Tags: databricks, genie, evaluation, agents
Genie Space Improvement Generator
Take traces that failed evaluation, combine them with your Genie space config, and generate copy-paste-ready fixes with an LLM.
Tags: databricks, genie, evaluation, agents
Genie Conversation Tracing Pipeline
Pull conversations from a Genie space and log each one as an MLflow trace for inspection and evaluation.
Tags: databricks, genie, tracing, agents