Using LangChain Agents with MLflow

Welcome to an engaging and educational tutorial designed to dive deep into the world of LangChain agents and their integration with MLflow. This notebook-based tutorial is tailored to offer a practical and comprehensive understanding of LangChain agents and their practical applications with a modest dose of absurdity.

What This Tutorial Covers

  • Understanding LangChain Agents: Gain insights into what LangChain agents are and how they function in complex decision-making scenarios.

  • Introduction to MLflow Integration: Explore how LangChain integrates with MLflow, a powerful tool for the lifecycle management of machine learning models.

Background on LangChain and Agents

  • LangChain Overview: LangChain is a Python framework designed to simplify the creation and deployment of applications powered by language models, particularly in tasks requiring contextual understanding and decision-making.

  • Agents in LangChain: Agents are dynamic components within LangChain that use language models to decide on sequences of actions based on given objectives and available tools. Unlike static workflows, agents can adapt their responses and strategies based on the evolving context.

Integration with MLflow

  • MLflow’s Role: In this tutorial, MLflow plays a crucial role in logging, tracking, and deploying the LangChain agent. You’ll learn how to effectively use MLflow to manage the lifecycle of your LangChain agent, ensuring a streamlined process from development to deployment.

Tutorial Overview

In this tutorial, we will:

  1. Initialize a LangChain Agent: Set up an agent with specific tools and a language model, ready to tackle our chosen task.

  2. Log and Load the Agent Model with MLflow: Demonstrate the process of logging the agent in MLflow and then loading it for execution.

  3. Run a Real-World Prediction: Use the agent to determine how much snow would result from the latest year’s rainfall in Raleigh, NC, and conceptualize the size of a snowman that could be built with it.

By the end of this tutorial, you will have a solid grasp of LangChain agents and MLflow’s capabilities in enhancing these agents’ development and deployment.

Let’s embark on this exciting journey into the realm of LangChain agents and MLflow!

Prerequisites

In our LangChain Agents tutorial, we leverage an important dependency known as SerpAPI. This tool is pivotal in providing our LangChain agent with API access to various search engines, crucial for retrieving real-time data from the web.

Installing SerpAPI

To use SerpAPI, you’ll need to install it via pip; the package is named google-search-results. Once installed, create an account at SerpAPI’s Official Website and retrieve an API key. To use this integration with LangChain, simply load the toolset via load_tools(["serpapi"]) to give the LangChain agent access to the Google Search APIs.

pip install google-search-results

As we move forward in the tutorial, we will explore how to effectively use SerpAPI to empower our LangChain agent.

[1]:
import os

from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

import mlflow
from mlflow.models import infer_signature

assert "OPENAI_API_KEY" in os.environ, "Please set the OPENAI_API_KEY environment variable."
assert "SERPAPI_API_KEY" in os.environ, "Please set the SERPAPI_API_KEY environment variable."

Setting Up the LangChain Agent with Tools and OpenAI LLM

In this part of the tutorial, we delve into the initialization of a LangChain agent, a key step in building our application. This process involves configuring the language model and defining the tools that the agent will utilize to perform its tasks.

Language Model Configuration

  • OpenAI LLM: We initialize the OpenAI language model (llm) with a specific temperature setting. The temperature parameter controls the randomness of the language model’s responses, with a lower value leading to more predictable and conservative outputs.

[2]:
# NOTE: The LangChain OpenAI wrapper uses `text-davinci-003` as the default model type as of version 0.0.331
llm = OpenAI(model="text-davinci-003", temperature=0)

Defining Tools for the Agent

  • Tool Selection: The agent is equipped with two primary tools – a web search engine interface (serpapi) and a math execution engine (llm-math). These tools are crucial for enabling the agent to retrieve information from the web and perform mathematical computations, respectively.

  • Tool Loading: The load_tools function is used to load these tools and associate them with our language model. This integration ensures that the agent has the necessary capabilities to handle complex queries that involve both data retrieval and quantitative analysis.

[3]:
# Define the two tools that the agent will use: a web search engine interface and a math execution engine.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

Agent Initialization

  • Agent Type: We create the agent using the initialize_agent function and specify the type of agent as ZERO_SHOT_REACT_DESCRIPTION. This agent type uses the ReAct (Reason + Act) prompting style and chooses tools based solely on their descriptions, meaning it can handle tasks without prior task-specific training or examples.

  • Verbose Mode: The agent is initialized in verbose mode, which allows us to see detailed logs of its decision-making process. This feature is particularly useful for understanding how the agent interprets and responds to different inputs.

Through this setup, we have laid the foundation for an intelligent LangChain agent capable of handling a variety of tasks by leveraging the power of language models and specific functional tools. As we progress, we will see this agent in action, demonstrating its potential in practical applications.

[4]:
# Create the agent
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

Logging and Loading the LangChain Agent Using MLflow

In this crucial phase of our tutorial, we focus on the integration of the LangChain agent with MLflow. This process involves logging the agent model in MLflow and subsequently loading it for use.

Logging the Agent in MLflow

  • Starting an MLflow Run: We initiate an MLflow run, a key step in MLflow’s tracking system, which is used to organize and manage model development activities.

  • Model Logging: The agent is logged into MLflow using the mlflow.langchain.log_model function. This function not only saves the agent model but also stores it under a specified artifact path, in our case, “search-calculation”.

Loading the Model for Use

  • Model Retrieval: After logging, we load the model using MLflow’s pyfunc.load_model function. This step is essential for activating the agent model, making it ready for executing predictions or tasks.

  • Model URI: The model_info.model_uri provides the unique identifier for the logged model, ensuring that we are loading the correct version of the agent model for our application.

Preparation for Logging

A good practice to follow, prior to logging a model, is to validate the model’s input and output signature. This can be used to explicitly define the expected signature during model logging to ensure that inference with the model is validated properly.

We can define the signature by passing in an example input to our model and generating a signature inference based on these values.

[5]:
input_example = {
    "input": "How many apples can fit inside a standard 8 cubic meter cement mixer truck's drum?"
}

prediction = agent.run(input_example)

signature = infer_signature(input_example, prediction)

signature


> Entering new AgentExecutor chain...
 I need to know the size of an apple and the size of the drum
Action: Search
Action Input: "size of an apple"
Observation: Commercial growers aim to produce an apple that is 7 to 8.5 cm (2+3⁄4 to 3+1⁄4 in) in diameter, due to market preference. Some consumers, especially in Japan, prefer a larger apple, while apples less than 5.5 cm (2+1⁄4 in) are generally used for juicing and have little fresh market value.
Thought: I need to know the size of a cement mixer truck's drum
Action: Search
Action Input: "size of a cement mixer truck's drum"
Observation: Truck mixers are the most common mode of PCC transport. They consist of a truck-mounted drum that rotates on an inclined axis. A typical mixing truck (Figures 1 and 2) uses a 6.9 – 9.2 m3 (9 – 12 yd3) mixing drum, the size being limited due to gross vehicle weight of the loaded truck.
Thought: I now know the size of an apple and the size of the drum
Action: Calculator
Action Input: 8 cubic meters divided by 0.0077 cubic meters
Observation: Answer: 1038.9610389610389
Thought: I now know the final answer
Final Answer: Approximately 1039 apples can fit inside a standard 8 cubic meter cement mixer truck's drum.

> Finished chain.
[5]:
inputs:
  ['input': string]
outputs:
  [string]
params:
  None

While the math is off by about an order of magnitude, we do have our input and output signature, which we will use when logging the model.
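As a rough sanity check, we can redo the arithmetic locally. This is only a sketch: it assumes a hypothetical ~7.5 cm apple modeled as a sphere (roughly the commercial size the agent’s search surfaced) and ignores packing efficiency entirely.

```python
import math

# Assumed apple: ~7.5 cm diameter sphere (an illustrative estimate, not a measured value)
apple_volume_m3 = (4 / 3) * math.pi * (0.075 / 2) ** 3  # ≈ 0.00022 m³

drum_volume_m3 = 8.0
apples = drum_volume_m3 / apple_volume_m3

print(f"Apple volume: {apple_volume_m3:.5f} m³")
print(f"Approximate apples (ignoring packing): {apples:,.0f}")
```

Under these assumptions the drum holds roughly 36,000 apples, while the agent divided by 0.0077 m³ per apple and arrived at about 1,039 — hence the order-of-magnitude discrepancy noted above.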

[6]:
mlflow.set_experiment("Google Agent")

with mlflow.start_run():
    model_info = mlflow.langchain.log_model(
        lc_model=agent,
        artifact_path="search-calculation",
        input_example=input_example,
        signature=signature,
        extra_pip_requirements=["google-search-results==2.4.2"],
    )

# Load our agent model for use
loaded_model = mlflow.pyfunc.load_model(model_info.model_uri)
2024/01/04 21:54:50 INFO mlflow.tracking.fluent: Experiment with name 'Google Agent' does not exist. Creating a new experiment.

If we open up the MLflow UI, we can see our model with our signature and our input example that we defined.

Google Agent in the MLflow UI

LangChain Agent Prediction and Result Analysis

In this segment of the tutorial, we utilize our loaded LangChain agent model to make a prediction. The agent is tasked with determining the volume of snow resulting from rainfall in Raleigh, NC, and conceptualizing the size of a snowman that could be built from this snow.

Agent Prediction Task

The agent receives a complex query that involves several steps:

  • Determining the most recent year’s rainfall in Raleigh, NC.

  • Converting this rainfall amount into snow, using a depth conversion ratio of snow to rain.

  • Calculating the total volume of snow for 1 acre of land.

  • Estimating the size of a snowman that could be made from this volume of snow.
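The conversion these steps describe can be sketched directly. This is a back-of-the-envelope check using the 64.2-inch record-year figure that the agent’s search surfaces below, the 10:1 snow-to-rain ratio from the query, and 43,560 square feet per acre:

```python
rainfall_inches = 64.2        # record annual precipitation found by the agent's search
snow_to_rain_ratio = 10       # 10 inches of snow per inch of rain
acre_sq_ft = 43_560

snow_depth_ft = rainfall_inches * snow_to_rain_ratio / 12  # convert inches to feet
snow_volume_cu_ft = snow_depth_ft * acre_sq_ft

print(f"Snow depth: {snow_depth_ft:.1f} ft")
print(f"Snow volume over 1 acre: {snow_volume_cu_ft:,.0f} cubic feet")
```

Note that the agent’s own Calculator step below multiplies the inch value directly by square feet, skipping the inches-to-feet conversion, so its 27,965,520 figure is twelve times the 2,330,460 cubic feet computed here — another reminder to double-check LLM-generated arithmetic.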

Result Analysis

The agent’s output demonstrates its ability to:

  • Navigate through complex queries involving multiple steps.

  • Integrate search and calculation tools to derive relevant information.

  • Translate quantitative data into a practical and imaginative scenario.

Conclusion of Agent’s Task

The agent successfully completes its task by providing a comprehensive answer, showcasing the practical utility of LangChain agents in processing and synthesizing information to answer multifaceted queries. This example illustrates the power of integrating advanced language models with specialized tools to solve real-world problems in creative and informative ways.

[7]:
agent_response = loaded_model.predict(
    [
        {
            "input": "How much rain has fallen in Raleigh, NC for the most recent year? "
            "If that rain fell as snow, which has a depth conversion of snow to rain of 10 to 1, what is the total "
            "volume of snow in cubic feet for 1 acre of land? "
            "How big of a snowman could I make if I used all of that snow?"
        }
    ]
)


> Entering new AgentExecutor chain...
 I need to find out the amount of rain that has fallen in Raleigh, NC and then convert that to snow.
Action: Search
Action Input: "Rainfall in Raleigh, NC"
Observation: {'type': 'weather_result', 'temperature': '37', 'unit': 'Fahrenheit', 'precipitation': '0%', 'humidity': '42%', 'wind': '11 mph', 'location': 'Raleigh, NC', 'date': 'Tuesday 10:00 AM', 'weather': 'Sunny'}
Thought: I need to find out the amount of rain that has fallen in Raleigh, NC for the most recent year.
Action: Search
Action Input: "Rainfall in Raleigh, NC for the most recent year"
Observation: ['Get 6-month, year-to-date and 12-month cumulative rainfall totals vs ... 4 NC counties are under alert, including Cumberland and Johnston counties. RDU Rainfall ...', 'Weather in Raleigh for each year since 2010 including average high and low temperatures, total snowfall and precipitation.', 'The bottom portion of the graph indicates the precipitation for the year to date. The yellow line (and green area below it) represents the normal year to date ...', 'North Carolina Cumulative Rainfall Map with. National Weather Service Radar Overlay ; RAINGAGE AT WALNUT CR BUCK JONES RD AT RALEIGH, NC, 0, 0 ; RAINGAGE AT FIRE ...', 'Year Selection. 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000 ...', 'Year Selection. 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011 ... Precipitation (past 24 hours from 05:51:00), 0.00, 5.60, -. Dew ...', 'Raleigh, NC past weather data including previous temperature, barometric pressure, humidity, dew point, rain total, and wind conditions.', 'According to NOAA records, the most precipitation that Raleigh, North Carolina has ever received in a single calendar year is 64.2 inches which occurred in ...', 'Typically, Raleigh averages 41.5 inches of liquid precipitation by the end of the year, while Seattle sees about 34.4 inches. Report a ...', 'Raleigh gets some kind of precipitation, on average, 108 days per year. Precipitation is rain, snow, sleet, or hail that falls to the ground. In order for ...']
Thought: I need to calculate the total volume of snow in cubic feet for 1 acre of land.
Action: Calculator
Action Input: 64.2 inches of rain * 10 (snow to rain conversion) * 43560 (square feet in an acre)
Observation: Answer: 27965520.0
Thought: I now know the final answer.
Final Answer: 27965520.0 cubic feet of snow for 1 acre of land. This amount of snow could make a very large snowman.

> Finished chain.
[8]:
print(agent_response[0])
27965520.0 cubic feet of snow for 1 acre of land. This amount of snow could make a very large snowman.

Calculating the Number of Standard Snowmen with LangChain Agent

In this section, we utilize the loaded LangChain agent model to compute the number of standard snowmen that can be made from a given volume of snow. This calculation is an extension of our previous query, demonstrating the agent’s ability to engage in sequential reasoning and complex calculations.

Agent’s Calculation Task

The agent is tasked with a new query: to determine how many standard snowmen could be made from 27,965,520 cubic feet of snow.

Verbose Mode Output and Analysis

With verbose mode active, we observe the agent’s step-by-step approach:

  • Initial Query Understanding: The agent recognizes the need to determine the volume of a standard snowman.

  • Search for Standard Snowman Volume: It performs a web search to find the volume of a standard snowman.

  • Observation and Calculation: The agent finds relevant data and calculates the volume of a standard snowman based on given dimensions.

  • Final Answer: After obtaining the volume of a single snowman, the agent divides the total snow volume by the volume of one snowman to find the total number of snowmen that can be made.

Agent’s Output

The agent concludes that approximately 327,072.8 standard snowmen could be made from the given volume of snow. This result showcases the agent’s capability to:

  • Conduct targeted searches and extract pertinent information from various sources.

  • Perform mathematical calculations and apply them to practical scenarios.

  • Provide comprehensive, quantitative responses to multifaceted queries.
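We can reproduce the final division locally as a sketch, using the 84.69 cubic-foot snowman volume from the agent’s Calculator observation:

```python
total_snow_cu_ft = 27_965_520
snowman_volume_cu_ft = 84.69053115669803  # from the agent's Calculator observation

snowmen = total_snow_cu_ft / snowman_volume_cu_ft
print(f"{snowmen:,.1f} snowmen")
```

The local result (≈330,208) differs from the agent’s reported 327,072.8 by about 1% — small, but one more reason to validate LLM arithmetic before relying on it.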

Conclusion of the Agent’s Task

This exercise illustrates the advanced analytical and problem-solving skills of LangChain agents. It highlights the agent’s proficiency in combining search and calculation tools to derive meaningful answers, emphasizing the practical applications of such agents in real-world scenarios.

[9]:
calculated_response = loaded_model.predict(
    [{"input": "How many standard snowmen could I make with 27965520 cubic feet of snow?"}]
)


> Entering new AgentExecutor chain...
 I need to figure out the volume of a standard snowman
Action: Search
Action Input: "volume of a standard snowman"
Observation: ['The perfect snowman calculator uses math & science rules to help you design the snowman of your dreams!', 'The volume of a sphere is (4/3)πr3, so the total volume of the snowman is (4/3)π(1.75)3 + (4/3)π(1.9)3 + (4/3)π(2)3 ≈ 84.7 cubic feet.', "A snowman base 2 meters in diameter would result in a sphere of packed snow amounting to ~3 cubic meters, or roughly 3 tons. Which doesn't count ...", 'The calculations assumed that a “typical snowman” could be modeled by a cylinder with a volume of 40 cubic feet, and that a snowflake could ...', 'What is the total volume of snow used to make the snowman if the head is 12 inches wide, the middle is 16 inches wide and the bottom is 18 ...', 'It snowed in England, and so, I built a snowman :) I built him; measured him and calculated how much snow was used to make him.', 'The challenge is to see which team can create the snowman with the largest estimated volume. ... Standards. Log in. to see state-specific standards (only ...', 'Dr. Anna Szczepanek created the "Snowman Calculator" which lets you put in a few things about your current snowy situation -- how much snow fell ...', 'Dimensions: 5 x 5.625 x 9.5 ; Volume: 12 oz ; How to Order:.', 'The volume of a given shape implies the amount of substance it contains in 3-Dimension. Thus, the volume of the snowman is approximately 23 m³ ...']
Thought: I need to calculate the volume of a standard snowman
Action: Calculator
Action Input: (4/3)π(1.75)3 + (4/3)π(1.9)3 + (4/3)π(2)3
Observation: Answer: 84.69053115669803
Thought: I now know the final answer
Final Answer: 327072.8 snowmen

> Finished chain.
[10]:
print(calculated_response[0])
327072.8 snowmen

That’s a lot of snowmen.

snowmen

Or, as the LLM put it, “This amount of snow could make a very large snowman.”

snowman

Indeed.

Note: Images created with the use of DALL·E 3

Concluding Our LangChain Agents and MLflow Tutorial

As we wrap up this fun tutorial, let’s take a moment to reflect on the fascinating journey we’ve embarked on with LangChain agents and MLflow. We ventured through the realms of advanced language model applications, culminating in a whimsical yet insightful exploration of snowmen creation – a topic every child (and child-at-heart) has pondered!

Key Takeaways from the Tutorial

  • Versatility of LangChain Agents: We’ve seen firsthand how LangChain agents can navigate complex tasks, from gathering real-time data to performing intricate calculations. Their ability to process and synthesize information from diverse sources is nothing short of impressive.

  • Power of MLflow in Model Management: MLflow has proven to be an invaluable asset in our journey. Its capabilities in tracking, logging, and deploying LangChain models have streamlined our workflow, making the management of complex machine learning processes more accessible and efficient.

  • Seamless Integration: The integration of LangChain with MLflow has demonstrated how two powerful tools can come together to create a robust framework for developing and deploying AI-driven applications.

Wrapping Up with a Smile

As we conclude, remember that the world of LangChain agents and MLflow is as vast as it is fascinating. Whether you’re calculating snowmen, predicting weather patterns, or performing LLM-assisted competitive market analysis, the opportunities are boundless when it comes to creating powerful and complex agents.

Thank you for joining us on this unique educational adventure. We hope it has sparked your curiosity and inspired you to delve deeper into the capabilities of LangChain and MLflow. Here’s to many more fun and informative explorations in the world of machine learning!

Happy coding, and may your snowmen always be perfectly calculated!

What’s next?

If you’d like to learn more about how MLflow and LangChain integrate, see the other advanced tutorials for MLflow’s LangChain flavor.