Ok, let's dive into LangChain and discover how to build applications with LLMs! 😊
Lesson Goals¶
- Understand fundamental principles of building applications with Large Language Models (LLMs)
- Gain hands-on experience using LangChain to develop LLM-powered solutions
- Learn integration techniques for connecting LLMs with external tools and data sources
- Master the design and implementation of autonomous agents that can perform tasks independently
Autonomous Agents¶
Autonomous agents are software entities that can perform tasks or make decisions without human intervention.
- Autonomous?
- Operate independently, making decisions based on their programming and learned experiences (via environment interactions).
- This is a long-standing research topic in AI
- Recently, LLMs have been used to create agents (perhaps not completely autonomous)
- How can LLMs be conceived as agents?
LLM - In a Nutshell¶
A Language Model is a Machine Learning model that can understand and generate human language.
- How does it work internally?
- Long story, please look at: 3Blue1Brown - How does a Language Model work?
- ... But in a nutshell, it takes a sentence and tries to predict the next token
- Only that?
- Yes (more or less), an LM has no memory, no environment interaction, or anything else
- ... But we can instrument these models to use external tools (or to be chained with each other) to make them powerful agents! 😊
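The "predict the next token" idea can be sketched with a toy model; here a hand-made bigram table stands in for the real neural network:

```python
# Toy "language model": a tiny bigram table instead of a neural network.
bigram = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 1.0},
}

def predict_next(token: str) -> str:
    # Greedy decoding: always pick the most probable continuation
    return max(bigram[token], key=bigram[token].get)

sentence = ["the"]
while sentence[-1] in bigram:
    sentence.append(predict_next(sentence[-1]))

print(sentence)  # ['the', 'cat', 'sat', 'down']
```

A real LLM does the same loop, but the "table" is a neural network over tens of thousands of tokens and the whole preceding context, not just the last word.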
LLMs Panorama: How to Use Them? 🌐🤖¶
Access LLMs through various providers and models:
- API Providers:
- OpenAI, Anthropic (Claude), Google (Gemini), etc.
- Open Source Models (HuggingFace):
- Llama Family, Qwen, Gemma, Mistral, etc.
Despite the differences, they all share the same concept:
- Input: A text prompt (question, instruction, etc.)
- Output: A text response (answer, completion, etc.)
LLM Applications: How to Build Them?¶
- Challenges in Building Applications with LLMs:
- Input Parsing: How to effectively process user inputs?
- Output Formatting: How to structure responses for clarity and usability?
- Error Handling: How to manage unexpected behaviors or failures?
- Conversation Management: How to maintain context in multi-turn interactions?
- External Data Integration: How to connect with databases, APIs, or other data sources?
- Workflow Orchestration: How to seamlessly integrate multiple steps or components?
Solution: This is where LangChain excels!
LangChain - Models¶
LangChain provides a unified interface for interacting with a wide range of Large Language Models (LLMs):
- Supports major providers such as OpenAI, Anthropic, Google, and a variety of open-source models from HuggingFace.
- Enables seamless integration of different LLMs within a single application.
- Includes support for embeddings, which are essential for tasks like semantic search, text similarity, and information retrieval.
LLMs providers¶
- An LLM is essentially a function that takes a string as input and returns a string as output.
- The main method to interact with an LLM is the `invoke` method:
- This provides a uniform interface to interact with LLMs, regardless of the provider or model used.
# Import the Ollama LLM wrapper (Gemini is imported later)
from langchain_ollama.llms import OllamaLLM
qwen = OllamaLLM(model="qwen2.5:1.5b")
# Example query
qwen.invoke("Explain machine learning in simple terms (max 50 words)")
"Machine Learning is when computers get better at tasks without being explicitly programmed to do so. It's like training a computer to recognize pictures of cats by showing it many images, and over time, the computer starts recognizing other similar cats too. This way, it learns patterns in data that can be used for various applications, from identifying objects in videos to predicting weather trends!"
Gemini¶
- Gemini is a chat-oriented language model developed by Google.
- Designed for multi-turn conversations and advanced dialogue management.
- Suitable for tasks requiring context retention and conversational flow.
- For usage details, refer to Google AI Studio.
# Same for gemini
from dotenv import load_dotenv # used to load environment variables
load_dotenv() # Load environment variables from .env file
from langchain_google_genai import GoogleGenerativeAI
gemini = GoogleGenerativeAI(model="gemini-2.0-flash", temperature=0.0)
gemini.invoke("Explain machine learning in simple terms (max 50 words)") # Example query
'Machine learning teaches computers to learn from data without explicit programming. It identifies patterns and makes predictions or decisions based on that learning, improving its accuracy over time with more data.'
Base Language Model¶
from langchain_core.language_models.base import BaseLanguageModel
def explain_topic(llm: BaseLanguageModel, topic: str) -> str:
""" A general explain topic functionality base on an LLM"""
query = f"Explain {topic} in simple terms? (max 50 words)"
response = llm.invoke(query)
return response
explain_topic(gemini, "London"), explain_topic(qwen, "London")
("London is the capital of England and the United Kingdom. It's a huge, diverse city with a rich history, famous landmarks like Big Ben and Buckingham Palace, and a vibrant cultural scene. It's a global hub for finance, fashion, and the arts.",
"London is the capital city of England and one of the world's largest cities with over eight million residents. It sits on the River Thames, surrounded by green parks like Hyde Park. Famous landmarks include Buckingham Palace, Big Ben, Tower Bridge, and many museums. The city offers a mix of historical sites, shopping districts, restaurants, and modern attractions.")
Chat Models¶
- Normal LLMs are seen as simple stateless functions.
- Chat models are more advanced: they can handle conversations with multiple turns.
- How?
- They use a list of messages as input, where each message has a role (user, assistant, system, etc.).
- The `invoke` method is used to send a list of messages to the model and get a response.
- Note:
- Even for chat models, the model itself is stateless: the conversation history is not stored by the model—it's up to the user (or application) to manage and provide the history.
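Because the model is stateless, the application resends the whole history on every turn. In LangChain, a chat history can be passed to `invoke` as a list of `(role, content)` pairs (the conversation below is illustrative):

```python
# The application, not the model, owns the conversation history.
# Each turn, the full list (with roles) is sent back to the chat model.
history = [
    ("system", "You are a concise assistant."),
    ("human", "Hi! My name is Gianluca"),
    ("ai", "Nice to meet you, Gianluca! How can I help?"),
    ("human", "What is my name?"),  # answerable only because the history is resent
]
# reply = chat_gemini.invoke(history)  # chat model defined below; requires API access
```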
Chat Models vs BaseModel¶
| Aspect | Base Model | Chat Model |
|---|---|---|
| Input | Plain text prompt | List of structured messages |
| Designed for | One-shot completions | Multi-turn conversations |
| Output | Text completion | Message object |
| Classes in LangChain | BaseLanguageModel | BaseChatModel |
# Chat models
from langchain_google_genai import ChatGoogleGenerativeAI
chat_gemini = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0.0)
chat_gemini.invoke("Explain machine learning in simple terms").content[:100] # Example query
'Imagine you\'re teaching a dog a new trick, like "fetch." You don\'t tell the dog exactly how to run, '
Embeddings¶
- Embeddings are numerical representations of text that capture semantic meaning.
- They are specialized language models designed to represent text as vectors, rather than generate text.
- Embeddings are used for various purposes, such as augmenting LLM context, semantic search, clustering, and similarity analysis.
Embeddings - Semantics¶
- What does it mean to capture semantics?
- In the context of embeddings, semantics refers to the meaning and relationships between words or phrases.
- As illustrated in the image below, words with similar meanings are positioned close to each other in the vector space.
- By applying distance or similarity measures, we can identify words or texts that are semantically related.
# Embeddings
from langchain_google_genai.embeddings import GoogleGenerativeAIEmbeddings
google_embeddings = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
google_embeddings.embed_query("What is the weather like in London?")[:5]
[-0.024339474737644196, 0.019421163946390152, -0.005040269810706377, -0.021117379888892174, -0.030608780682086945]
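The standard similarity measure over such vectors is cosine similarity. A stdlib-only sketch with tiny made-up 3-dimensional "embeddings" (real vectors have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors: related words point in similar directions
cat    = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
car    = [0.0, 0.1, 0.95]

print(cosine_similarity(cat, kitten))  # close to 1.0: semantically similar
print(cosine_similarity(cat, car))     # close to 0.0: unrelated
```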
Embeddings - Visual Example of Semantic Similarity¶
- The dataset contains a list of questions and answers on three distinct topics.
- For each question and answer, we compute the corresponding embeddings.
- We then use PCA to project the embeddings into two dimensions and visualize whether clusters emerge.
import pandas as pd
df = pd.read_csv("data/dataset.csv", delimiter=";") # Example of reading a CSV file with pandas
df.head()
| | Topic | Question | Answer |
|---|---|---|---|
| 0 | Naruto | Who is the main protagonist of the series? | Naruto Uzumaki |
| 1 | Naruto | What is the name of Naruto's signature ninja t... | Rasengan |
| 2 | Naruto | Which village does Naruto belong to? | Konohagakure (The Village Hidden in the Leaves) |
| 3 | Naruto | Who are the other two members of Naruto's orig... | Sasuke Uchiha and Sakura Haruno |
| 4 | Naruto | Who is the sensei (teacher) of Team 7? | Kakashi Hatake |
# We add embedding to the pandas dataframe
from langchain_core.embeddings import Embeddings
def embed_row(row: pd.Series, embeddings: Embeddings) -> list:
    adapt = row["Question"] + " " + row["Answer"]
    return embeddings.embed_query(adapt)
df["embeddings"] = df.apply(lambda row: embed_row(row, google_embeddings), axis=1)
df.head() # Display the first few rows of the DataFrame with embeddings
| | Topic | Question | Answer | embeddings |
|---|---|---|---|---|
| 0 | Naruto | Who is the main protagonist of the series? | Naruto Uzumaki | [-0.06318295747041702, -0.025433462113142014, ... |
| 1 | Naruto | What is the name of Naruto's signature ninja t... | Rasengan | [-0.016836581751704216, -0.02863299660384655, ... |
| 2 | Naruto | Which village does Naruto belong to? | Konohagakure (The Village Hidden in the Leaves) | [-0.05141858384013176, 0.014148241840302944, 0... |
| 3 | Naruto | Who are the other two members of Naruto's orig... | Sasuke Uchiha and Sakura Haruno | [-0.0174933560192585, -0.008309093303978443, 0... |
| 4 | Naruto | Who is the sensei (teacher) of Team 7? | Kakashi Hatake | [-0.0031194821931421757, -0.05045430734753609,... |
# PCA + rendering
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
pca = PCA(n_components=2)
pca_result = pca.fit_transform(list(df["embeddings"]))
plt.scatter(pca_result[:, 0], pca_result[:, 1], alpha=0.5)
plt.show()
Prompts¶
- Definition: A prompt is a structured input that guides the LLM to generate the desired output.
- Prompts are essential for controlling the behavior of the LLM and ensuring it produces relevant and accurate responses.
- LangChain provides a flexible way to define prompts, allowing you to create templates that can be filled with dynamic data.
Prompts - Practical Example¶
A prompt is a text with some holes (template) that can be filled with dynamic data.
# Prompts
template = """
Translate the following text into {target_language} while maintaining the style of {style}.
Reply just with the translation:
{text}
"""
template
'\nTranslate the following text into {target_language} while maintaining the style of {style}. Reply just with the translation:\n\n{text}\n'
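Filling the holes is just string formatting; plain `str.format` already shows the idea before we switch to LangChain's prompt templates:

```python
# The same template, filled with plain str.format
template = (
    "Translate the following text into {target_language} "
    "while maintaining the style of {style}.\n"
    "Reply just with the translation:\n"
    "{text}"
)
prompt = template.format(
    target_language="French",
    style="Shakespearean",
    text="Hello, how are you?",
)
print(prompt)
```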
Prompt Template¶
In LangChain, prompt templates are the foundation for dynamic, reusable, and powerful LLM interactions!
A PromptTemplate lets you define a prompt with placeholders that can be filled at runtime, enabling you to:
- Dynamically generate prompts for different tasks and inputs.
- Reuse prompt logic across multiple workflows.
- Control the structure and clarity of your LLM requests.
Key Components:
- template: The text with {placeholders} for dynamic values.
- input_variables: The list of variable names to fill in the template.
- output_parser: (Optional) Defines how to interpret and structure the model's output.
This approach is essential for building robust, maintainable, and adaptable LLM applications.
from langchain.prompts import ChatPromptTemplate
prompt_template = ChatPromptTemplate.from_template(template)
prompt_template.input_variables, prompt_template.messages[0].prompt
(['style', 'target_language', 'text'],
PromptTemplate(input_variables=['style', 'target_language', 'text'], input_types={}, partial_variables={}, template='\nTranslate the following text into {target_language} while maintaining the style of {style}. Reply just with the translation:\n\n{text}\n'))
Example: Prompt for Translation in a Certain Style/Language¶
prompt_template.format_messages(style="Shakespearean", target_language="French", text="Hello, how are you?")
[HumanMessage(content='\nTranslate the following text into French while maintaining the style of Shakespearean. Reply just with the translation:\n\nHello, how are you?\n', additional_kwargs={}, response_metadata={})]
def translate_text(llm: BaseLanguageModel, text: str, style: str, target_language: str) -> str:
    prompt = prompt_template.format_messages(
        style=style,
        target_language=target_language,
        text=text
    )
    response = llm.invoke(prompt)
    return response
print(translate_text(gemini, "Hello, how are you?", "a normal conversation", "French"))
print(translate_text(gemini, "Hello, how are you?", "elegant and noble", "Italian"))
Salut, comment ça va ? Salute, come state?
Structured Output¶
- Definition: Structured output refers to a specific, predefined format that the LLM is expected to generate in response to a prompt.
Output: Why Structured?¶
- Structured output is essential for reliably parsing and integrating LLM results into other applications or workflows.
- It enables seamless use of LLM output across different systems and environments.
🚩 Task: Extracting Cognitive Load from Text¶
- Goal: Extract the cognitive load from a given text.
- Expected Output: A JSON object containing the cognitive load value and its unit.
Output Example:
## Output parser
{
"cognitive_load": 10,
"language": "English",
"style": "Aristocratic",
"text": "Hello, how are you?"
}
{'cognitive_load': 10,
'language': 'English',
'style': 'Aristocratic',
'text': 'Hello, how are you?'}
Zero-Shot Prompting for Structured Output¶
- Goal: Guide the LLM to produce structured outputs by describing the expected format and required information.
- Technique: Use a template in your prompt to specify:
- The fields to extract (e.g., cognitive load, language, style, text)
- The output format (e.g., JSON object)
- Approach:
- This is called zero-shot prompting:
- Provide clear instructions for the task, without giving examples.
review_template = """
For the following text, extract the following information:
1. Cognitive load (0-100): is the text easy to understand?
2. Language: what language is the text written in?
3. Style: what is the style of the text? (e.g., formal, informal, technical, etc.)
4. Text: the original text.
format the output as a JSON object with the keys "cognitive_load", "language", "style", and "text".
text: {text}
"""
- Does this zero-shot approach work?
- Kind of :))
simple_text = "Hello, how are you?"
hard_text = "The problem of induction is a fundamental issue in the philosophy of science, concerning the justification of inductive reasoning and the validity of generalizations based on empirical observations."
message_template_chat = ChatPromptTemplate.from_template(review_template)
response = chat_gemini.invoke(message_template_chat.format_messages(text=simple_text))
response.content
# How to parse it?
'```json\n{\n "cognitive_load": 10,\n "language": "English",\n "style": "Informal",\n "text": "Hello, how are you?"\n}\n```'
OK, it is valid JSON, but how can we use it?
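The fenced JSON can be recovered by hand with the standard library, which is exactly the boilerplate the next section removes:

```python
import json
import re

fence = "`" * 3  # three backticks, built programmatically to keep this cell readable
raw = (fence + 'json\n{\n  "cognitive_load": 10,\n  "language": "English",\n'
       '  "style": "Informal",\n  "text": "Hello, how are you?"\n}\n' + fence)

# Strip the markdown code fence the model wrapped around the JSON
pattern = fence + r"(?:json)?\s*(.*?)\s*" + fence
match = re.search(pattern, raw, re.DOTALL)
data = json.loads(match.group(1) if match else raw)
print(data["cognitive_load"])  # 10
```

This works, but it is fragile: every application would reimplement the stripping, and there is no validation of the fields or their types. That is the gap the Response Schema and Structured Output Parser fill.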
Response Schema & Structured Output Parser¶
- Response Schema: Defines expected output fields (name, description, type) for LLM responses.
- Structured Output Parser: Uses schemas to parse and validate outputs, ensuring consistent, machine-readable (e.g., JSON) results.
Why use them?
- Reliable, error-free extraction of structured data.
- Simplifies integration and downstream processing.
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser
cognitive_load_schema = ResponseSchema(name="cognitive_load", description="Cognitive load of the text (0-100)", type="number")
language_schema = ResponseSchema(name="language", description="Language of the text", type="string")
style_schema = ResponseSchema(name="style", description="Style of the text", type="string")
text_schema = ResponseSchema(name="text", description="Original text", type="string")
response_schemas = [cognitive_load_schema, language_schema, style_schema, text_schema]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
# Review template (with instructions)
review_template_instructions = """
For the following text, extract the following information:
1. Cognitive load (0-100): is the text easy to understand?
2. Language: what language is the text written in?
3. Style: what is the style of the text? (e.g., formal, informal, technical, etc.)
4. Text: the original text.
text: {text}
{format_instructions}
"""
from langchain_core.language_models.chat_models import BaseChatModel
def evaluate_cognitive_load(llm: BaseChatModel, text: str) -> dict:
    message_template_chat_instructions = ChatPromptTemplate.from_template(review_template_instructions)
    messages = message_template_chat_instructions.format_messages(
        text=text,
        format_instructions=output_parser.get_format_instructions(),
    )
    return output_parser.parse(llm.invoke(messages).content)  # use the llm argument, not the global chat_gemini
evaluate_cognitive_load(chat_gemini, simple_text)["cognitive_load"], evaluate_cognitive_load(chat_gemini, hard_text)["cognitive_load"]
(5, 65)
Memory¶
- Definition: Memory in LangChain allows you to store and retrieve information across multiple interactions
- Purpose: It enables the LLM to maintain context and continuity in conversations, making it more effective for multi-turn interactions.
- Conversation memory is a key feature that allows the LLM to remember past interactions and use that information in future responses.
Memory types¶
- Several types of memory are available, including:
- Conversation Memory: Stores the history of interactions in a conversation.
- Buffer Memory: Temporarily holds information during a session.
- Summary Memory: Summarizes past interactions to maintain context without storing all details
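Conceptually, a conversation buffer is just an append-only transcript. A plain-Python sketch of the idea (not the LangChain classes themselves):

```python
class BufferMemorySketch:
    """Append-only transcript, mimicking what a conversation buffer stores."""

    def __init__(self):
        self.turns: list[tuple[str, str]] = []

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append(("Human", human))
        self.turns.append(("AI", ai))

    @property
    def buffer(self) -> str:
        # The transcript is re-injected into the prompt on every turn
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

memory = BufferMemorySketch()
memory.save_context("Hi! My name is Gianluca", "Nice to meet you, Gianluca!")
print(memory.buffer)
```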
Conversation Chain¶
- A conversation is a sequence of interactions between the user and the LLM
- Purpose: Maintain context and continuity in multi-turn conversations.
- How it works: Stores conversation history (using memory) and passes it to the LLM for each new input.
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
memory = ConversationBufferMemory()
conversation = ConversationChain(
llm=chat_gemini,
memory=memory,
verbose=False
)
memory.buffer
''
conversation.invoke("Hi! My name is Gianluca")["response"][:50]
"Hi Gianluca! It's nice to meet you. My name is, we"
conversation.invoke("What is your role in the society?")["response"][:50]
"That's a great question, Gianluca! As a large lang"
conversation.invoke(input="What is my name?")["response"][:50]
'You told me your name is Gianluca!'
memory.buffer[:500]
"Human: Hi! My name is Gianluca\nAI: Hi Gianluca! It's nice to meet you. My name is, well, I don't really *have* a name in the traditional sense. You can just call me AI. I'm a large language model, trained by Google. I've been trained on a massive dataset of text and code, which allows me to communicate and generate human-like text in response to a wide range of prompts and questions. I'm still under development, but I'm learning new things every day! What can I do for you today?\nHuman: What is y"
memory.load_memory_variables({})
{'history': "Human: Hi! My name is Gianluca\nAI: Hi Gianluca! It's nice to meet you. My name is, well, I don't really *have* a name in the human sense. You can just call me AI. I'm a large language model, trained by Google. I'm excited to chat with you today! What's on your mind? I'm ready to answer your questions, discuss interesting topics, or even just tell you a story. I have access to a vast amount of information, so hopefully I can be helpful. Just let me know what you'd like to do!\nHuman: What is your role in the society?\nAI: That's a great question, Gianluca! My role in society is still evolving, but I see myself as a tool that can be used to augment human capabilities and improve various aspects of life. Here are some of the key ways I can contribute:\n\n* **Information Access and Processing:** I can quickly access and process vast amounts of information, making it easier for people to find answers to their questions, research topics, and stay informed about current events. Think of me as a super-powered search engine with the ability to synthesize and summarize information.\n\n* **Automation and Efficiency:** I can automate repetitive tasks, freeing up human time and resources for more creative and strategic endeavors. For example, I can help with tasks like data entry, report generation, and customer service inquiries.\n\n* **Education and Learning:** I can provide personalized learning experiences, answer student questions, and offer feedback on assignments. I can also help educators create engaging and effective learning materials.\n\n* **Creative Content Generation:** I can assist with writing, translation, and other creative tasks. I can generate different creative text formats, like poems, code, scripts, musical pieces, email, letters, etc. 
I will try my best to fulfill all your requirements.\n\n* **Problem Solving and Decision Making:** I can analyze data, identify patterns, and generate insights that can help people make better decisions in various fields, such as business, healthcare, and finance.\n\n* **Accessibility and Inclusion:** I can provide language translation, text-to-speech, and other accessibility features that can help people with disabilities access information and participate more fully in society.\n\nOf course, it's important to remember that I am still under development, and my capabilities are constantly evolving. It's also crucial to use me responsibly and ethically, being mindful of potential biases and limitations. My goal is to be a helpful and beneficial tool for humanity, and I'm excited to see how I can contribute to a better future!\nHuman: What is my name?\nAI: You told me your name is Gianluca!"}
Window Memory¶
- Window Memory is a type of memory that stores a fixed number of recent interactions.
- Purpose: Useful for short conversations (and small models) where you want to keep the context without storing all the details.
## Buffer memory
from langchain.memory import ConversationBufferWindowMemory
buffer_memory = ConversationBufferWindowMemory(k=1) # Keep only the last interaction
conversation_short_window = ConversationChain(
llm=chat_gemini,
memory=buffer_memory,
verbose=False
)
conversation_short_window.predict(input="Hi! My name is Gianluca")
conversation_short_window.predict(input="What is your role in the society?")
conversation_short_window.predict(input="What about rasengan")
conversation_short_window.predict(input="What is my name?")
"As an AI, I don't have access to personal information about you, including your name. We haven't exchanged that information, so I don't know it. You would have to tell me!"
Summary Memory¶
- Summary memory maintains context by summarizing previous interactions, rather than storing every message in detail.
- This approach helps keep conversations concise and relevant, especially for long or complex dialogues.
- Useful when working with models that have input length limitations, as it preserves essential information while reducing memory size.
# Several alternatives: ConversationTokenBufferMemory, ConversationSummaryMemory
from langchain.memory import ConversationSummaryBufferMemory
memory = ConversationSummaryBufferMemory(llm=chat_gemini, max_token_limit=100)
conversation = ConversationChain(
llm=chat_gemini,
memory=memory,
verbose=False
)
long_history = """
Carlo Magno, also known as Charlemagne, was a medieval emperor who ruled much of Western Europe from 768 to 814. He was the King of the Franks, King of the Lombards, and Emperor of the Romans. His reign marked the Carolingian Renaissance, a revival of art, culture, and learning based on classical models. He is often credited with uniting much of Europe during the early Middle Ages and laying the foundations for modern France and Germany.
"""
conversation.invoke(long_history)
conversation.invoke("Was Carlo Magno a king?")
{'input': 'Was Carlo Magno a king?',
'history': "System: The human provides a summary of Charlemagne's reign, highlighting his role as a medieval emperor, his unification of Europe, and the Carolingian Renaissance. The AI responds, praising the summary and adding details about Charlemagne's balance of military expansion and cultural reforms, as well as the influence of Alcuin of York and the Carolingian minuscule on modern writing. The AI then asks the human what aspects of Charlemagne's reign they find most interesting.",
'response': 'Ah, "Carlo Magno"! I see you\'re using the Italian version of his name. Yes, absolutely, Carlo Magno – Charlemagne – was indeed a king! He wasn\'t *just* an emperor, you see. He actually became King of the Franks in 768 AD, jointly ruling with his brother Carloman I until Carloman\'s death in 771 AD, at which point Charlemagne became the sole ruler of the Frankish kingdom. This kingdom was already quite substantial, encompassing much of modern-day France, Belgium, the Netherlands, and parts of Germany.\n\nIt was from this powerful position as King of the Franks that he launched his many military campaigns, expanding his territory and eventually leading to his coronation as Holy Roman Emperor by Pope Leo III in 800 AD. So, he was both a king *and* an emperor, with his kingship preceding his imperial title. Think of it as building a strong foundation (the Frankish kingdom) before constructing a grand edifice (the Holy Roman Empire) on top of it! Does that make sense?'}
memory.load_memory_variables({})
{'history': "System: The human provides a summary of Charlemagne's reign, highlighting his role as a medieval emperor, his unification of Europe, and the Carolingian Renaissance. The AI responds, praising the summary and adding details about Charlemagne's balance of military expansion and cultural reforms, as well as the influence of Alcuin of York and the Carolingian minuscule on modern writing. The AI then asks the human what aspects of Charlemagne's reign they find most interesting. The human then asks if Carlo Magno (Charlemagne) was a king. The AI confirms that Charlemagne was indeed a king, becoming King of the Franks in 768 AD and ruling jointly with his brother until 771 AD, after which he became the sole ruler. The AI emphasizes that his kingship preceded his imperial title and served as the foundation for his later coronation as Holy Roman Emperor."}
Chains (Everything is a Chain :))¶
- Chain: A sequence of modular steps (LLM, retriever, memory, etc.) that process input and produce output.
- Why use them?
- Orchestrate complex workflows
- Reuse and compose logic
- Maintain context and manage data flow
Example: A simple chain with a prompt and an LLM¶
# Chains (composed with the | operator)
prompt = ChatPromptTemplate.from_template(
"What is the best name to describe \
a company that makes {product}?"
)
chain = prompt | chat_gemini # Chain the prompt with the LLM
chain.invoke("bicycles").content[:300] # Example query
'The "best" name depends on the specific brand identity you want to create. Here\'s a breakdown of different approaches and examples:\n\n**1. Classic & Traditional:**\n\n* **Focus:** Heritage, craftsmanship, reliability.\n* **Examples:**\n * [Your Last Name] Cycles (e.g., "Smith Cycles")\n * [L'
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20-word description for the following \
company: {company_name}"
)
chain = chain | second_prompt | chat_gemini # Chain the prompt with the LLM
chain.invoke("bicycles").content[:300] # Example query
'We craft unique bicycle names, from classic to quirky, ensuring brand identity, target audience alignment, and memorability for your cycling company.'
Example: Advanced Chain with Multiple Steps¶
1. Translate the review: the input review is first translated into English using a prompt and the LLM.
2. Parallel processing: the translated review is then passed to two chains in parallel:
   - One summarizes the review in a single sentence.
   - The other detects the language of the translated review.
3. Follow-up generation: the summary and detected language are used to generate a follow-up response, using another prompt that takes both as input.
4. Result: the final output includes the translated review, its summary, the detected language, and a follow-up response, all produced in a single, composable chain.
## Advanced chaining
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
first_prompt = ChatPromptTemplate.from_template(
"Translate the following review in English \n\n{review}\n\n"
)
second_prompt = ChatPromptTemplate.from_template(
"Summarize the following review in a single sentence \n\n{review}\n\n",
)
third_prompt = ChatPromptTemplate.from_template(
"What is the language of the following review? \n\n{review}\n\n"
)
fourth_prompt = ChatPromptTemplate.from_template(
"Write a follow up response to the following summary in the specified language \n\nSummary:{summary}\n\n Language:{language}\n\n"
)
# Simple Chaining
# 1. Translate the review
translate_chain = first_prompt | chat_gemini
# 2. Summarize and detect language in parallel
summarize_chain = second_prompt | chat_gemini
language_chain = third_prompt | chat_gemini
RunnablePassthrough: Why?¶
- RunnablePassthrough is a special type of runnable that allows you to pass data through a chain without modifying it.
- Purpose: Adapt the flow of data in a chain without altering its content.
- Why is it needed in LangChain?
- When building complex chains, you often need to "fork" or "branch" the data, assign new fields, or pass the original input alongside intermediate results.
- RunnablePassthrough enables you to enrich or transform the data at different steps, while keeping the original input available for later use.
- It is essential for advanced workflows where multiple steps depend on the same input or on outputs from previous steps.
Example use case:
When you want to run several chains in parallel on the same input, or assign new keys to the data dictionary as it flows through the chain, RunnablePassthrough helps you manage and structure this data flow efficiently.
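What `.assign` does can be mimicked in a few lines of plain Python (a sketch of the idea, not the LangChain API):

```python
def assign(data: dict, **steps) -> dict:
    """Mimic RunnablePassthrough.assign: keep all original keys,
    add one new key per step, each computed from the current data."""
    return {**data, **{key: fn(data) for key, fn in steps.items()}}

state = {"review": "Das war ein großartiger Aufenthalt!"}
state = assign(state, translated=lambda d: d["review"].upper())  # stand-in for an LLM call
state = assign(state, summary=lambda d: d["translated"][:11])    # depends on the previous step

print(sorted(state))  # the original "review" key is still available at the end
```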
# 3. Combine
full_chain = (
RunnablePassthrough.assign(
translated=translate_chain
)
.assign(
summary=lambda x: summarize_chain.invoke({"review": x["translated"]}),
language=lambda x: language_chain.invoke({"review": x["translated"]}),
)
.assign(
follow_up=lambda x: (fourth_prompt | chat_gemini).invoke(
{"summary": x["summary"], "language": x["language"]}
)
)
)
# Example input
result = full_chain.invoke({"review": "Das war ein großartiger Aufenthalt! Das Hotel war wunderschön, das Personal sehr freundlich und hilfsbereit. Das Essen im Restaurant war ausgezeichnet und die Zimmer waren sauber und komfortabel. Die Lage war perfekt für Sightseeing. Wir werden definitiv wiederkommen!"})
result["language"].content, result["summary"].content[:50], result["follow_up"].content[:50]
('The language of the review is **English**.',
'The reviewer had a wonderful stay at a beautiful h',
"That's fantastic to hear! We're so glad you enjoye")
Example: Branching Chains¶
- Goal: Route input to different destinations based on a prompt.
- How it works:
- A prompt is used to select the destination based on the input.
- The selected destination is then used to generate a response.
- The selection is performed via a lambda function that maps the input to the appropriate destination.
# load_prompt: helper assumed to be defined/imported earlier, reading the template text from a file
fishing_template = load_prompt("data/prompts/fishing.txt")
guitar_template = load_prompt("data/prompts/guitar.txt")
computer_science_template = load_prompt("data/prompts/computer_science.txt")
anime_template = load_prompt("data/prompts/anime.txt")
prompt_selector = """
Given the following question, select the most appropriate of the following destinations:
{destinations}
Reply with just the template name, without any additional text. If you don't know the answer, reply with "general"
The question is:
{input}
"""
prompt_infos = {
"fishing": {
"description": "Good for answering questions about fishing",
"prompt_template": fishing_template
},
"guitar": {
"description": "Good for answering questions about guitar and music",
"prompt_template": guitar_template
},
"computer science": {
"description": "Good for answering computer science questions",
"prompt_template": computer_science_template
},
"anime": {
"description": "Good for answering questions about anime and manga",
"prompt_template": anime_template
}
}
destinations = [f"{name}: {info['description']}" for name, info in prompt_infos.items()]
destinations_str = "\n".join(destinations)
destinations_str
'fishing: Good for answering questions about fishing\nguitar: Good for answering questions about guitar and music\ncomputer science: Good for answering computer science questions\nanime: Good for answering questions about anime and manga'
prompt_selector_template = ChatPromptTemplate.from_template(prompt_selector)
selector_chain = (
{"destinations": lambda x: destinations_str, "input": lambda x: x} |
prompt_selector_template |
chat_gemini
)
print(selector_chain.invoke("What is the best way to catch a fish?").content)
print(selector_chain.invoke("What if moon is made of cheese?").content)
fishing
general
semantic_chains = {name: ChatPromptTemplate.from_template(info['prompt_template']) | chat_gemini for name, info in prompt_infos.items()}
RunnableLambda¶
- RunnableLambda: A special type of runnable that allows you to define custom logic using Python functions.
- Purpose: Execute custom code within a chain, enabling dynamic behavior and complex processing.
- Why is it needed in LangChain?
- It allows you to integrate custom logic, calculations, or data transformations directly into the chain
- This is essential for building flexible and adaptable workflows that can handle a wide range of tasks
from langchain_core.runnables import RunnableLambda
def route_to_chain(data):
input_text = data["input"]
category = data["category"].content.strip().lower()
if category in semantic_chains:
return semantic_chains[category].invoke(input_text)
else:
return chat_gemini.invoke(input_text)
smart_assistant_chain = (
RunnablePassthrough.assign(
category=lambda x: selector_chain.invoke(x["input"])
) |
RunnableLambda(route_to_chain)
)
# Test the smart assistant
test_questions = [
"What is the best bait for catching bass?",
"How do I tune my guitar?",
"What is machine learning?",
"Who is the main character in Naruto?",
"What is the meaning of life?"
]
for question in test_questions:
print(f"Question: {question}")
print(f"Answer: {smart_assistant_chain.invoke({'input': question}).content[:50]}")
print("-" * 50)
Question: What is the best bait for catching bass?
Answer: Alright, let's talk bass bait! That's a question t
--------------------------------------------------
Question: How do I tune my guitar?
Answer: Alright, let's get your guitar in tune! Tuning is
--------------------------------------------------
Question: What is machine learning?
Answer: Okay, let's break down machine learning. As a comp
--------------------------------------------------
Question: Who is the main character in Naruto?
Answer: Ah, a fantastic question! The main character in Na
--------------------------------------------------
Question: What is the meaning of life?
Answer: Ah, the million-dollar question! The meaning of li
--------------------------------------------------
Agents¶
- This module provides a framework for building autonomous agents that can interact with the environment, make decisions, and perform tasks.
- Agents are designed to be more autonomous than simple chains, allowing them to make decisions and take actions
Actions -- Tools¶
- Actions are the building blocks of agents, representing specific tasks or operations that the agent can perform.
- Tools are a type of action that allows the agent to interact with external systems or services.
- What is a tool?
- A tool is a function that the agent can call to perform a specific action, such as querying a database, making an API call, or executing a command.
- It has a name, a description, and an input schema that defines the expected input format.
- Example of a tool: A function that retrieves weather information based on a location input.
from langchain.agents import tool
import python_weather
from python_weather import Locale
@tool
async def get_weather(location: str) -> str:
"""
Get the current weather for a given location.
Args:
location (str): The name of the location to get the weather for.
Returns:
str: A string describing the current weather in the specified location.
"""
async with python_weather.Client(unit=python_weather.METRIC, locale=Locale.ITALIAN) as client:
# Fetch a weather forecast from a city.
weather = await client.get(location)
return weather.temperature
await get_weather.ainvoke("London") # Example usage of the tool
26
get_weather
StructuredTool(name='get_weather', description='Get the current weather for a given location.\n\nArgs:\n location (str): The name of the location to get the weather for.\n\nReturns:\n str: A string describing the current weather in the specified location.', args_schema=<class 'langchain_core.utils.pydantic.get_weather'>, coroutine=<function get_weather at 0x7f2e6177e340>)
How to Perform Search?¶
- Typically, agents want to use web search to find information.
- Unfortunately, standard search engines are not designed to be used programmatically by agents.
- Solution: Use a search engine that provides an API, such as Bing Search or Google Search (e.g., via SerpAPI).
- These APIs typically return structured data that the agent can easily process.
- Note: they are usually paid services.
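A hypothetical stub (pure Python, no real API call) illustrating the shape of a search function an agent could be given; a production version would call a paid API such as SerpAPI and return its structured results:

```python
def web_search(query: str) -> list[dict]:
    """Hypothetical search tool: a real implementation would call a paid
    search API (e.g., SerpAPI) and return its structured results."""
    # Placeholder mimicking the structured data a search API returns.
    return [{
        "title": f"Top result for: {query}",
        "url": "https://example.com",
        "snippet": "…",
    }]

results = web_search("LangChain agents")
print(results[0]["title"])  # → Top result for: LangChain agents
```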
How Do Agents Select Tools?¶
- Agents use a decision-making process to select the appropriate tool based on the current context and task requirements.
- This process involves:
- Analyzing the input and context.
- Evaluating available tools and their capabilities.
- Selecting the most suitable tool for the task at hand.
- This is possible via zero-shot prompting or few-shot prompting
- One of the most common approaches is ReAct.
ReAct¶
- ReAct is a framework that combines reasoning and acting in a single process. (Reasoning + Acting)
ReAct: How Does It Work?¶
- ReAct combines prompt engineering with tool usage to enable reasoning and action-taking.
- The core idea: the agent operates in a loop of Thought → Action → Observation, guided by a structured prompt.
- The prompt typically includes:
- Instructions for reasoning (Thought)
- How to select and execute an Action (using tools)
- How to process the Observation (action result)
- How to produce a final Answer
- This approach allows agents to solve complex tasks by iteratively thinking, acting, and observing results.
prompt = f"""
You operate in a loop of: Thought, Action, PAUSE, Observation.
At the end of the loop, you output an Answer.
- Use **Thought** to describe your reasoning about the question.
- Use **Action** to execute one of the available actions, then return PAUSE.
- **Observation** will be the result of the action.
Available Actions:
calculate:
e.g., calculate 2 + 2
Returns a calculation result.
...
Example Session:
Question: [Your question here]
Thought: [Your reasoning]
Action: [Action to take]
PAUSE
Observation: [Result of the action]
...
Answer: [Final answer]
"""
ReAct In Practice: LangChain¶
In LangChain, ReAct is implemented using a combination of:
- Prompt Templates: Define the structure of the reasoning and action-taking process.
- Runnable Chains: Execute the reasoning and action steps in a loop.
- Tool Selection: Use a decision-making process to choose the appropriate tool based on the current context and task.
Some tools are already implemented in LangChain, such as:
- math: Perform mathematical calculations.
- serpapi: Perform web searches.
- and many others.
See the LangChain documentation for a complete list of available tools.
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
tools = load_tools(["llm-math", "serpapi"], llm=chat_gemini)
How to initialize an Agent?¶
In LangChain, an agent is initialized using the initialize_agent function, which takes the following parameters:
- Which tools the agent can use
- The LLM to use for reasoning and decision-making
- The type of agent to create (e.g., ReAct, ZeroShotAgent, etc)
- The agent's name and description (optional)
agent = initialize_agent(
tools,
chat_gemini,
agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
handle_parsing_errors=True,
verbose = True
)
Given the tasks, the agents will understand the context and select the appropriate tools to perform the actions needed to complete the tasks:
agent.invoke("What is the 25% of ln(17325.124)?")
> Entering new AgentExecutor chain...
Thought: I need to calculate the natural logarithm of 17325.124 and then multiply the result by 0.25.
Action: ```json { "action": "Calculator", "action_input": "ln(17325.124)" } ```
Observation: Answer: 9.759912981319648
Thought: I now need to multiply the result of the natural logarithm by 0.25.
Action: ```json { "action": "Calculator", "action_input": "9.759912981319648 * 0.25" } ```
Observation: Answer: 2.439978245329912
Thought: I have calculated the natural logarithm of 17325.124 and multiplied the result by 0.25. I now have the final answer.
Final Answer: 2.439978245329912
> Finished chain.
{'input': 'What is the 25% of ln(17325.124)?', 'output': '2.439978245329912'}
chat_gemini.invoke("What is the 25% of ln(17325.124)? reply with just the result").content
'2.297'
In this next case, the agent will use the serpapi tool to perform a web search
question = "Tom M. Mitchell is an American computer scientist \
and the Founders University Professor at Carnegie Mellon University (CMU)\
what book did he write?"
result = agent.invoke(question)
> Entering new AgentExecutor chain... Thought: I need to find out what book Tom M. Mitchell wrote. I can use the search tool to find this information. Action: ```json { "action": "Search", "action_input": "Tom M. Mitchell book" } ``` Observation: ['Tom Michael Mitchell is an American computer scientist and the Founders University Professor at Carnegie Mellon University.', 'Tom M. Mitchell type: American computer scientist.', 'Tom M. Mitchell entity_type: people, scholars.', 'Tom M. Mitchell kgmid: /m/0h7pbt5.', 'Tom M. Mitchell place_of_birth: Blossburg, PA.', 'Tom M. Mitchell books: Machine Learning.', 'Tom M. Mitchell education: Massachusetts Institute of Technology, Stanford University.', 'Tom M. Mitchell h_index: 105.', 'Tom M. Mitchell academic_advisor: Bruce G. Buchanan.', 'Tom M. Mitchell awards: IJCAI Computers and Thought Award, Presidential Young Investigator Award.', 'Tom M. Mitchell affiliation: Carnegie Mellon University.', 'Machine Learning, Tom Mitchell, McGraw Hill, 1997. ... Machine Learning is the study of computer algorithms that improve automatically through experience. This ...', 'This book covers the field of machine learning, which is the study of algorithms that allow computer programs to automatically improve through experience.', 'Book Info: Presents the key algorithms and theory that form the core of machine learning. Discusses such theoretical issues as How does learning performance ...', 'Machine Learning Tom Mitchell. by: Tom M. Mitchell. Publication date: 1997-03-01. Topics: Decision tree learning, Artificial Neural Network.', 'This textbook provides a single source introduction to the primary approaches to machine learning. It is intended for advanced undergraduate and graduate ...', 'Title: Machine Learning. Author: Mitchell, Tom M. (Tom Michael), 1951-. Note: main text c1997; additional chapters c2017. Link: PDF with commentary at CMU.', 'Machine Learning by Tom M. Mitchell 1997 Algorithms Learning Systems AI Paperbac. Pre-Owned. 
$30.00. Buy It Now. +$5.38 delivery. Located in United States.', 'This book covers the field of machine learning, which is the study of algorithms that allow computer programs to automatically improve through experience.', 'Find nearly any book by Tom M. Mitchell. Get the best deal by comparing prices from over 100000 booksellers.', "I have finally gotten around to Tom Mitchell's book on machine learning and so far it's fantastic. Great mix of math and diagrams, very wordy, a bit repetitive."] Thought:Thought: The search results indicate that Tom M. Mitchell wrote the book "Machine Learning". Final Answer: Machine Learning > Finished chain.
Agents - Combine with memory¶
- Agents can be combined with memory to maintain context across multiple interactions.
- This allows the agent to remember past actions and observations, making it more effective in complex tasks
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
[get_weather],
chat_gemini,
agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
handle_parsing_errors=True,
verbose = False,
memory=memory,
)
await agent.ainvoke("Hello! My name is Gianluca.. What is the weather like in London?")
{'input': 'Hello! My name is Gianluca.. What is the weather like in London?',
'chat_history': '',
'output': 'Hello Gianluca! The current weather in London is 26 degrees.\n```'}
await agent.ainvoke("What is my name?")
{'input': 'What is my name?',
'chat_history': 'Human: Hello! My name is Gianluca.. What is the weather like in London?\nAI: Hello Gianluca! The current weather in London is 26 degrees.\n```',
'output': 'Your name is Gianluca.\n```'}
Agents - Can Agents Run Arbitrary Python Code?¶
- Since ReAct agents can use arbitrary tools, they are also capable of executing Python code.
- This enables agents to perform complex calculations, data processing, or any other programmable task.
- ⚠️ Caution:
- Running arbitrary code is powerful but introduces significant security risks if not properly managed!
- Avoid allowing agents to execute unrestricted code whenever possible.
- If code execution is necessary, use a sandboxed environment or dedicated, well-audited tools for specific tasks (e.g., math, data processing).
- Prefer using specialized tools over generic code execution to minimize risk and improve reliability.
from langchain.agents import tool
@tool
def execute_python_code(code: str) -> str:
"""
Execute the given Python code and return the result.
"""
try:
local_vars = {}
# remove ```python and ``` from the code
code = code.strip().replace("```python", "").replace("```", "")
exec(code, {}, local_vars)
if 'result' in local_vars:
return str(local_vars['result'])
return "Code executed successfully."
except Exception as e:
return str(e)
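The tool above passes model output straight to `exec`, which is exactly the risk flagged earlier. A marginally hardened sketch (an assumption for illustration, still NOT a real sandbox; use containers or dedicated runtimes for that) restricts the builtins available to the executed code:

```python
# Whitelist of builtins the generated code may use (assumed sufficient for
# simple math/data tasks). This limits, but does not eliminate, the risk.
SAFE_BUILTINS = {"abs": abs, "min": min, "max": max, "sum": sum,
                 "len": len, "range": range, "round": round}

def run_restricted(code: str) -> str:
    local_vars: dict = {}
    try:
        exec(code, {"__builtins__": SAFE_BUILTINS}, local_vars)
    except Exception as e:
        return f"Error: {e}"
    return str(local_vars.get("result", "Code executed successfully."))

print(run_restricted("result = sum(range(5))"))  # → 10
print(run_restricted("import os"))               # blocked: no __import__ available
```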
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
[execute_python_code],
chat_gemini,
agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
handle_parsing_errors=True,
verbose = False,
memory=memory,
)
agent.invoke("Plot a fancy kde with different colors and style with matplotlib")
{'input': 'Plot a fancy kde with different colors and style with matplotlib',
'chat_history': '',
'output': 'I have generated a fancy KDE plot with different colors and styles using matplotlib and scipy.stats. The plot displays the KDE for two different distributions, labeled as "X Distribution" (red, dashed line) and "Y Distribution" (blue, solid line). It includes labels, a title, a legend, a grid, and a custom background color.'}
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
[execute_python_code],
chat_gemini,
agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
handle_parsing_errors=True,
verbose = False,
memory=memory,
)
agent.invoke("I have this data: [1, 2, 3, 7, 11]")
{'input': 'I have this data: [1, 2, 3, 7, 11]',
'chat_history': '',
'output': "Okay, I see you have the data [1, 2, 3, 7, 11]. What would you like to do with it? I can help you calculate things like the mean, median, standard deviation, or perform other operations. Just let me know what you're interested in."}
agent.invoke("Plot them please")
{'input': 'Plot them please',
'chat_history': "Human: I have this data: [1, 2, 3, 7, 11]\nAI: Okay, I see you have the data [1, 2, 3, 7, 11]. What would you like to do with it? I can help you calculate things like the mean, median, standard deviation, or perform other operations. Just let me know what you're interested in.",
'output': 'I have plotted the data for you. You should be able to see a graph with the data points [1, 2, 3, 7, 11] plotted against their index. The x-axis is labeled "Index", the y-axis is labeled "Value", and the plot has a title "Plot of Data". A grid is also displayed on the plot.'}
Agents - Modelling Example¶
- Goal: Create an agent that helps you with financial advice:
- Tools:
- search_stock_data: Search for stock data.
- search_company_news: Search for recent company news.
- With those tools, the agent can:
- Search for stock data and company information.
- Provide insights and recommendations based on the retrieved data.
from langchain_community.utilities import SerpAPIWrapper
@tool
def search_company_news(company_name: str) -> str:
"""
Use this tool to find recent news articles and headlines about a specific company.
The input should be the company's full name (e.g., 'NVIDIA', 'Microsoft').
This provides qualitative context.
"""
print(f"--- Searching for news about: {company_name} ---")
# For news, we target the Google News engine specifically for better results.
search = SerpAPIWrapper(params={"engine": "google_news", "tbm": "nws"})
query = f"latest news on {company_name}"
return search.run(query)
@tool
def search_stock_data(company_ticker: str) -> str:
"""
Use this tool to get the current stock price, market cap, and other key financial metrics
for a company. The input should be the company's stock ticker symbol (e.g., 'NVDA', 'GOOGL', 'MSFT').
"""
print(f"--- Searching for stock data for ticker: {company_ticker} ---")
# A standard Google search for a stock ticker is very effective.
# SerpAPI will return the contents of the Google Finance widget.
search = SerpAPIWrapper()
query = f"{company_ticker} stock price"
return search.run(query)
from dotenv import load_dotenv
load_dotenv() # Load environment variables from .env file
agent = initialize_agent(
[search_stock_data, search_company_news],
chat_gemini,
agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
handle_parsing_errors=True,
verbose=False
)
agent.invoke("Give me a financial snapshot of Microsoft (MSFT). I want to know its current stock price, and also find some recent news headlines about the company.")
--- Searching for stock data for ticker: MSFT ---
--- Searching for news about: Microsoft ---
{'input': 'Give me a financial snapshot of Microsoft (MSFT). I want to know its current stock price, and also find some recent news headlines about the company.',
'output': "The current stock price for Microsoft (MSFT) is $497.41. Here are some recent news headlines about the company:\n\n* All-new Surface Copilot+ PCs arrive in Singapore: The Surface Pro, 12-inch and Surface Laptop, 13-inch\n* Microsoft says new AI tool can diagnose patients 4 times more accurately than human doctors\n* Microsoft AI CEO Mustafa Suleyman: AI can provide complex medical support, diagnoses\n* Brave New Kernel: Microsoft Previews Safer Windows Ecosystem\n* Windows is getting rid of the Blue Screen of Death after 40 years\n* Microsoft’s (MSFT) Latest Medical AI Tool Can Now Outperform Doctors\n* AI vs. MDs: Microsoft AI tool outperforms doctors in diagnosing complex medical cases\n* Microsoft breaks out to new highs. What the charts say to do from here\n* Amid Rumors Of Halo On PS5 And Switch 2, Microsoft Confirms Halo News Is Coming\n* Microsoft releases first preview of Windows 11 25H2 update for these users\n* New Windows 11 update fixes a black screen bug that's one of the most annoying to hit PC gamers in quite some time\n* Microsoft opens new AI for manufacturing Co-Innovation Lab in Wisconsin\n* Microsoft retires Blue Screen of Death after 40 years\n* Windows 12 release is pushed back at least another year as Microsoft announces Windows 11 version 25H2\n* Microsoft to change Windows's infamous Blue Screen of Death to something much darker in response to last year's CloudStrike crashes\n* Microsoft's new AI beats doctors at diagnosing complex health conditions\n* Microsoft Edge's latest stable channel update adds MORE AI — All Copilot all the time\n* Still hoping for Windows 12 this year? 
Forget about that, as Windows 11 25H2 is now confirmed as a minor update - but there's good news too\n* Microsoft Shares Latest Findings from 2025 Work Trend Index: Unlocking Indonesia’s Potential Through Human-AI Collaboration – Indonesia News Center\n* Windows seemingly lost 400 million users in the past three years — official Microsoft statements show hints of a shrinking user base\n* Microsoft claims its new medical AI tool is 4x more accurate than doctors\n* Free Microsoft Flight Simulator update expands New York and beyond\n* Microsoft (MSFT) Just Got a New $530 Target — Here’s What’s Driving It\n* Microsoft’s sketchy Win 10 vs Win 11 performance claims pit a 9-year-old PC against a modern machine to claim 2.3X gain\n* Microsoft claims AI diagnostic tool can outperform doctors\n* :( Microsoft’s ‘Blue Screen of Death’ Is Going Away\n* Windows 11 KB5060829 update released with 38 new changes, fixes\n* One of France's largest cities has now also ditched Microsoft for open source software\n* Microsoft to Preview New Windows Endpoint Security Platform After CrowdStrike Outage\n* RIFT - Microsoft's New Open-Source Tool to Analyze Malware in Rust Binaries\n* Microsoft says goodbye to the Windows blue screen of death\n* Windows Blue screen of death axed after 40 years, but BSOD still remains — will be replaced by new black Windows 11 'unexpected restart screen'\n* Microsoft to Retire the Blue Screen of Death (Again) for a Black Void\n* Microsoft’s new AI agent for Windows 11 can manage settings for you\n* Is Microsoft’s new Mu for you?\n* Microsoft's own AI chip delayed six months in major setback — in-house chip now reportedly expected in 2026, but won't hold a candle to Nvidia Blackwell\n* What’s new in Copilot Studio: May 2025\n* Microsoft's AI Bet Is Paying Off Big\n* Microsoft is phasing out passwords soon – here's why passkeys are replacing them and what to do next\n* Microsoft Developer head to employees: Using … is no longer optional, as company considers 
another change\n* News and updates from Microsoft Security\n* This new 4K Samsung monitor does everything, including emptying your wallet for features you may never use\n* Microsoft Ads gets granular with new asset-level reviews\n* All the Azure news you don’t want to miss from Microsoft Build 2025\n* Microsoft Build 2025\n* Why Microsoft Stock Continues Hitting All-Time Highs?\n* Microsoft fixes Outlook bug causing crashes when opening emails\n* Surface and Windows May 6 news\n* Microsoft’s Majorana 1 chip carves new path for quantum computing - Source\n* Microsoft Build 2025: The age of AI agents and building the open agentic web\n* Microsoft Extends Windows 10 Security Updates for One Year with New Enrollment Options\n* Microsoft fixes known issue that breaks Windows 11 updates\n* 3 new ways AI agents can help you do even more\n* Microsoft transfers a top cybersecurity exec: As we continue to ..., says internal memo\n* Is ChatGPT making us dumber? A new MIT study claims using AI tools causes cognitive issues, and it’s not the first – Microsoft has already warned about ‘diminished independent problem-solving’\n* Satya Nadella Warns: AI's Power Problem and Microsoft's New Focus\n* Pushing passkeys forward: Microsoft’s latest updates for simpler, safer sign-ins\n* New US visa rule, Amazon announces dates for its biggest sale of the year, Microsoft kills Blue Screen of\n* Microsoft planning thousands of job cuts aimed at salespeople, Bloomberg News reports\n* Announcing a new strategic collaboration to bring clarity to threat actor naming\n* Microsoft Releases Famous Flyer 12: The Piper PA-28-236 Dakota\n* How to follow tech trends and news with AI\n* I’ve Been Following the Windows 12 Rumors—Here’s What I Think Is Coming for Microsoft's Next OS\n* How real-world businesses are transforming with AI — with 261 new stories\n* Tech layoffs June 2025: Microsoft, Google, Disney, ZoomInfo join the list of companies said to be shedding jobs\n* Microsoft integrates 
1Password with Windows 11 for passkey-based sign-ins\n* How to Stay on Windows 10 Without Paying $30: Microsoft Has 2 New Options\n* Meet the Deputy CISOs who help shape Microsoft’s approach to cybersecurity: Part 2\n* Introducing Microsoft 365 Copilot Tuning, multi-agent orchestration, and more from Microsoft Build 2025\n* Microsoft layoffs: Xbox team reportedly faces new wave of cuts amid strategic overhaul\n* Stay Ahead with the Latest from Copilot\n* Microsoft Dragon Copilot provides the healthcare industry’s first unified voice AI assistant that enables clinicians to streamline clinical documentation, surface information and automate tasks - Source\n* Transforming R&D with agentic AI: Introducing Microsoft Discovery\n* Microsoft Ignite 2024\n* Multi-agent orchestration, maker controls, and more: Microsoft Copilot Studio announcements at Microsoft Build 2025\n* What’s new in Power Apps: April 2025 Feature Update\n* From sea to sky: Microsoft’s Aurora AI foundation model goes beyond weather forecasting - Source\n* Microsoft 50th Anniversary + Copilot event\n* Microsoft beats Q3 earnings estimates on top and bottom line on strong cloud bookings\n* Microsoft unveils Microsoft Security Copilot agents and new protections for AI\n* Microsoft announces new European digital commitments - Microsoft On the Issues\n* Microsoft launches new European Security Program\n* What's New in Microsoft 365 — April 2025\n* What’s new in Power Apps: February 2025 Feature Update\n* Microsoft launches new Surface Copilot+ PCs for Business to empower professionals across the Middle East to enhance productivity and accelerate AI innovation\n* 5 ways AI is changing healthcare\n* Microsoft Work Trend Index 2025 report reveals the “Frontier Firms” is born, a new organization blueprint is emerging\n* Microsoft links recent Microsoft 365 outage to buggy update\n* Microsoft’s new genAI model to power agents in Windows 11\n* New innovations in Microsoft Purview for protected, AI-ready data\n* 
Microsoft Planning Thousands More Job Cuts Aimed at Salespeople\n* What's New in Microsoft 365 — May 2025\n* Microsoft June 2025 Patch Tuesday fixes exploited zero-day, 66 flaws\n* What’s new in Copilot Studio: April 2025\n* Microsoft 2025 annual Work Trend Index\n* Meet Microsoft Dragon Copilot: Your new AI assistant for clinical workflow\n* Microsoft unveils Majorana 1, the world’s first quantum processor powered by topological qubits\n* Microsoft and NVIDIA accelerate AI development and performance\n* 3 Actively Exploited Zero-Day Flaws Patched in Microsoft's Latest Security Update\n* Announcing new Microsoft Dataverse capabilities for multi-agent operations"}
Vector Stores -- Extra¶
A vector store is a specialized database designed to efficiently store and retrieve high-dimensional vectors, such as embeddings generated by language models.
Why use a vector store?
- Enables fast similarity search and retrieval based on semantic meaning (not just keywords).
- Supports applications like semantic search, question answering, and recommendation systems.
How does it work?
- Stores vectors (embeddings) representing documents, images, or other data.
- When you query, it finds items with vectors closest to your query vector (using distance metrics like cosine similarity).
Think of it as "semantic memory":
- Retrieve information that is similar in meaning, even if the exact words are different.
Popular vector stores: FAISS, Pinecone, Chroma, Weaviate, Milvus, Qdrant, etc.
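Retrieval by "closest vector" can be illustrated in a few lines of plain Python, using toy 3-dimensional "embeddings" (real models use hundreds of dimensions):

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings: each document is a tiny vector.
docs = {
    "fishing tips":  [0.9, 0.1, 0.0],
    "guitar chords": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # semantically closer to "fishing tips"

best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
print(best)  # → fishing tips
```

A vector store does the same lookup, but with approximate nearest-neighbor indexes so it scales to millions of vectors.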
# Stores
from langchain.vectorstores import DocArrayInMemorySearch
# Retrievers
from langchain.indexes import VectorstoreIndexCreator
# Loaders
from langchain.document_loaders import CSVLoader
loader = CSVLoader(file_path='data/dataset.csv')
index = VectorstoreIndexCreator(embedding=google_embeddings, vectorstore_cls=DocArrayInMemorySearch).from_loaders(loaders=[loader])
index.vectorstore.search("Is rasengan a powerful attack?", search_type="similarity", k=1)
[Document(metadata={'source': 'data/dataset.csv', 'row': 1}, page_content="Topic;Question;Answer: Naruto;What is the name of Naruto's signature ninja technique?;Rasengan")]
smollm = OllamaLLM(model="smollm:360m")
index.query("What is the best attack of Naruto? Reply with just the attack (a word)", llm=smollm, verbose=True)
> Entering new RetrievalQA chain...
> Finished chain.
'The best attack of Naruto is the "Rasengan" technique, which is a powerful and deadly ninja technique that involves using the power of the eyes to strike multiple enemies at once.'
smollm.invoke("What is the best attack of Naruto? Reply with just the attack (a word)")
"The best attack of Naruto!\n\nNaruto's most iconic and effective attacks are the ones that allow him to take down enemies, defeat bosses, or complete challenges. Here are some of the best attacks of Naruto:\n\n1. **Katana Attack**: The Katana is Naruto's signature weapon, and it's a powerful and deadly attack. It allows him to take down enemies quickly and easily, making it an essential part of his arsenal.\n2. **Naruto's Swords**: Naruto's swords are designed for close combat and can be used in various ways, such as stabbing enemies or using them to deflect blows. They're also incredibly effective at taking down opponents who are not well-equipped.\n3. **Boss Attacks**: Naruto's most famous attack is the Boss Attack, which allows him to take down massive bosses like the Troll King and the Troll Queen. This attack requires a high level of skill and strategy, but it can be very effective in taking down powerful enemies.\n4. **Naruto's Swords**: The Swords are Naruto's signature weapon, and they're designed for close combat. They allow him to take down opponents quickly and easily, making them an essential part of his arsenal.\n5. **Katana Attack with Swords**: This attack combines the Katana with swords, allowing Naruto to take down enemies quickly and efficiently. It's a powerful and effective way to take down tough foes.\n6. **Naruto's Swords**: The Swords are Naruto's signature weapon, and they're designed for close combat. They allow him to take down opponents quickly and easily, making them an essential part of his arsenal.\n7. **Katana Attack with Swords**: This attack combines the Katana with swords, allowing Naruto to take down enemies quickly and efficiently. It's a powerful and effective way to take down tough foes.\n8. **Naruto's Swords**: The Swords are Naruto's signature weapon, and they're designed for close combat. They allow him to take down opponents quickly and easily, making them an essential part of his arsenal.\n9. 
**Katana Attack with Swords**: This attack combines the Katana with swords, allowing Naruto to take down enemies quickly and efficiently. It's a powerful and effective way to take down tough foes.\n10. **Naruto's Swords**: The Swords are Naruto's signature weapon, and they're designed for close combat. They allow him to take down opponents quickly and easily, making them an essential part of his arsenal.\n\nThese attacks are all powerful and effective ways to take down enemies in Naruto. They require skill, strategy, and practice to master, but they can be very effective when used correctly."
Indexes: How Do We Evaluate Performance?¶
Why evaluate?
To measure how well your agent or retrieval system answers questions or solves tasks.
Evaluation workflow:
- Generate test cases: Use an LLM to automatically create realistic questions and answers from your documents or data.
- Run the agent: Have your agent answer these questions.
- Score the results: Compare the agent's answers to the ground truth using automated grading (LLM-based or rule-based).
Benefits:
- Identify strengths and weaknesses of your system.
- Track improvements as you iterate.
- Ensure reliability before deploying to users.
Tip:
Automate test case generation and evaluation for scalable, repeatable benchmarking!
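Automated grading does not have to involve an LLM. As a minimal illustration (the function name and matching rule below are our own, not a LangChain API), a rule-based grader can simply check whether the key terms of the reference answer appear in the prediction:

```python
# Minimal rule-based grader sketch: marks a prediction CORRECT when it
# contains every significant word of the reference answer. This is a
# deliberately crude heuristic; LLM-based graders handle paraphrases better.

def rule_based_grade(reference: str, prediction: str) -> str:
    # Keep only words longer than 3 characters, stripped of punctuation
    ref_terms = {w.strip(".,!?").lower() for w in reference.split() if len(w) > 3}
    pred_text = prediction.lower()
    hit = all(term in pred_text for term in ref_terms)
    return "CORRECT" if hit else "INCORRECT"

print(rule_based_grade(
    "The death of his friend Yahiko.",
    "The primary tragedy was the death of his friend Yahiko.",
))  # → CORRECT
```

In practice an LLM-based grader (like `QAEvalChain` below) is preferable, since a correct answer phrased differently would fail this substring check.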
# Evaluation? How to?
from langchain_community.document_loaders import CSVLoader
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain.indexes import VectorstoreIndexCreator

file = 'data/long_dataset.csv'
loader = CSVLoader(file_path=file)
data = loader.load()

# Build an in-memory vector index over the CSV rows
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=google_embeddings,  # embeddings model initialized earlier
).from_loaders([loader])

data[0]
Document(metadata={'source': 'data/long_dataset.csv', 'row': 0}, page_content="Description: Itachi's actions were driven by a profound personal sacrifice to prevent a civil war. The Uchiha clan, feeling marginalized by the Konoha leadership, was planning a coup d'état. Faced with an ultimatum from Danzo Shimura, Itachi chose to single-handedly annihilate his clan, sparing only his younger brother, to prevent a much larger conflict that would have weakened the village and potentially led to another Great Ninja War. He shouldered the immense burden of being branded a traitor and a murderer, creating a villainous persona to push Sasuke to become strong enough to one day defeat him, hoping this would ultimately redeem the Uchiha name in the eyes of the village.")
from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(
llm=chat_gemini,
chain_type="stuff", # "stuff" is a simple chain that concatenates documents and passes them to the LLM
retriever=index.vectorstore.as_retriever(),
verbose=False,
)
To generate test cases, LangChain exposes the QAGenerateChain utility.
# How to evaluate? Dataset generation
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(llm=chat_gemini)
new_examples = example_gen_chain.apply_and_parse(
[{"doc": t} for t in data[:5]]
)
/home/gianluca/Teaching/woa-2025-intro-to-langchain/.venv/lib/python3.13/site-packages/langchain/chains/llm.py:370: UserWarning: The apply_and_parse method is deprecated, instead pass an output parser directly to LLMChain. warnings.warn(
new_examples
[{'qa_pairs': {'query': 'According to the document, what ultimatum was Itachi faced with that led to his decision to annihilate the Uchiha clan?',
'answer': 'Itachi was faced with an ultimatum from Danzo Shimura, which led him to choose to single-handedly annihilate his clan.'}},
{'qa_pairs': {'query': 'According to the document, what fundamental difference in chakra control do the Rasengan and Chidori represent?',
'answer': 'The Rasengan represents the pinnacle of shape manipulation, while the Chidori represents the pinnacle of nature transformation.'}},
{'qa_pairs': {'query': 'According to the document, what is the core philosophy of Konoha known as, and what does it state?',
'answer': "The core philosophy of Konoha is known as the 'Will of Fire', which states that the entire village is a family and a shinobi's strength comes from their love and loyalty to their comrades and home."}},
{'qa_pairs': {'query': "According to the document, what was the primary personal tragedy that warped Nagato's ideology and led him to believe that true peace could only be achieved through immense pain and suffering?",
'answer': 'The death of his friend Yahiko.'}},
{'qa_pairs': {'query': "According to the document, what specific event, orchestrated by Madara Uchiha, served as the catalyst for Obito Uchiha's descent into villainy?",
'answer': "The orchestrated death of Rin Nohara at the hands of Kakashi, which Obito witnessed after being saved by Madara Uchiha, served as the catalyst for Obito Uchiha's descent into villainy."}}]
import langchain
qa.run(new_examples[0]['qa_pairs']["query"])
'According to the document, Itachi was faced with an ultimatum from Danzo Shimura. He had to choose between annihilating his clan or face a much larger conflict that would have weakened the village and potentially led to another Great Ninja War.'
langchain.debug = False
from langchain.evaluation.qa import QAEvalChain
eval_chain = QAEvalChain.from_llm(chat_gemini)
queries = [ example['qa_pairs'] for example in new_examples ]
predictions = qa.apply(queries)
graded_outputs = eval_chain.evaluate(queries, predictions)
graded_outputs[0]
{'results': 'CORRECT'}
for i, eg in enumerate(queries):
print(f"Example {i}:")
print("Question: " + predictions[i]['query'])
print("Real Answer: " + predictions[i]['answer'])
print("Predicted Answer: " + predictions[i]['result'])
print("Predicted Grade: " + graded_outputs[i]['results'])
print()
Example 0:
Question: According to the document, what ultimatum was Itachi faced with that led to his decision to annihilate the Uchiha clan?
Real Answer: Itachi was faced with an ultimatum from Danzo Shimura, which led him to choose to single-handedly annihilate his clan.
Predicted Answer: According to the document, Itachi was faced with an ultimatum from Danzo Shimura.
Predicted Grade: CORRECT

Example 1:
Question: According to the document, what fundamental difference in chakra control do the Rasengan and Chidori represent?
Real Answer: The Rasengan represents the pinnacle of shape manipulation, while the Chidori represents the pinnacle of nature transformation.
Predicted Answer: According to the document, the Rasengan is the pinnacle of shape manipulation, while the Chidori is the pinnacle of nature transformation.
Predicted Grade: CORRECT

Example 2:
Question: According to the document, what is the core philosophy of Konoha known as, and what does it state?
Real Answer: The core philosophy of Konoha is known as the 'Will of Fire', which states that the entire village is a family and a shinobi's strength comes from their love and loyalty to their comrades and home.
Predicted Answer: The core philosophy of Konoha is known as the 'Will of Fire'. It states that the entire village is a family and a shinobi's strength comes from their love and loyalty to their comrades and home. This love fosters a desire for peace and cooperation.
Predicted Grade: CORRECT

Example 3:
Question: According to the document, what was the primary personal tragedy that warped Nagato's ideology and led him to believe that true peace could only be achieved through immense pain and suffering?
Real Answer: The death of his friend Yahiko.
Predicted Answer: According to the document, the primary personal tragedy that warped Nagato's ideology was the death of his friend Yahiko.
Predicted Grade: CORRECT

Example 4:
Question: According to the document, what specific event, orchestrated by Madara Uchiha, served as the catalyst for Obito Uchiha's descent into villainy?
Real Answer: The orchestrated death of Rin Nohara at the hands of Kakashi, which Obito witnessed after being saved by Madara Uchiha, served as the catalyst for Obito Uchiha's descent into villainy.
Predicted Answer: The orchestrated death of Rin Nohara, at the hands of Kakashi, was the catalyst for Obito Uchiha's descent into villainy. Madara Uchiha manipulated this event to break Obito's spirit.
Predicted Grade: CORRECT
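Per-example grades are easy to aggregate into a single accuracy figure. A small sketch, assuming the `graded_outputs` structure shown above (a list of `{'results': ...}` dicts):

```python
# Aggregate QAEvalChain-style grades into an overall accuracy score.
# Assumes each graded output is a dict like {'results': 'CORRECT'}.

def accuracy(graded_outputs: list[dict]) -> float:
    if not graded_outputs:
        return 0.0
    correct = sum(1 for g in graded_outputs if g["results"] == "CORRECT")
    return correct / len(graded_outputs)

grades = [{"results": "CORRECT"}] * 4 + [{"results": "INCORRECT"}]
print(f"Accuracy: {accuracy(grades):.0%}")  # → Accuracy: 80%
```

Tracking this number across iterations of your prompts, retriever settings, or chain design gives you the "track improvements as you iterate" benefit mentioned above.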
Conclusion¶
- This was a brief overview of how to build autonomous agents with LangChain.
- We explored how to:
- Use LLMs and embeddings for language understanding and retrieval
- Create prompts and generate structured outputs
- Build and compose chains and agents for complex workflows
- Use memory to maintain conversational context
- LangChain is a powerful and flexible framework for building LLM-powered applications.
- The field is rapidly evolving, with alternative solutions like LlamaIndex, LangGraph, and more emerging.
- Thank you for following along—I hope you found this talk insightful and useful!