LCEL Introduction
LCEL (LangChain Expression Language) is a workflow orchestration tool that lets you compose complex task chains from basic components, with out-of-the-box support for features such as streaming, parallel execution, and logging.
Basic Example: Prompt + Model + Output Parser
In this example, we will demonstrate how to use LCEL (LangChain Expression Language) to link three components (prompt template, model, and output parser) into a complete workflow that implements the task of "telling jokes." The code shows how to create a chain, how to connect the components with the pipe symbol |, and introduces the role of each component along with the output results.
First, let's see how to connect the prompt template and the model to generate a joke about a specific topic:
Install dependencies
%pip install --upgrade --quiet langchain-core langchain-community langchain-openai
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
model = ChatOpenAI(model="gpt-4")
output_parser = StrOutputParser()

# Compose the three components into a chain with the pipe operator
chain = prompt | model | output_parser
chain.invoke({"topic": "ice cream"})
Output
"Why don't parties invite ice cream? Because it melts when it's hot!"
In this code, we use LCEL to connect different components into a chain:
chain = prompt | model | output_parser
The | symbol here is similar to the Unix pipe operator: it connects the components together, passing the output of one component as the input to the next.
In this chain, user input is passed to the prompt template, then the output of the prompt template is passed to the model, and finally the output of the model is passed to the output parser. Let's take a look at each component separately to better understand what's happening.
1. Prompt
prompt is a BasePromptTemplate, which accepts a dictionary of template variables and produces a PromptValue. A PromptValue is a wrapper around the completed prompt that can be passed either to an LLM (which takes a string as input) or to a ChatModel (which takes a sequence of messages as input). It can work with either type of language model because it defines logic both for producing BaseMessages and for producing a string.
prompt_value = prompt.invoke({"topic": "ice cream"})
prompt_value
Output
ChatPromptValue(messages=[HumanMessage(content='Tell me a joke about ice cream')])
Below, we convert the prompt-formatted result into the message format used by chat models:
prompt_value.to_messages()
Output
[HumanMessage(content='Tell me a joke about ice cream')]
It can also be directly converted to a string:
prompt_value.to_string()
Output
'Human: Tell me a joke about ice cream'
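As a side note (a sketch, not part of the original example): ChatPromptTemplate can also be built from an explicit list of role/template pairs, which is handy when you want a system message alongside the human turn. The system text and the chat_prompt name below are made up for illustration:
from langchain_core.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a stand-up comedian."),  # illustrative system message
    ("human", "Tell me a joke about {topic}"),
])
chat_prompt.invoke({"topic": "ice cream"})  # a ChatPromptValue with two messages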
2. Model
Next, the PromptValue is passed to model. In this example, our model is a ChatModel, which means it will output a BaseMessage.
Try calling the model directly:
message = model.invoke(prompt_value)
message
Output
AIMessage(content="Why is ice cream never invited to parties?\n\nBecause they're always a drip when things heat up!")
If our model were defined as an LLM type instead, it would output a string:
from langchain_openai.llms import OpenAI
llm = OpenAI(model="gpt-3.5-turbo-instruct")
llm.invoke(prompt_value)
Output
'\n\nBot: Why did the ice cream truck break down? Because it went through a meltdown!'
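Since an LLM is also a runnable component, it can be composed into the same kind of chain. A minimal sketch (llm_chain is a name introduced here for illustration; the PromptValue is converted to a string for the LLM, and StrOutputParser passes the resulting string through):
llm_chain = prompt | llm | output_parser
llm_chain.invoke({"topic": "ice cream"})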
3. Output Parser
Finally, we pass the output of our model to the output_parser, which is a BaseOutputParser, meaning it accepts either a string or a BaseMessage as input. StrOutputParser specifically converts any input into a plain string.
output_parser.invoke(message)
Output
"Why is ice cream never invited to parties?\n\nBecause they're always a drip when things heat up!"
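As a quick check (not part of the original example), StrOutputParser passes plain strings through unchanged, which is why it works at the end of both ChatModel and LLM chains:
output_parser.invoke("already a string")  # returns 'already a string'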
4. The Entire Process
The execution process is as follows:
- Call chain.invoke({"topic": "ice cream"}), which starts the workflow we defined, passing the parameter {"topic": "ice cream"} to generate a joke about "ice cream."
- The call parameter {"topic": "ice cream"} is passed to the first component of the chain, prompt, which formats the prompt template to produce the prompt "Tell me a joke about ice cream".
- The prompt "Tell me a joke about ice cream" is passed to the model (the gpt-4 model).
- The result returned by the model is passed to the output_parser, which formats the model result and returns the final content.
If you are interested in the output of any component, you can test a smaller version of the chain at any time, such as prompt or prompt | model, to see the intermediate results:
input = {"topic": "ice cream"}
# Returns a ChatPromptValue
prompt.invoke(input)
# Returns an AIMessage
(prompt | model).invoke(input)
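Because every LCEL chain implements the standard Runnable interface, the joke chain above also supports the streaming and parallel execution mentioned in the introduction. A brief sketch (the extra "bears" topic is made up for illustration):
# Stream the answer chunk by chunk as it is generated
for chunk in chain.stream({"topic": "ice cream"}):
    print(chunk, end="", flush=True)

# Run several inputs in parallel with a single call
chain.batch([{"topic": "ice cream"}, {"topic": "bears"}])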
RAG Search Example
Next, let's walk through a slightly more complex LCEL example. We will build a retrieval-augmented generation chain that adds some background context when answering questions.
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings

# In-memory vector store seeded with two small documents
vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()
output_parser = StrOutputParser()

# Retrieve documents for "context" and pass the question through unchanged
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
chain.invoke("where did harrison work?")
In this case, the composed chain is:
chain = setup_and_retrieval | prompt | model | output_parser
Simply put, the prompt template above accepts context and question as values to substitute into the prompt. Before constructing the prompt, we want to retrieve relevant documents and include them as part of the context.
For this test, we use DocArrayInMemorySearch, an in-memory vector store, and define a retriever that fetches similar documents for a query. The retriever is itself a chainable runnable component, but you can also try running it on its own:
retriever.invoke("where did harrison work?")
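Since the retriever implements the same Runnable interface, you can also probe it with batch. A small sketch (the second question is made up for illustration; each input yields its own list of Document objects):
retriever.batch(["where did harrison work?", "what do bears like to eat?"])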
Then, we use RunnableParallel to prepare the input for the prompt: it searches for documents with the retriever and passes the user's question through unchanged with RunnablePassthrough:
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
In summary, the complete chain is:
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
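As an aside (a sketch relying on LCEL's coercion rules, not shown in the original example): a plain dict used inside a chain is automatically converted into a RunnableParallel, so the same chain can be written without naming setup_and_retrieval at all:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | output_parser
)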
The process is as follows:
- First, create a RunnableParallel object containing two entries. The first entry, context, will hold the documents retrieved by the retriever. The second entry, question, will hold the user's original question; to pass the question along, we use RunnablePassthrough to copy this entry.
- The dictionary from the previous step is passed to the prompt component. It accepts the user input (i.e., question) as well as the retrieved documents (i.e., context), constructs a prompt, and outputs a PromptValue.
- The model component takes the generated prompt and passes it to the OpenAI chat model for evaluation. The output generated by the model is a ChatMessage object.
- Finally, the output_parser component takes the ChatMessage, converts it to a Python string, and returns it from the invoke method.
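Finally, a common extension, sketched here under the assumption that you also want the retrieved documents back (chain_with_sources and the itemgetter usage are illustrative, not part of the original example): fan out after the retrieval step with another RunnableParallel so the result contains both the answer and its sources.
from operator import itemgetter

# Returns a dict like {"answer": "...", "context": [Document(...), ...]}
chain_with_sources = setup_and_retrieval | RunnableParallel(
    {
        "answer": prompt | model | output_parser,
        "context": itemgetter("context"),
    }
)
chain_with_sources.invoke("where did harrison work?")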