LCEL Introduction

LCEL (LangChain Expression Language) is a powerful workflow orchestration tool that lets you build complex task chains from basic components, with out-of-the-box support for features such as streaming, parallel execution, and logging.

Basic Example: Prompt + Model + Output Parser

In this example, we will demonstrate how to use LCEL (LangChain Expression Language) to link three components – a prompt template, a model, and an output parser – into a complete workflow that tells jokes. The code shows how to create a chain, how the pipe symbol | connects the different components, and what each component does, along with the output it produces.

First, let's see how to connect the prompt template and the model to generate a joke about a specific topic:

Install dependencies

%pip install --upgrade --quiet langchain-core langchain-community langchain-openai
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
model = ChatOpenAI(model="gpt-4")
output_parser = StrOutputParser()

chain = prompt | model | output_parser

chain.invoke({"topic": "ice cream"})

Output

"Why don't parties invite ice cream? Because it melts when it's hot!"

In this code, we use LCEL to connect different components into a chain:

chain = prompt | model | output_parser

The | symbol here is similar to the Unix pipe operator, which connects different components together and passes the output of one component as the input to the next component.
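Conceptually, invoking the chain is roughly equivalent to calling each component's invoke method in turn and feeding its result into the next one. A minimal sketch of that data flow (not how LCEL is implemented internally, just what happens logically):

# Same data flow as chain.invoke({"topic": "ice cream"}):
# the prompt's output feeds the model, and the model's output feeds the parser
output_parser.invoke(model.invoke(prompt.invoke({"topic": "ice cream"})))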

In this chain, user input is passed to the prompt template, then the output of the prompt template is passed to the model, and finally the output of the model is passed to the output parser. Let's take a look at each component separately to better understand what's happening.

1. Prompt

prompt is a BasePromptTemplate, which accepts a dictionary of template variables and produces a PromptValue. A PromptValue is a wrapper around the completed prompt that can be passed either to an LLM (which takes a string as input) or to a ChatModel (which takes a sequence of messages as input). It can be used with either type of language model because it defines logic both for producing BaseMessages and for producing a string.

prompt_value = prompt.invoke({"topic": "ice cream"})
prompt_value

Output

ChatPromptValue(messages=[HumanMessage(content='Tell me a joke about ice cream')])

Below, we convert the prompt-formatted result into the message format used by chat models:

prompt_value.to_messages()

Output

[HumanMessage(content='Tell me a joke about ice cream')]

It can also be directly converted to a string:

prompt_value.to_string()

Output

'Human: Tell me a joke about ice cream'

2. Model

Next, pass the PromptValue to the model. In this example, our model is a ChatModel, which means it will output a BaseMessage.

Try calling the model directly:

message = model.invoke(prompt_value)
message

Output

AIMessage(content="Why is ice cream never invited to parties?\n\nBecause they're always a drip when things heat up!")

If our model is defined as an LLM type, it will output a string.

from langchain_openai.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo-instruct")
llm.invoke(prompt_value)

Output

'\n\nBot: Why did the ice cream truck break down? Because it went through a meltdown!'

3. Output Parser

Finally, pass the output from our model to the output_parser, which is a BaseOutputParser, meaning it accepts a string or BaseMessage as input. StrOutputParser specifically converts any input into a simple string.

output_parser.invoke(message)

Output

"Why is ice cream never invited to parties?\n\nBecause they're always a drip when things heat up!"
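Since StrOutputParser also accepts plain strings, it can just as well sit after the LLM defined in the Model section; a quick sketch reusing the llm object from above:

# The LLM returns a string, which the parser passes through unchanged
output_parser.invoke(llm.invoke(prompt_value))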

4. The Entire Process

The execution process is as follows:

  1. Call chain.invoke({"topic": "ice cream"}), which kicks off the workflow we defined, passing the parameter {"topic": "ice cream"} to generate a joke about "ice cream."
  2. The call parameter {"topic": "ice cream"} is passed to the first component of the chain, prompt, which formats the prompt template to produce the prompt Tell me a joke about ice cream.
  3. The prompt Tell me a joke about ice cream is passed to the model (the gpt-4 model).
  4. The result returned by the model is passed to the output_parser output parser, which formats the model's result and returns the final content.

If you are interested in the output of any component, you can test a smaller version of the chain at any time, such as prompt or prompt | model, to see the intermediate results:

input = {"topic": "ice cream"}

prompt.invoke(input)

(prompt | model).invoke(input)
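Because the composed chain is itself a runnable component, it also supports the streaming and batch features mentioned in the introduction out of the box. A brief sketch using the same chain (actual output will vary):

# Stream the joke chunk by chunk as the model generates it
for chunk in chain.stream({"topic": "ice cream"}):
    print(chunk, end="", flush=True)

# Run several inputs in parallel with a single call
chain.batch([{"topic": "ice cream"}, {"topic": "bears"}])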

RAG Search Example

Next, let's look at a slightly more complex LCEL example. We will build a retrieval-augmented generation chain that retrieves background information to use when answering questions.
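Note that DocArrayInMemorySearch relies on the docarray package; if it is not already installed in your environment, something like the following should take care of it:

%pip install --upgrade --quiet docarray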

from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings

vectorstore = DocArrayInMemorySearch.from_texts(
    ["harrison worked at kensho", "bears like to eat honey"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()
output_parser = StrOutputParser()

setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser

chain.invoke("where did harrison work?")

In this case, the composed chain is:

chain = setup_and_retrieval | prompt | model | output_parser

Simply put, the prompt template above accepts context and question as values to substitute into the prompt. Before formatting the prompt, we want to retrieve relevant documents and use them as part of the context.
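To see what the prompt expects, you can invoke it directly with both keys filled in by hand (the values below are just placeholders for illustration):

prompt.invoke(
    {"context": "harrison worked at kensho", "question": "where did harrison work?"}
)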

As a test, we use DocArrayInMemorySearch as an in-memory vector store and define a retriever that can fetch similar documents for a query. The retriever is itself a chainable runnable component, but you can also run it on its own:

retriever.invoke("where did harrison work?")

Then, we use RunnableParallel to prepare the input for the prompt: the retriever searches for relevant documents, while RunnablePassthrough forwards the user's question unchanged:

setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)

In summary, the complete chain is:

setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser
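As a side note, LCEL automatically coerces a plain dictionary used inside a chain into a RunnableParallel, so an equivalent and slightly more compact way to write the same chain is:

# The dict is implicitly wrapped in a RunnableParallel
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | output_parser
)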

The process is as follows:

  1. First, create a RunnableParallel object containing two entries. The first entry, context, will hold the documents retrieved by the retriever. The second entry, question, will contain the user's original question; to pass the question through unchanged, we use RunnablePassthrough.
  2. Pass the dictionary from the previous step to the prompt component. It accepts the user input (i.e., question) as well as the retrieved documents (i.e., context), constructs a prompt, and outputs a PromptValue.
  3. The model component takes the generated prompt and passes it to the OpenAI chat model for evaluation. The model's output is an AIMessage object.
  4. Finally, the output_parser component takes the AIMessage, converts it to a Python string, and returns it from the invoke method.
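As with the first example, you can invoke a partial chain at any point to inspect the intermediate results, for example the dictionary produced by setup_and_retrieval or the fully formatted prompt:

# The dict of retrieved documents plus the original question
setup_and_retrieval.invoke("where did harrison work?")

# The fully formatted prompt that will be sent to the model
(setup_and_retrieval | prompt).invoke("where did harrison work?")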