Introduction

LangChain is an open-source Python framework for developing AI applications on top of large models. It provides modules and tools that make it easy to integrate with Large Language Models (LLMs) for tasks such as text generation, question answering, translation, and dialogue. By lowering the barrier to AI application development, LangChain lets anyone build creative applications on top of LLMs.

Features of LangChain:

  • LLM and Prompt: LangChain abstracts the APIs of different LLM providers behind a single unified interface, and provides a prompt template management mechanism.
  • Chain: LangChain packages common scenarios into ready-made modules, such as context-based question answering and generating SQL queries from natural language. These are named "Chains" because such tasks run like a workflow, executed step by step.
  • LCEL: LangChain Expression Language (LCEL) is the core feature of the new version of LangChain, used to solve workflow orchestration problems. With LCEL expressions, we can flexibly customize the AI task processing flow, i.e., flexibly compose a "Chain".
  • Retrieval-Augmented Generation (RAG): Because Large Language Models (LLMs) are unaware of information that appeared after their training and cannot answer questions about it, we can retrieve relevant new information and supply it to the LLM to improve the quality of generated content. This pattern is called RAG (Retrieval-Augmented Generation).
  • Agents: A design pattern that uses the LLM's natural language understanding and reasoning capabilities (the LLM as the brain) to automatically call external systems and devices to complete tasks based on user requirements. For example, when a user inputs "take a day off tomorrow", the Large Language Model (LLM) automatically calls the leave system and initiates a leave application.
  • Model Memory: Lets the Large Language Model (LLM) remember previous conversation content; this capability is known as model memory.

LangChain Framework Components

The LangChain framework consists of several components, including:

  • LangChain Libraries: Python and JavaScript libraries. They include the interfaces and runtime foundation for integrating the various components, as well as ready-made implementations of chains and agents.
  • LangChain Templates: Official AI task templates provided by LangChain.
  • LangServe: Built on FastAPI, it publishes chains defined with LangChain as REST APIs.
  • LangSmith: A development platform, delivered as a cloud service, that supports debugging and monitoring of LangChain applications.

LangChain Libraries

The LangChain library itself consists of several different packages.

  • langchain-core: Basic abstractions and LangChain expression language.
  • langchain-community: Third-party integrations, mainly including third-party components integrated with LangChain.
  • langchain: Mainly includes chains, agents, and retrieval strategies.

LangChain Task Processing Flow

LangChain provides a set of prompt template management tools for handling prompts. It then passes the formatted prompt to the large model for processing, and finally post-processes the result returned by the model.

LangChain's encapsulation of the large model mainly includes two types: LLM and Chat Model.

  • LLM - Question-answering model that receives a text input and returns a text result.
  • Chat Model - Dialogue model that receives a list of conversation messages and returns a conversation message as the reply.

Core Concepts

1. LLMs

The fundamental text-completion models wrapped by LangChain; they receive a text input and return a text result.

2. Chat Models

Chat models (or dialogue models) are designed specifically for dialogue scenarios, unlike plain LLMs. They receive a list of conversation messages and return a conversation message as the reply.

3. Messages

Refers to the message content exchanged with chat models. Message types include HumanMessage, AIMessage, SystemMessage, FunctionMessage, and ToolMessage, among others.

4. Prompts

LangChain encapsulates a set of tools specifically used for prompt management, making it easier for us to format prompt content.

5. Output Parsers

After LangChain receives the text returned by the large model (LLM), it can use dedicated output parsers to structure that text, for example parsing JSON or converting the LLM output into a Python object.

6. Retrievers

To easily import private data into the large model (LLM) and improve the quality of model responses, LangChain encapsulates a retrieval framework (Retrievers) that facilitates the loading, segmentation, storage, and retrieval of document data.

7. Vector Stores

To support semantic similarity searches for private data, LangChain supports various vector databases.
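Vector stores rank documents by the similarity of their embedding vectors. The following plain-Python sketch (not a LangChain API) illustrates the cosine-similarity measure most vector databases use:

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: cos(theta) = a.b / (|a||b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
query = [1.0, 0.0, 1.0]
doc_a = [0.9, 0.1, 0.8]   # points in nearly the same direction as the query
doc_b = [0.0, 1.0, 0.0]   # orthogonal to the query

print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```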

8. Agents

Agents usually refers to applications that use the large model (LLM) as a decision engine, automatically calling external systems and hardware devices to complete the user's task based on user input. It is a design pattern with the large model (LLM) at its core.

Application Scenarios

  • Chatbots: Building intelligent chat assistants, customer service chatbots, and conversational chatbots.
  • Knowledge Base Q&A: Providing open-domain question-answering services by integrating with knowledge graphs.
  • Intelligent Writing: Such as article writing, creative writing, and text summarization.