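Building a RAG Framework (a production-grade RAG chatbot with LangChain + Ollama)
Contents
1 Installing Ollama
Installing on Windows
Verifying the installation
2 Installing LangChain
2.1 Create a virtual environment first
2.2 Install the latest LangChain
3 Building a production-grade RAG chatbot with LangChain and a private model
3.1 Initializing the LLM
3.2 Augmented generation
3.3 Generating embeddings
3.4 Generating and storing embeddings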
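1 Installing Ollama
Ollama is an open-source platform for large language models (LLMs) that lets users run, manage, and interact with large language models on their own machine.
Ollama supports several operating systems, including macOS, Windows, and Linux, and can also run in a Docker container.
Ollama's hardware requirements are modest. As a rough guide:
- CPU: a multi-core processor (4 or more cores recommended).
- GPU: if you plan to run large models or fine-tune them, a GPU with strong compute capability is recommended (for example an NVIDIA GPU with CUDA support).
- Memory: at least 8 GB of RAM; 16 GB or more is recommended for larger models.
- Storage: enough disk space to hold the pretrained models, typically from 10 GB to several hundred GB depending on model size.
- Software: a recent version of Python, if you plan to use the Python SDK.
Official Ollama download page: https://ollama.com/download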
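Installing on Windows
Open a browser, go to the official Ollama download page, and download the installer for Windows.
The direct download link is https://ollama.com/download/OllamaSetup.exe.
Once the download finishes, double-click the installer and follow the prompts to complete the installation.
Verifying the installation
Open Command Prompt or PowerShell and run the following command to check that the installation succeeded:
ollama --version
If a version number is printed, the installation succeeded.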
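The same check can be scripted, which is convenient before running any of the later Python examples. The following is a minimal sketch using only the standard library; the helper name `ollama_version` is an illustrative choice, not part of Ollama or LangChain.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version(executable: str = "ollama") -> Optional[str]:
    """Return the text printed by `<executable> --version`, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None  # the executable is not on PATH
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    output = (result.stdout or result.stderr).strip()
    return output or None


if __name__ == "__main__":
    version = ollama_version()
    print(version if version else "ollama was not found on PATH")
```

If the function returns None right after installing, opening a new terminal usually helps, because PATH changes made by an installer are only visible to newly started shells.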
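2 Installing LangChain
LangChain is a framework for developing applications powered by language models. It enables applications that are:
- Context-aware: they connect a language model to sources of context (prompt instructions, few-shot examples, content to ground the response in, and so on).
- Able to reason: they rely on the language model to reason (how to answer given the provided context, which actions to take, and so on).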
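2.1 Create a virtual environment first
Create the virtual environment from the terminal.
1. Open the integrated terminal: press Ctrl + ` (backtick), or choose Terminal > New Terminal from the menu.
2. Create the virtual environment:
python -m venv myenv # creates a virtual environment named "myenv"
3. Activate the virtual environment:
# Run the activation script in the environment's Scripts directory (Command Prompt)
.\myenv\Scripts\activate.bat
# In PowerShell, run .\myenv\Scripts\Activate.ps1 instead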
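To confirm that activation worked, one option is to ask the interpreter itself. This is a small sketch, assuming an environment created with `python -m venv`; the function name `in_virtualenv` is illustrative. Note that a conda environment is not detected by this check, because conda does not set `sys.base_prefix` differently from `sys.prefix`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

When the environment is active, the interpreter path printed on the second line should sit inside the `myenv` directory.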
2.2 Install the latest LangChain
pip install langchain
Verify the installation:
pip show langchain
Install langchain-community:
pip install langchain_community
Verify the installation:
pip list
At this point the basic environment is ready.
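The installed versions can also be read from Python rather than by scanning the `pip list` output. A minimal sketch with the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["langchain", "langchain-community"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it with the virtual environment activated; otherwise it reports the packages of whichever interpreter happens to be first on PATH.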
3 Building a production-grade RAG chatbot with LangChain and a private model
3.1 Initializing the LLM
First initialize the local large language model, which Ollama runs on the local machine.
LangChain is used to tie everything together.
from langchain.llms import Ollama
# Initialize the Ollama model (Ollama must be installed and running, and the model pulled, e.g. `ollama pull llama3.2`)
llm = Ollama(model="llama3.2") # replace with the model you want to use
# Test call
response = llm("Hello, Ollama!")
print(response)
Output
d:\ai\py\langchain001.py:3: LangChainDeprecationWarning: The class `Ollama` was deprecated in LangChain 0.3.1 and will be removed in 1.0.0. An updated version of the class exists in the :class:`~langchain-ollama package and should be used instead. To use it run `pip install -U :class:`~langchain-ollama` and import as `from :class:`~langchain_ollama import OllamaLLM``.
llm = Ollama(model="llama3.2") # replace with the model you want to use
d:\ai\py\langchain001.py:5: LangChainDeprecationWarning: The method `BaseLLM.__call__` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
response = llm("Hello, Ollama!")
Nice to meet you! I'm here to help with any questions or topics you'd like to discuss. How's your day going so far?
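The next version imports Ollama from langchain_community, calls invoke instead of calling the object directly, and streams the answer to standard output through a callback handler:
from langchain_community.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
llm = Ollama(
    model="llama3.2",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)
# The question, in Chinese: "How much revenue (in hundreds of millions of US dollars) did NVIDIA report for Q2 2024?"
llm.invoke("英伟达公布2024二季度营收多少亿美元?")
Output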
d:\ai\py\langchain001.py:6: LangChainDeprecationWarning: The class `Ollama` was deprecated in LangChain 0.3.1 and will be removed in 1.0.0. An updated version of the class exists in the :class:`~langchain-ollama package and should be used instead. To use it run `pip install -U :class:`~langchain-ollama` and import as `from :class:`~langchain_ollama import OllamaLLM``.
llm = Ollama(
d:\ai\py\langchain001.py:6: DeprecationWarning: callback_manager is deprecated. Please use callbacks instead.
llm = Ollama(
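我不知道你是在问什么年份的信息。但是,2020 年 7 月,NVIDIA 发佈了其第二季度 2020 的财务报告,其中表明其营收为 5.9 亿美元。
(In English: "I don't know which year you are asking about. However, in July 2020 NVIDIA released its financial report for the second quarter of 2020, which showed revenue of 590 million US dollars.")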
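The deprecation warnings in the output recommend the `langchain-ollama` package, its `OllamaLLM` class, and `invoke`. A minimal sketch of that migration follows, assuming `pip install -U langchain-ollama` has been run and an Ollama server with the `llama3.2` model is available. The helper names `make_llm`, `ask`, and `ask_streaming` are illustrative, and streaming is done here with the `stream` method rather than the deprecated `callback_manager` argument.

```python
def make_llm(model: str = "llama3.2"):
    """Create the Ollama LLM wrapper that the deprecation warning points to."""
    # Imported lazily so this module still loads before
    # `pip install -U langchain-ollama` has been run.
    from langchain_ollama import OllamaLLM

    return OllamaLLM(model=model)


def ask(llm, question: str) -> str:
    """Send one prompt with `invoke`, the replacement for calling `llm(...)`."""
    return llm.invoke(question)


def ask_streaming(llm, question: str) -> str:
    """Print the answer piece by piece as it arrives, then return it whole."""
    parts = []
    for chunk in llm.stream(question):
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()
    return "".join(parts)


# Usage (needs a running Ollama server with the model pulled):
#   llm = make_llm()
#   print(ask(llm, "Hello, Ollama!"))
#   ask_streaming(llm, "Hello again!")
```

Keeping the model construction in one function means the rest of the code depends only on `invoke` and `stream`, so switching models or packages later touches a single place.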
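3.2 Augmented generation
The model's answer above does not reflect the actual report, so the next step is to augment the prompt by hand with the missing context (this is the "augmented generation" part of RAG).
In LangChain, such a prompt can be built with the following code, which uses LangChain's prompt template: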
from langchain.prompts import PromptTemplate
template = """You are a bot that answers questions using the provided context. If you do not know the answer, simply say that you do not know.
{context}
Question: {input}"""
prompt = PromptTemplate(template=template, input_variables=["context", "input"])
# Fill in the template to produce the final prompt
formatted_prompt = prompt.format(
    context="NVIDIA released its report for the second quarter of fiscal 2024. Second-quarter revenue was 13.507 billion US dollars.",
    input="How much revenue did NVIDIA report for the second quarter of 2024?",
)
print(formatted_prompt)
Output
You are a bot that answers questions using the provided context. If you do not know the answer, simply say that you do not know.
NVIDIA released its report for the second quarter of fiscal 2024. Second-quarter revenue was 13.507 billion US dollars.
Question: How much revenue did NVIDIA report for the second quarter of 2024?
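In a full RAG pipeline the context usually consists of several retrieved passages rather than one hand-written sentence. The sketch below shows one way to join passages into the same kind of prompt using plain string formatting; `TEMPLATE` and `build_prompt` are illustrative names, and the passages are sample data, not retrieval results.

```python
from typing import Sequence

TEMPLATE = (
    "You are a bot that answers questions using the provided context. "
    "If you do not know the answer, simply say that you do not know.\n"
    "{context}\n"
    "Question: {input}"
)


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Join the non-empty passages, one per line, and fill in the template."""
    context = "\n".join(p.strip() for p in passages if p.strip())
    return TEMPLATE.format(context=context, input=question)


prompt = build_prompt(
    "How much revenue did NVIDIA report for the second quarter of 2024?",
    [
        "NVIDIA released its report for the second quarter of fiscal 2024.",
        "   ",  # blank passages are skipped
        "Second-quarter revenue was 13.507 billion US dollars.",
    ],
)
print(prompt)
```

The resulting string can be passed to the model's `invoke` method in place of the bare question.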
3.3 Generating embeddings
The goal is to create embeddings for the pages of a PDF.
The first step is to load the pages with PyPDFLoader.
# PyPDFLoader needs the pypdf package (pip install pypdf)
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader(file_path="D:/ai/py/Research_and_application_of_real-time_control_system_for_construction_quality_of_digital_dynamic_compaction.pdf")
docs = loader.load()
print(len(docs)) # number of pages loaded
print(docs[1].page_content) # text of the second page
Output
5
parameters and did not see the engineering application of the
monitoring system. Zhan jinlin [9] put forward the idea of the
construction information management system based on GPS to
realize dynamic monitoring of the dynamic compaction
construction process, but also failed to see the engineering
application of this information management system. Wang
Zhongming [10] proposed a platform for the dynamic
compaction quality supervision based on analysis of the
transient Rayleigh wave and the internet detection technology,
to control the dynamic compaction quality of dispersive
projects all over the country. Nevertheless, the platform was
not a compaction construction process supervision platform in
some sense. The technical specification for dynamic
compaction ground treatment in China [11] stipulated that the
construction process of groundwork using dynamic compaction
should be informatized, but a complete construction quality
control scheme covering construction monitoring, data analysis,
early warning and construction navigation has not been
established in China. Related technologies urgently need to be
perfected in engineering applications.
III. OVERALL TECHNOLOGICAL SCHEME
According to the construction quality control procedures
and processes of the dynamic compaction, the Global
Navigation Satellite System (GNSS), wireless network
technology, cloud computing technology and sensor
technology are used. This paper proposes the overall
architecture of the real-time control system for digital dynamic
compaction as shown in Fig. 1, which mainly includes
intelligent monitoring terminal, cloud computing terminal,
wireless communication network, mobile monitoring terminal
and integrated information management terminal (online
management website).
By installing a GNSS antenna on the roof of cab and the
hook of rammer on the machine respectively, the three-
dimensional position information and orientation of the
rammer can be obtained in real time. At the same time, using
the tension sensor installed on the wire rope of the hanging
hammer, the stress state of the wire rope is measured and used
for judging whether the hammer is lifted. After the three-
dimensional position coordinates and stress state collected by
the above-mentioned monitoring terminal are fused in the
integrated controller, the collated data is sent to the database on
the cloud computing terminal through the wireless network.
The cloud-based application program calls the data from the
database and calculates the dynamic construction parameters
such as number of tamping, hammer drop distance, last two-
click settlement, tamping point location and tamping machine
orientation in real time. The calculated parameters are
displayed on the cockpit display terminal, mobile monitoring
terminal and comprehensive management information terminal,
synchronously. Then these construction parameters are
automatically compared with the presupposed construction
control standards to determine whether the number of tamping,
the last two-click settlement and the tamping position
deviations are up to the standard. If there is a deviation beyond
the standard, the background calculation program
automatically calculates the direction and the distance in which
the tamper hammer should be adjusted according to the current
position of the defect, and transmits the command to the
displayer of the cab for the real-time construction navigation
and on-site construction adjustment.
Base station
Monitor terminal of dynamic tamping machine
Internet
GNSS satellite
Tension sensor
GNSS antenna
Locating
information
Machine and
rammer position
coordinates
Cable Stress
status
Cloud computing
Average settlement/
tampping point offset
Cab display
terminal
Integrated information
management terminal
Supervision mobile
monitoring terminal
Tampping number/
Lift distence
Construction
navigation
Integrated information
management terminalFig. 1 Overall technical architecture of the system
IV. KEY TECHNICAL ISSUES AND SOLUTIONS
A. Data Acquisition and Transmission
For the quality control of dynamic construction, the most
important construction information is the real-time location of
hammer. In order to collect the real-time hammer position, this
paper used an overall solution of GNSS location technique plus
tension sensor technique. The real time kinematic (RTK)
positioning technology based on GNSS is used to collect the
hammer position every second, because a GNSS rover is
installed on the side of the rammer hook. The tension sensor is
installed on the cable to monitor the stress status of the cable in
real time. When the force status of the cable changed, the
sensor will send a signal to the integrated controller. The
position information collected by the satellite receiver and the
stress information collected by the tension sensor are fused in
the vehicular integrated controller and send the construction
data to the cloud computing procedure through the wireless
data transmission module. The follow-up data presentation and
analysis are carried out by the cloud service. All devices are
powered by the onboard power supply.
B. Data fusion and analysis
The rammer hook position coordinate is defined as (x, y, z),
and the spatial position coordinate of rammer is defined as ( x’,
y’, z’). At the moment that the hammer is lifted for the nth time,
the tension sensor triggers the maximum threshold and sends
an instruction to the integrated controller. The position of the
GNSS antenna ( x0,n, y0,n, z0,n) is recorded at this time and the
lowest position of hammer (x’0,n, y’0,n, z’0,n) is converted by the
geometrical relationship of hammer (i.e., the space position
where the hammer tamped the foundation for the n-1th time).
When the hammer is raised to the highest point and the rammer
hook is released, the force sensor is subjected to a sudden
decrease in force calibration and triggered a minimum
threshold. At this time, the tension sensor also sends an
instruction to the integrated controller to record the antenna
position coordinate (x1,n, y1,n, z1,n), and then the highest position
coordinate ( x’1,n, y’1,n, z’1,n) of hammer is converted by the
geometrical relationship. The spatial position of the hammer
after the nth drop was defined as (x’0,n+1, y’0,n+1, z’0,n+1), then the
1125
Authorized licensed use limited to: University of Chinese Academy of SciencesCAS. Downloaded on May 04,2025 at 07:05:49 UTC from IEEE Xplore. Restrictions apply.
Next, split the pages into chunks. One way to do this is to use RecursiveCharacterTextSplitter, which walks through the text and extracts chunks of a given length, optionally with some overlap between consecutive chunks:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)
data = text_splitter.split_documents(docs)
# Check the first chunk
if len(data) > 0:
    print("Text of the first chunk:")
    print(data[0].page_content[:500]) # first 500 characters only, to keep the output short
    print("Number of chunks:", len(data))
else:
    print("No chunks were produced; check the document content!")
Output
Text of the first chunk:
Research and application of real-time control system
for construction quality of digital dynamic
compaction
Zilong Li1*, Lei Liu2, Yichao Sun3, Ruixin Ma1, Guangshuang Ge1
1. Tianjin Research Institute of Water Transport Engineering, Tianjin, China
2. Shandong SWINFO Co. Ltd., Tai’an, China
3. Shandong Bohai Bay Port Group Co. Ltd., Jinan, China
zilonglibaifoshan@163.com, rokeyliu@126.com, 15065310890@163.com, geguangshuang@126.com, maruixin1990@sina.comAbstract—Against the drawb
Number of chunks: 35
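To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified fixed-window splitter. It is not the algorithm RecursiveCharacterTextSplitter uses (that class prefers to break on separators such as paragraph and line breaks); it only illustrates how consecutive chunks share characters. The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


pieces = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pieces)
```

With a window of 4 and an overlap of 1, each chunk starts with the last character of the previous one. That shared text is what keeps a sentence cut at a boundary retrievable from either side.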
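3.4 Generating and storing embeddings
Use LangChain's Chroma vector store to save the embedding vectors of the documents.
# Requires: pip install chromadb fastembed
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings.fastembed import FastEmbedEmbeddings
from langchain_core.documents import Document
# 1. Initialize the embedding model
embeddings = FastEmbedEmbeddings(model_name="BAAI/bge-base-en-v1.5")
# 2. Prepare the documents (sample data); from_documents expects Document objects
data = [
    Document(page_content="Apple released its Q2 2023 earnings report; revenue grew 10% year over year.", metadata={"source": "earnings"}),
    Document(page_content="NVIDIA announced Blackwell, its next-generation GPU architecture.", metadata={"source": "news"}),
]
# 3. Create the Chroma vector store
store = Chroma.from_documents(
    documents=data,
    embedding=embeddings,
    ids=[f"{doc.metadata['source']}-{i}" for i, doc in enumerate(data)],
    collection_name="Tech-Reports",
    persist_directory="db",
)
# 4. Verify what was stored
print(f"Documents in the vector store: {len(store.get()['ids'])}")
Output
A db folder is created in the working directory.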
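Once embeddings are stored, answering a question comes down to finding the stored vectors closest to the question's vector. The toy sketch below shows that idea with cosine similarity in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings, the names `cosine_similarity` and `top_k` are illustrative, and the distance metric and index that Chroma actually uses may differ.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    store: Sequence[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for real embeddings of the two sample documents.
toy_store = [
    ("Apple Q2 2023 earnings", [1.0, 0.0, 0.0]),
    ("NVIDIA Blackwell GPU architecture", [0.0, 1.0, 0.0]),
]
print(top_k([0.1, 0.9, 0.0], toy_store, k=1))
```

The texts returned by such a search are what would be placed into the `{context}` slot of the prompt template from section 3.2.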