
RAG in Practice: Routing Mechanisms and Query Construction Strategies

Routing Mechanisms and Query Construction Strategies

  • Preface
  • Routing
    • Logical Routing
      • ChatOpenAI
      • Structured
      • Routing Datasource
      • Conclusion
    • Semantic Routing
      • Embedding & LLM
      • Prompt
      • Routing Prompt
      • Conclusion
  • Query Construction
    • Grab Youtube video information
    • Structured
    • Prompt
  • Github
  • References

Preface

This post follows the approach from the previous one: start a proxy server locally, then use OpenAI's ChatOpenAI as the chat client and point its requests at our local server.
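
For reference, the sketch below shows one way such a local server could be launched with vLLM's OpenAI-compatible API for the Qwen model used later. This is an illustrative assumption, not the setup described in the previous post, and the exact entrypoint and flags may differ between vLLM versions.

import os
import subprocess

# Assumption: vLLM is installed and the Qwen/Qwen3-0.6B weights are pulled
# from ModelScope; the module path/flags may vary with your vLLM version.
os.environ["VLLM_USE_MODELSCOPE"] = "True"
server = subprocess.Popen([
    "python", "-m", "vllm.entrypoints.openai.api_server",
    "--model", "Qwen/Qwen3-0.6B",
    "--port", "8000",
])
# The OpenAI-compatible endpoint is then served at http://localhost:8000/v1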

Routing

In a traditional RAG architecture, every query goes through the same retriever and the same prompt template. In systems with multiple data sources or multiple tasks, this leads to irrelevant retrieval results, imprecise content, and ambiguous handling of user intent. Routing addresses this by intelligently directing each user question to the most relevant knowledge source or processing pipeline, improving both the accuracy and the efficiency of the answer.

Logical Routing

ChatOpenAI

As mentioned in the preface, we use the ChatOpenAI chat client, but not a GPT model.

import os

from langchain_openai import ChatOpenAI

os.environ['VLLM_USE_MODELSCOPE'] = 'True'
chat = ChatOpenAI(
    model='Qwen/Qwen3-0.6B',
    openai_api_key="EMPTY",
    openai_api_base='http://localhost:8000/v1',
    stop=['<|im_end|>'],
    temperature=0,
)

Fig. 1 Logical Routing framework diagram

Structured

In the prompt, the LLM is instructed to choose the most relevant data source for the programming language the question refers to. The prompt is then piped, via the | operator, into with_structured_output, which constrains the model to produce output in exactly the RouteQuery format, also known as structured output.

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field


class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )


# Produce a structured object: the LLM is forced to emit JSON matching the
# RouteQuery schema, which is then automatically parsed back into a Python
# object, e.g. RouteQuery(datasource='python_docs') <-> {"datasource": "python_docs"}
structured_llm = chat.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router
router = prompt | structured_llm

Once this part runs, the LLM will only ever produce one of ["python_docs", "js_docs", "golang_docs"], because datasource: Literal["python_docs", "js_docs", "golang_docs"] restricts the field to those three candidate values. The output looks like the following.

"""
数据结构:
RouteQuery(datasource='python_docs'  # 或 'js_docs' 或 'golang_docs'
)
JSON格式:
{"datasource": "python_docs"
}
"""

Fig. 2 Data structured flow chart

Routing Datasource

The question states the programming language that needs to be identified. After the structuring step from the previous section we obtain a structured object, and choose_route simply matches it to the corresponding data source.

question = """Why doesn't the following code work:from langchain_core.prompts import ChatPromptTemplateprompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""# result = router.invoke({"question": question})
def choose_route(result):if "python_docs" in result.datasource.lower():### Logic herereturn "chain for python_docs"elif "js_docs" in result.datasource.lower():### Logic herereturn "chain for js_docs"else:### Logic herereturn "golang_docs"full_chain = router | RunnableLambda(choose_route)full_chain.invoke({"question": question})
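
The "### Logic here" placeholders mark where real per-source logic would live. Below is a minimal, hypothetical sketch of what that could look like, assuming three retrievers (python_retriever, js_retriever, golang_retriever) have already been built over the respective documentation sets; these names are illustrative and not part of the original code.

from langchain_core.runnables import RunnableLambda

# Hypothetical: retrievers built over each documentation set, e.g. via a
# vector store's .as_retriever(); they are assumed to already exist.
retrievers = {
    "python_docs": python_retriever,
    "js_docs": js_retriever,
    "golang_docs": golang_retriever,
}

def choose_route(result):
    # Map the routed datasource name to the matching retriever.
    return retrievers.get(result.datasource.lower(), golang_retriever)

route_chain = router | RunnableLambda(choose_route)
retriever = route_chain.invoke({"question": question})  # pick a source
docs = retriever.invoke(question)                       # retrieve from it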

Conclusion

Through a programming-language identification example, the above showed how to obtain structured output and how to use it to select the corresponding data source. In practice, this lets us route each user question to the most relevant data source or vector store, which can substantially improve recall.

Semantic Routing

Embedding & LLM

Define the open-source Embedding and Text Generation models from the ModelScope community.

import os

from langchain_community.embeddings import ModelScopeEmbeddings
from langchain_openai import ChatOpenAI

embedding = ModelScopeEmbeddings(model_id='iic/nlp_corom_sentence-embedding_english-base')

# Deploy an OpenAI-compatible server with vLLM, then use ChatOpenAI
os.environ['VLLM_USE_MODELSCOPE'] = 'True'
chat = ChatOpenAI(
    model='Qwen/Qwen3-0.6B',
    openai_api_key="EMPTY",
    openai_api_base='http://localhost:8000/v1',
    stop=['<|im_end|>'],
    temperature=0,
)

Prompt

Define two prompt templates and embed them, so that later the appropriate prompt can be selected based on the user's question.

physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.

Here is a question:
{query}"""

math_template = """You are a very good mathematician. You are great at answering math questions. \
You are so good because you are able to break down hard problems into their component parts, \
answer the component parts, and then put them together to answer the broader question.

Here is a question:
{query}"""

prompt_templates = [physics_template, math_template]
prompt_embeddings = embedding.embed_documents(prompt_templates)

Fig. 3 Semantic Routing framework diagram

Routing Prompt

Embed the user query so that its cosine similarity with each prompt template can be computed; the template at the index with the highest similarity is selected and handed to the LLM.

from langchain.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

# Use cosine similarity to find the template most similar to the input `query`
def prompt_router(input):
    # Embed the `query`
    query_embedding = embedding.embed_query(input["query"])
    # Compute cosine similarity between the `query` and each prompt template
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    # Take the template at the index with the highest similarity
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt
    print("Using MATH" if most_similar == math_template else "Using PHYSICS")
    return PromptTemplate.from_template(most_similar)

# RunnablePassthrough passes the input through unchanged
chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | chat
    | StrOutputParser()
)

answer = chain.invoke("What's a black hole")
print(answer)
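
As a side note, the cosine similarity used above is simply the normalized dot product of two embedding vectors. The minimal numpy sketch below illustrates the computation; it is an equivalent stand-in for clarity, not the function imported from LangChain.

import numpy as np

def cosine_sim(a, b):
    # cos(a, b) = (a · b) / (|a| * |b|)
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings"
print(cosine_sim([1.0, 0.0, 1.0], [1.0, 1.0, 0.0]))  # 0.5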

Conclusion

The above describes a strategy for dynamically matching a prompt to the user's query.

Query Construction

Grab Youtube video information

When run from mainland China, the code below may raise a urllib.error.HTTPError: HTTP Error 400: Bad Request exception. To work around this, we construct the datasource another way that achieves the same result.

from langchain_community.document_loaders import YoutubeLoader

docs = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=pbAd8O1Lvm4", add_video_info=False
).load()

print(docs[0].metadata)

Instead, use subprocess.run to invoke yt-dlp, which downloads the video information and prints it as JSON; finally, build the datasource we need from that metadata.

import json
import subprocess
from datetime import datetime

result = subprocess.run(
    ["yt-dlp", "--dump-json", "https://www.youtube.com/watch?v=pbAd8O1Lvm4"],
    capture_output=True, text=True,
)
video_info = json.loads(result.stdout)

metadata = {
    "source": 'pbAd8O1Lvm4',
    "title": video_info.get("title", "Unknown"),
    "description": video_info.get("description", "Unknown"),
    "view_count": video_info.get("view_count", 0),
    "thumbnail_url": video_info.get("thumbnail", ""),
    "publish_date": datetime.strptime(video_info.get("upload_date", "19700101"), "%Y%m%d").strftime("%Y-%m-%d 00:00:00"),
    "length": video_info.get("duration", 0),
    "author": video_info.get("uploader", "Unknown"),
}
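
To make this a drop-in replacement for the YoutubeLoader result, the metadata can be wrapped in a LangChain Document. The sketch below is one possible way to do that, assuming the transcript (or an empty string) is used as page_content; this step is not shown in the original code.

from langchain_core.documents import Document

# Hypothetical: wrap the yt-dlp metadata in a Document so it has the same
# shape as docs[0] returned by YoutubeLoader (page_content + metadata).
transcript = ""  # assumption: fetch the transcript separately if needed
docs = [Document(page_content=transcript, metadata=metadata)]

print(docs[0].metadata)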

Structured

The following defines a structured search-query schema, in the same way as the Structured step in the Routing section. Its purpose is to convert natural language into a structured search query.

from datetime import date
from typing import Optional

from pydantic import BaseModel, Field


class TutorialSearch(BaseModel):
    """Search over a database of tutorial videos about a software library."""

    content_search: str = Field(
        ...,
        description="Similarity search query applied to video transcripts.",
    )
    title_search: str = Field(
        ...,
        description=(
            "Alternate version of the content search query to apply to video titles. "
            "Should be succinct and only include key words that could be in a video "
            "title."
        ),
    )
    min_view_count: Optional[int] = Field(
        None,
        description="Minimum view count filter, inclusive. Only use if explicitly specified.",
    )
    max_view_count: Optional[int] = Field(
        None,
        description="Maximum view count filter, exclusive. Only use if explicitly specified.",
    )
    earliest_publish_date: Optional[date] = Field(
        None,
        description="Earliest publish date filter, inclusive. Only use if explicitly specified.",
    )
    latest_publish_date: Optional[date] = Field(
        None,
        description="Latest publish date filter, exclusive. Only use if explicitly specified.",
    )
    min_length_sec: Optional[int] = Field(
        None,
        description="Minimum video length in seconds, inclusive. Only use if explicitly specified.",
    )
    max_length_sec: Optional[int] = Field(
        None,
        description="Maximum video length in seconds, exclusive. Only use if explicitly specified.",
    )

    def pretty_print(self) -> None:
        # Print only the fields that were actually set to a non-default value.
        for field in self.__fields__:
            if getattr(self, field) is not None and getattr(self, field) != getattr(
                self.__fields__[field], "default", None
            ):
                print(f"{field}: {getattr(self, field)}")

Fig. 4 Data structured flow chart

Prompt

The prompt below guides the model to convert the user's natural-language question into a structured database-query instruction.

system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a database query optimized to retrieve the most relevant results.If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages([("system", system),("human", "{question}"),]
)# 根据问题语义,将问题中涉及的内容映射到 metadata 的结构化字段中。
structured_llm = llm.with_structured_output(TutorialSearch)
query_analyzer = prompt | structured_llm

For a user question, the question is structured according to this prompt; the goal is to map the words contained in the question onto the appropriate fields of the datasource.

query_analyzer.invoke({"question": "rag from scratch"}).pretty_print()

After the code above runs, the LLM automatically constructs suitable structured data based on the question's semantics.

content_search: rag from scratch
title_search: rag from scratch

As another example, take the question "videos on chat langchain published in 2023". Here the date should clearly map onto the date-related fields of the datasource, such as earliest_publish_date and latest_publish_date.

query_analyzer.invoke(
    {"question": "videos on chat langchain published in 2023"}
).pretty_print()

content_search: chat langchain
title_search: 2023
earliest_publish_date: 2023-01-01
latest_publish_date: 2024-01-01
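
The structured query still has to be translated into a particular store's filter syntax. The sketch below is a hypothetical illustration of turning the parsed TutorialSearch into generic (field, operator, value) filters; it is not part of the original post, and real vector stores each have their own filter format (LangChain's self-query retrievers handle this in production).

# Hypothetical: convert a TutorialSearch result into generic filters that a
# vector store integration could translate further.
def to_filters(search: TutorialSearch) -> list[tuple[str, str, object]]:
    filters = []
    if search.min_view_count is not None:
        filters.append(("view_count", ">=", search.min_view_count))
    if search.max_view_count is not None:
        filters.append(("view_count", "<", search.max_view_count))
    if search.earliest_publish_date is not None:
        filters.append(("publish_date", ">=", search.earliest_publish_date.isoformat()))
    if search.latest_publish_date is not None:
        filters.append(("publish_date", "<", search.latest_publish_date.isoformat()))
    if search.min_length_sec is not None:
        filters.append(("length", ">=", search.min_length_sec))
    if search.max_length_sec is not None:
        filters.append(("length", "<", search.max_length_sec))
    return filters

query = query_analyzer.invoke({"question": "videos on chat langchain published in 2023"})
print(to_filters(query))
# e.g. [('publish_date', '>=', '2023-01-01'), ('publish_date', '<', '2024-01-01')]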

Github

https://github.com/FranzLiszt-1847/LLM

References

[1] https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb
