[RAG] Several techniques for improving query retrieval
## Query enhancement
```python
query_rewrite_template = """You are an AI assistant tasked with reformulating user queries to improve retrieval in a RAG system.
Given the original query, rewrite it to be more specific, detailed, and likely to retrieve relevant information.
Original query: {original_query}
Rewritten query:"""
```
This makes the query more specific.
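As a minimal sketch of how such a rewrite template might be applied: the helper below fills the `{original_query}` placeholder and hands the prompt to a model. The `llm` callable is a stand-in for whatever client you use (an assumption, not a specific library API), and the fallback to the original query is my own design choice.

```python
from typing import Callable


def rewrite_query(original_query: str, template: str, llm: Callable[[str], str]) -> str:
    """Fill the rewrite template and return the model's rewritten query.

    `llm` is a hypothetical callable that takes a prompt and returns text.
    """
    prompt = template.format(original_query=original_query)
    rewritten = llm(prompt).strip()
    # Fall back to the original query if the model returns nothing usable.
    return rewritten or original_query
```

Passing the rewrite template shown above as `template` gives the behavior described here; the rewritten string is then used for retrieval in place of the user's raw input.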
The next method is similar to this kind of enhancement: in essence, it also enriches the query.
## Hypothetical Document Embeddings (HyDE)
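Traditional retrieval methods often struggle with the semantic gap between short queries and longer, more detailed documents. Hypothetical document embedding addresses this by expanding the query into a full hypothetical document, which may improve retrieval relevance by making the query's representation more similar to the document representations in the vector space. The technique can be especially valuable in domains where understanding query intent and context is critical, such as legal research, academic literature review, or advanced information retrieval systems.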
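HyDE turns the query into a hypothetical document that contains an answer, with the aim of narrowing the gap between how queries and documents are distributed in the vector space.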
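The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not a production implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `llm` is a hypothetical prompt-to-text callable, and the prompt wording is my own.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hyde_retrieve(query: str, docs: List[str], llm: Callable[[str], str], k: int = 3) -> List[str]:
    """Rank documents against a generated hypothetical answer instead of the raw query."""
    prompt = f"Write a short passage that answers the question.\nQuestion: {query}\nPassage:"
    hypothetical_doc = llm(prompt)
    # The hypothetical document, not the query, is what gets embedded.
    query_vec = embed(hypothetical_doc)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]
```

The hypothetical passage does not need to be factually correct; it only needs to look like the documents you hope to retrieve.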
## Step-back prompting
Instead of making the query more detailed, you can also make it more general.
```python
step_back_template = """You are an AI assistant tasked with generating broader, more general queries to improve context retrieval in a RAG system.
Given the original query, generate a step-back query that is more general and can help retrieve relevant background information.
Original query: {original_query}
Step-back query:"""
```
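One way a step-back query might be used is to retrieve with both the original and the broader query, then merge the results. The sketch below assumes a hypothetical `llm` callable and a hypothetical `retrieve(query, k)` function that returns a list of document strings; neither is a real library API.

```python
from typing import Callable, List


def retrieve_with_step_back(
    original_query: str,
    template: str,
    llm: Callable[[str], str],
    retrieve: Callable[[str, int], List[str]],
    k: int = 3,
) -> List[str]:
    """Retrieve for the original and the step-back query, de-duplicating results."""
    step_back_query = llm(template.format(original_query=original_query)).strip()
    merged: List[str] = []
    for query in (original_query, step_back_query):
        if not query:
            continue  # Skip the step-back pass if the model returned nothing.
        for doc in retrieve(query, k):
            if doc not in merged:
                merged.append(doc)
    return merged
```

Keeping the original query's hits first means the specific results still lead, with the broader background documents appended after them.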
## Decomposing into sub-questions
```python
subquery_decomposition_template = """You are an AI assistant tasked with breaking down complex queries into simpler sub-queries for a RAG system.
Given the original query, decompose it into 2-4 simpler sub-queries that, when answered together, would provide a comprehensive response to the original query.

Example: What are the impacts of climate change on the environment?
Sub-queries:
1. What are the impacts of climate change on biodiversity?
2. How does climate change affect the oceans?
3. What are the effects of climate change on agriculture?
4. What are the impacts of climate change on human health?

Original query: {original_query}
Sub-queries:"""
```
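The model's reply to a decomposition prompt is usually a numbered list, so it needs to be parsed before each sub-query can be retrieved separately. A minimal sketch, assuming the reply uses `1.`, `1)`, `-`, or `*` list markers (the function name and the cap of four items are my own choices):

```python
import re
from typing import List


def parse_sub_queries(llm_output: str, max_queries: int = 4) -> List[str]:
    """Extract list items such as '1. ...' or '- ...' from the model's reply."""
    sub_queries: List[str] = []
    for line in llm_output.splitlines():
        # Match a numbered or bulleted item and capture its text without padding.
        match = re.match(r"^\s*(?:\d+[.)]|[-*])\s+(.*\S)\s*$", line)
        if match:
            sub_queries.append(match.group(1))
    return sub_queries[:max_queries]
```

Each returned sub-query can then be run through retrieval on its own, with the results combined before generation.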
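There are essentially only two ways to process the prompt: enhance it by adding more detail and context, or compress it by extracting only the keywords. In my view, retrieval quality has to be tuned against the data source. With a large corpus, a keyword-only query may return results that are too general; with a small corpus, an enhanced query can become semantically more complex and fail to retrieve anything.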
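For the compression side, a rough sketch of keyword extraction might look like the following. It assumes English queries and a small hand-picked stopword list (both are illustrative assumptions); a real system would more likely use a proper tokenizer or let an LLM pick the keywords.

```python
import re
from typing import List

# A tiny illustrative stopword list; extend it for real use.
STOPWORDS = {
    "a", "an", "the", "is", "are", "of", "on", "in", "to",
    "what", "how", "does", "do", "for", "and",
}


def compress_to_keywords(query: str) -> List[str]:
    """Keep the unique non-stopword tokens of the query, in their original order."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    keywords: List[str] = []
    for token in tokens:
        if token not in STOPWORDS and token not in keywords:
            keywords.append(token)
    return keywords
```

Comparing retrieval results for the compressed and the enhanced version of the same query against your own corpus is a simple way to see which direction suits the data.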
## Adaptive retrieval strategy
Use different strategies for different types of questions.
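A minimal sketch of such routing is shown below. The categories, the word-count thresholds, and the rule-based classifier are all assumptions made for illustration; the approach itself does not prescribe them, and in practice the classification step could just as well be done by an LLM.

```python
from typing import Callable, Dict


def choose_strategy(query: str) -> str:
    """Pick a query-transformation strategy using simple, hypothetical heuristics."""
    text = f" {query.lower().strip()} "
    word_count = len(text.split())
    if " and " in text or text.count("?") > 1:
        return "decompose"  # Looks like several questions in one.
    if word_count <= 3:
        return "rewrite"    # Too short: add detail.
    if word_count >= 12:
        return "step_back"  # Very specific: also fetch broader background.
    return "direct"         # Use the query as it is.


def transform_query(query: str, transforms: Dict[str, Callable[[str], str]]) -> str:
    """Apply the transform registered for the chosen strategy, if there is one."""
    transform = transforms.get(choose_strategy(query))
    return transform(query) if transform else query
```

The `transforms` dictionary is where the rewrite, step-back, and decomposition helpers would be plugged in, keyed by strategy name.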




