
Qdrant Filtering: must / should / must_not Explained (with Hands-On Python)

news 2025/10/2 5:48:50

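In vector search, filtering is a key tool for keeping results precise and aligned with business requirements.
Qdrant's filtering mechanism layers structured conditions on top of vector similarity search and supports flexible boolean combinations, so you can narrow the search scope much as you would in a database query.

This article walks through Qdrant's filtering rules and demonstrates must, should, and must_not with Python examples.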

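1. Why Filtering Matters

Vector search on its own ranks results by vector similarity only, but real applications usually need additional constraints:

- E-commerce: show only laptops priced below 1000 yuan
- Recruiting: match only candidates with at least 3 years of experience
- Map search: return only restaurants in the current city

Qdrant's filtering exists to express exactly these structured conditions.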

2. The Three Core Clauses

2.1 must: conditions that must all hold (AND)

Definition: every condition in the list must be satisfied.
Logic: equivalent to AND.

JSON example:

"filter": {
  "must": [
    { "key": "city", "match": { "value": "London" } },
    { "key": "price", "range": { "lte": 1000 } },
    { "key": "brand", "match": { "value": "Apple" } }
  ]
}

Explanation: only points with city = London and price ≤ 1000 and brand = Apple are returned.
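To make the AND semantics concrete, here is a minimal pure-Python sketch that mimics how a must clause selects points. It does not call Qdrant at all; the sample payloads and the `passes_must` helper are hypothetical and exist only for illustration.

```python
# Hypothetical payloads, invented for illustration only.
products = [
    {"id": 1, "city": "London", "price": 899, "brand": "Apple"},
    {"id": 2, "city": "London", "price": 1299, "brand": "Apple"},
    {"id": 3, "city": "Berlin", "price": 799, "brand": "Apple"},
    {"id": 4, "city": "London", "price": 650, "brand": "Asus"},
]

# One predicate per condition, mirroring the three conditions in the filter.
must_conditions = [
    lambda p: p["city"] == "London",
    lambda p: p["price"] <= 1000,
    lambda p: p["brand"] == "Apple",
]


def passes_must(payload, conditions):
    """Return True only if every condition holds (logical AND)."""
    return all(cond(payload) for cond in conditions)


matching_ids = [p["id"] for p in products if passes_must(p, must_conditions)]
print(matching_ids)  # [1]
```

Only point 1 survives: point 2 is too expensive, point 3 is in the wrong city, and point 4 is the wrong brand. Failing any single condition is enough to drop a point.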

Python example:

from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

# Connect to a local Qdrant instance.
client = QdrantClient("localhost", port=6333)

# Note: recent qdrant-client releases deprecate search() in favor of query_points().
search_result = client.search(
    collection_name="products",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    limit=5,
    # All three conditions must hold (AND).
    query_filter=Filter(
        must=[
            FieldCondition(key="city", match=MatchValue(value="London")),
            FieldCondition(key="price", range=Range(lte=1000)),
            FieldCondition(key="brand", match=MatchValue(value="Apple")),
        ]
    ),
)
print(search_result)

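2.2 should: at least one condition must hold (OR)

Definition:
- Without must: at least one should condition has to hold (OR logic).
- With must: the clauses are combined with AND, so a point has to satisfy every must condition and at least one should condition. Filtering does not change the similarity score, so should does not push matching results higher in the ranking.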

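JSON example: must + should

"filter": {
  "must": [
    { "key": "city", "match": { "value": "London" } }
  ],
  "should": [
    { "key": "brand", "match": { "value": "Apple" } }
  ]
}

Explanation: the point must be in London and, because should lists a single condition, must also be an Apple product. Non-Apple points are filtered out rather than ranked lower.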
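The interaction between must and should is easy to get wrong, so the following pure-Python sketch spells it out. It assumes Qdrant's documented behavior that filter clauses are combined with AND and that filtering never changes the similarity score. The payloads and the `passes` helper are hypothetical.

```python
# Hypothetical payloads, invented for illustration only.
products = [
    {"id": 1, "city": "London", "brand": "Apple"},
    {"id": 2, "city": "London", "brand": "Asus"},
    {"id": 3, "city": "Berlin", "brand": "Apple"},
    {"id": 4, "city": "Berlin", "brand": "Asus"},
]


def in_london(payload):
    return payload["city"] == "London"


def is_apple(payload):
    return payload["brand"] == "Apple"


def passes(payload, must=(), should=()):
    """must: every condition holds; should: at least one holds, if any are given."""
    must_ok = all(cond(payload) for cond in must)
    should_ok = (not should) or any(cond(payload) for cond in should)
    return must_ok and should_ok


# must + should: in London AND (Apple).
both = [p["id"] for p in products if passes(p, must=[in_london], should=[is_apple])]
# should only: Apple OR in London.
either = [p["id"] for p in products if passes(p, should=[is_apple, in_london])]

print(both)    # [1]
print(either)  # [1, 2, 3]
```

If the goal is to prefer Apple products without excluding others, a filter is the wrong tool under these assumptions; the preference would have to come from the scoring side instead.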

Python example:

# Reuses client and the imports from section 2.1; no extra import is needed for should.
search_result = client.search(
    collection_name="products",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    limit=5,
    query_filter=Filter(
        # Every must condition has to hold.
        must=[FieldCondition(key="city", match=MatchValue(value="London"))],
        # At least one should condition has to hold.
        should=[FieldCondition(key="brand", match=MatchValue(value="Apple"))],
    ),
)

2.3 must_not: exclusion conditions (NOT)

Definition: none of the conditions in the list may hold.
Logic: equivalent to NOT (A OR B OR ...).

JSON example:

"filter": {
  "must_not": [
    { "key": "brand", "match": { "value": "Asus" } }
  ]
}

Explanation: points whose brand is Asus are excluded.
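With more than one condition, must_not rejects a point as soon as any of them matches. A small pure-Python sketch of that rule follows; the payloads, the second excluded brand, and the `passes_must_not` helper are hypothetical.

```python
# Hypothetical payloads, invented for illustration only.
products = [
    {"id": 1, "brand": "Apple"},
    {"id": 2, "brand": "Asus"},
    {"id": 3, "brand": "Lenovo"},
]

# Two exclusion conditions: NOT (brand == Asus OR brand == Lenovo).
must_not_conditions = [
    lambda p: p["brand"] == "Asus",
    lambda p: p["brand"] == "Lenovo",
]


def passes_must_not(payload, conditions):
    """Return True only if no condition holds."""
    return not any(cond(payload) for cond in conditions)


kept_ids = [p["id"] for p in products if passes_must_not(p, must_not_conditions)]
print(kept_ids)  # [1]
```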

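Python example:

# Reuses client and the imports from section 2.1.
search_result = client.search(
    collection_name="products",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    limit=5,
    # Drop every point whose brand is Asus.
    query_filter=Filter(
        must_not=[FieldCondition(key="brand", match=MatchValue(value="Asus"))]
    ),
)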

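3. Advanced Usage: min_should

min_should requires that at least min_count of the conditions it lists are satisfied. The conditions go inside the min_should object itself, next to min_count, rather than in a separate should list.

JSON example: at least 2 of the 3 features

"filter": {
  "min_should": {
    "conditions": [
      { "key": "feature", "match": { "value": "touchscreen" } },
      { "key": "feature", "match": { "value": "ssd" } },
      { "key": "feature", "match": { "value": "backlit_keyboard" } }
    ],
    "min_count": 2
  }
}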
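The counting rule behind min_should can be sketched in a few lines of plain Python. This assumes the `feature` payload field stores an array of strings and that a match condition on an array field succeeds when any element equals the value. The payloads and the `passes_min_should` helper are hypothetical.

```python
# Hypothetical payloads; "feature" is assumed to be stored as an array.
laptops = [
    {"id": 1, "feature": ["touchscreen", "ssd"]},
    {"id": 2, "feature": ["ssd"]},
    {"id": 3, "feature": ["touchscreen", "ssd", "backlit_keyboard"]},
    {"id": 4, "feature": ["backlit_keyboard", "hdd"]},
]

wanted = ["touchscreen", "ssd", "backlit_keyboard"]


def passes_min_should(payload, values, min_count):
    """Count how many wanted values the payload has and compare with min_count."""
    matched = sum(1 for value in values if value in payload["feature"])
    return matched >= min_count


ids = [p["id"] for p in laptops if passes_min_should(p, wanted, min_count=2)]
print(ids)  # [1, 3]
```

Setting min_count to 1 reproduces plain should (OR), and setting it to the number of conditions reproduces must (AND), so min_should covers everything in between.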

Python example:

from qdrant_client.models import Filter, FieldCondition, MatchValue, MinShould

# Reuses client from section 2.1.
search_result = client.search(
    collection_name="products",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    limit=5,
    query_filter=Filter(
        # MinShould carries its own condition list plus the required count.
        min_should=MinShould(
            conditions=[
                FieldCondition(key="feature", match=MatchValue(value="touchscreen")),
                FieldCondition(key="feature", match=MatchValue(value="ssd")),
                FieldCondition(key="feature", match=MatchValue(value="backlit_keyboard")),
            ],
            min_count=2,
        )
    ),
)
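To see how the clauses combine in a single filter, the sketch below evaluates a JSON-style filter against one payload in plain Python. It is a simplified model for sanity-checking filters offline, not Qdrant's implementation: it handles only match and range conditions, assumes all clauses are combined with AND, and assumes min_should carries its own conditions list. `check_condition` and `passes_filter` are hypothetical helper names.

```python
def check_condition(payload, cond):
    """Evaluate one match/range condition; array fields match if any element does."""
    stored = payload.get(cond["key"])
    values = stored if isinstance(stored, list) else [stored]
    if "match" in cond:
        return any(v == cond["match"]["value"] for v in values)
    if "range" in cond:
        r = cond["range"]

        def in_range(v):
            if not isinstance(v, (int, float)):
                return False
            return (
                ("gt" not in r or v > r["gt"])
                and ("gte" not in r or v >= r["gte"])
                and ("lt" not in r or v < r["lt"])
                and ("lte" not in r or v <= r["lte"])
            )

        return any(in_range(v) for v in values)
    raise ValueError(f"unsupported condition: {cond}")


def passes_filter(payload, flt):
    """AND together the must, should, must_not, and min_should clauses."""
    must_ok = all(check_condition(payload, c) for c in flt.get("must", []))
    should = flt.get("should", [])
    should_ok = not should or any(check_condition(payload, c) for c in should)
    must_not_ok = not any(check_condition(payload, c) for c in flt.get("must_not", []))
    min_should_ok = True
    min_should = flt.get("min_should")
    if min_should:
        hits = sum(check_condition(payload, c) for c in min_should["conditions"])
        min_should_ok = hits >= min_should["min_count"]
    return must_ok and should_ok and must_not_ok and min_should_ok


flt = {
    "must": [
        {"key": "city", "match": {"value": "London"}},
        {"key": "price", "range": {"lte": 1000}},
    ],
    "must_not": [{"key": "brand", "match": {"value": "Asus"}}],
    "min_should": {
        "conditions": [
            {"key": "feature", "match": {"value": "touchscreen"}},
            {"key": "feature", "match": {"value": "ssd"}},
            {"key": "feature", "match": {"value": "backlit_keyboard"}},
        ],
        "min_count": 2,
    },
}

good = {"city": "London", "price": 950, "brand": "Apple", "feature": ["ssd", "touchscreen"]}
bad = {"city": "London", "price": 950, "brand": "Asus", "feature": ["ssd", "touchscreen"]}

print(passes_filter(good, flt))  # True
print(passes_filter(bad, flt))   # False
```

The second payload differs only in brand, which is enough for must_not to reject it even though every other clause passes.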