ES02-Common APIs
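1-Reference links
- Elasticsearch official site: https://www.elastic.co/
- Elasticsearch 9.0.0 installation guide: https://zhuanlan.zhihu.com/p/1920780524991017021
- Elasticsearch 9.0.0 installation guide: http://www.rhkb.cn/news/51907.html
2-Summary of topics
- 1) Index operations: create, delete, update, query
- 2) Document operations: create, delete, update, query
- 3) Query operations
- 4) Aggregations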
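3-Index operations
This section covers the core index-level operations in Elasticsearch, with complete request examples that can be pasted into Kibana Dev Tools or adapted for curl or any other HTTP client. All examples assume an ES 8.x node reachable at http://localhost:9200 without authentication. Note that 8.x enables security (TLS and authentication) by default, so either disable it on a local test node or substitute your own address, username, password, and CA certificate.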
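1-Create Index
Key points
• Index names must be lowercase and must not contain \, /, *, ?, ", <, >, |, spaces, commas, or #.
• If you index a document into an index that does not exist, ES creates the index automatically (auto-create), applying any matching index template and otherwise falling back to dynamic mapping.
• Mappings, settings, and aliases can all be specified in the same create request.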
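Example: explicitly create an index with shards, replicas, an IK Chinese analyzer, a mapping, and an alias. The ik_max_word tokenizer requires the analysis-ik plugin, and "dynamic": "strict" makes ES reject documents that contain unmapped fields.
PUT /shop_v1
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "ik_max_custom": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_custom",
        "fields": { "keyword": { "type": "keyword" } }
      },
      "price": { "type": "double" },
      "created": { "type": "date" }
    }
  },
  "aliases": { "shop": {} }
}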
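The naming rules listed under the key points can be checked before a create request is sent. The following is a minimal sketch, not part of Elasticsearch or its clients: is_valid_index_name is a hypothetical helper, and the rules beyond the character list above (no leading -, _, or +, no ":" character, not "." or "..", at most 255 bytes) are included on the assumption that they match the current Elasticsearch documentation.

```python
# Hypothetical helper: a client-side pre-check of index names.
# The forbidden set covers the characters named in the text plus "\" and ":".
FORBIDDEN_CHARS = set('\\/*?"<>| ,#:')


def is_valid_index_name(name: str) -> bool:
    """Return True if the name passes the naming rules sketched above."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():  # must be lowercase
        return False
    if name[0] in "-_+":  # must not start with -, _ or +
        return False
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return len(name.encode("utf-8")) <= 255  # length limit is in bytes


if __name__ == "__main__":
    for candidate in ["shop_v1", "Shop", "shop v1", "_shop", "logs-000001"]:
        print(candidate, is_valid_index_name(candidate))
```

The server remains the authority: a name that passes this check can still be rejected, for example because the index already exists.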
2-Get Index
GET /shop_v1
Returns the full metadata (settings, mappings, aliases). You can also fetch only one part:
GET /shop_v1/_settings
GET /shop_v1/_mapping
GET /shop_v1/_alias
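3-Delete Index
DELETE /shop_v1
This is destructive and cannot be undone. In production, restrict it with least-privilege roles and let index lifecycle management (ILM) handle routine deletion.
4-Index Exists
HEAD /shop_v1
Returns 200 if the index exists and 404 if it does not. Useful as a precondition check in scripts.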
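5-Close / Open Index
POST /shop_v1/_close
POST /shop_v1/_open
A closed index rejects reads and writes and uses very little memory, although its data still occupies disk space. This is useful for temporarily taking historical indices offline.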
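6-Refresh
POST /shop_v1/_refresh
Writes the in-memory indexing buffer to a new segment so that recent documents become searchable. During heavy bulk indexing you can set index.refresh_interval to -1 to disable automatic refresh, then refresh manually (and restore the interval) when the load finishes.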
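7-Flush (commit segments to disk and trim the translog)
POST /shop_v1/_flush
Performs a Lucene commit so that the operations held in the translog are persisted in segments, then trims the translog. A manual flush before a planned restart or snapshot can shorten recovery, since less translog has to be replayed.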
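8-Force Merge (merge segments to save disk and speed up queries)
POST /shop_v1/_forcemerge?max_num_segments=1
Merges the segments of each shard down to one, which reduces open file handles and reclaims the space held by deleted documents. Use it on indices that no longer receive writes, such as large read-only indices.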
9-Empty an index (Delete By Query)
POST /shop_v1/_delete_by_query
{
  "query": { "match_all": {} }
}
Note: this removes the documents only; mappings and settings are kept.
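10-Clone Index
Use case: you need a quick copy of an existing index (for example to try out setting changes) and the index is too large to reindex comfortably. A clone keeps the mappings and the shard count of the source.
Prerequisites: the source index must be write-blocked (read-only) and the cluster health must be green.
PUT /shop_v1/_settings
{
  "settings": { "index.blocks.write": true }
}
POST /shop_v1/_clone/shop_v2
{
  "settings": { "index.number_of_replicas": 0 }
}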
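11-Shrink Index
Reduces the number of primary shards, for example from 3 (as shop_v1 was created above) to 1, to save resources. The target shard count must be a factor of the source shard count.
Prerequisites: the source index must be write-blocked, a copy of every shard must reside on the same node, and the cluster health must be green.
POST /shop_v1/_shrink/shop_v1_shrinked
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  },
  "aliases": { "shop_read": {} }
}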
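12-Split Index
The opposite of shrink: increases the number of primary shards. The target count must be a multiple of the source count and a factor of index.number_of_routing_shards, and the source index must be write-blocked first.
POST /shop_v1/_split/shop_v1_split
{
  "settings": { "index.number_of_shards": 6 }
}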
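13-Rollover (roll an alias over to a new index)
Use case: create a new index once the current one reaches a given age or document count (typical for logs).
First create the initial index:
PUT /logs-000001
{
  "aliases": { "logs_write": {} }
}
Then run:
POST /logs_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000000
  }
}
The conditions are evaluated only when this request is made (or when ILM issues it on your behalf). If either condition is met, ES creates logs-000002 and points the alias logs_write at the new index.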
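To make the rollover behaviour concrete, here is a small offline sketch of the two decisions involved: whether any condition is met, and what the next index is called. Both functions are hypothetical illustrations rather than client API calls; the six-digit zero padding follows the documented rollover naming convention, and the thresholds mirror the request above.

```python
import re


def next_rollover_name(current: str) -> str:
    """Increment the numeric suffix and zero-pad it to six digits."""
    match = re.fullmatch(r"(.*-)(\d+)", current)
    if match is None:
        raise ValueError("index name must end with '-' followed by digits")
    prefix, number = match.groups()
    return f"{prefix}{int(number) + 1:06d}"


def should_roll(age_days: float, doc_count: int,
                max_age_days: float = 7, max_docs: int = 1_000_000) -> bool:
    """Rollover happens as soon as ANY condition is met, not all of them."""
    return age_days >= max_age_days or doc_count >= max_docs


if __name__ == "__main__":
    if should_roll(age_days=8, doc_count=120_000):
        print(next_rollover_name("logs-000001"))
```

Nothing here runs on a schedule, which matches the API: a rollover request has to be issued by a client or by ILM for the conditions to be checked at all.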
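14-Index Template and Data Stream
Example: create a template that matches logs-* and enables data streams.
PUT _index_template/logs_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": { "number_of_shards": 1 },
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" }
      }
    }
  },
  "data_stream": {}
}
After that, PUT _data_stream/logs-app creates the data stream. Because a data stream template only allows data streams to be created under its pattern, choose a pattern that does not also match regular indices such as the logs-000001 rollover index above.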
15-Alias management
Swapping an alias atomically from one index to another allows zero-downtime reindexing.
POST _aliases
{
  "actions": [
    { "remove": { "index": "shop_v1", "alias": "shop" } },
    { "add": { "index": "shop_v2", "alias": "shop" } }
  ]
}
16-Update dynamic settings
PUT /shop_v1/_settings
{
  "index": {
    "number_of_replicas": 2,
    "refresh_interval": "30s"
  }
}
17-Test an analyzer (Analyze API)
POST /shop_v1/_analyze
{
  "analyzer": "ik_max_custom",
  "text": "小米手机"
}
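18-Index statistics (Stats)
GET /shop_v1/_stats
GET /_stats/store,docs
19-Index recovery (Recovery)
GET /shop_v1/_recovery
Shows the progress of shard recoveries between nodes.
20-Shard stores (Shard Stores)
GET /shop_v1/_shard_stores
Helps troubleshoot shard allocation problems.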
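Summary
These are the index-level operations used most often in day-to-day development and operations.
• The main path is create → inspect → adjust → clean up.
• Advanced features (clone, shrink, split, rollover, data streams) can be introduced gradually as the workload grows.
• Every example can be pasted into Dev Tools, or sent with curl -X <METHOD> 'http://localhost:9200/<path>' -H 'Content-Type: application/json' -d '<body>'.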
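For scripted use, the same requests can be assembled with Python's standard library instead of curl. This is a sketch under the assumption of a local, unauthenticated node at http://localhost:9200; build_request is a hypothetical helper name, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local test node, security disabled


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request equivalent to a Dev Tools snippet."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_request("PUT", "/shop_v1/_settings",
                        {"index": {"number_of_replicas": 2}})
    print(req.get_method(), req.full_url)
    # To actually send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status, resp.read().decode("utf-8"))
```

A secured cluster would additionally need an Authorization header and a TLS context, which are left out of this sketch.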
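4-Document operations
This section covers the core document-level operations in Elasticsearch, with complete JSON request examples that can be pasted into Kibana Dev Tools or adapted for curl or Postman.
All examples target http://localhost:9200 and use the index name shop. Unless noted otherwise, successful requests return 201 (created) or 200.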
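1-Create Document
• With an explicit ID. PUT on _doc creates the document, or overwrites it if the ID already exists; to get a 409 instead of an overwrite, use PUT /shop/_create/1001.
PUT /shop/_doc/1001
{
  "title": "小米手机 14 Pro",
  "price": 4999,
  "tags": ["手机","小米"],
  "created": "2024-08-27T12:00:00Z"
}
• Without an ID (ES generates a 20-character string ID)
POST /shop/_doc
{
  "title": "iPhone 15",
  "price": 5999
}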
2-Get Document
GET /shop/_doc/1001
Variants:
GET /shop/_source/1001 (returns only _source, without metadata)
GET /shop/_doc/1001?_source=title,price&pretty=true
3-Document Exists (HEAD)
HEAD /shop/_doc/1001
Returns 200 if the document exists and 404 if it does not.
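4-Replace a whole document (Index API overwrite)
PUT /shop/_doc/1001
{
  "title": "小米手机 14 Pro 钛金属版",
  "price": 5299,
  "tags": ["手机","小米"],
  "created": "2024-08-27T12:00:00Z"
}
Note: the whole document is replaced, so any field missing from the new body is removed.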
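5-Partial update (Update API)
POST /shop/_update/1001
{
  "doc": { "price": 5099, "stock": 100 }
}
• Fields in doc are merged into the existing document; fields that do not exist yet are added.
• Scripted updates are also supported (see 5.1).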
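The difference between replacing a document (section 4) and merging a partial doc (section 5) can be illustrated on plain dictionaries. This is an offline sketch of the semantics only: full_replace and partial_update are hypothetical names, and the recursive merge of nested objects reflects how the Update API is documented to merge a partial doc (arrays and scalars are overwritten).

```python
import copy


def full_replace(stored, new_doc):
    """Index API semantics: the new body becomes the whole document."""
    return copy.deepcopy(new_doc)


def partial_update(stored, partial):
    """Update API 'doc' semantics: merge objects recursively, overwrite the rest."""
    merged = copy.deepcopy(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = partial_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    stored = {"title": "小米手机 14 Pro", "price": 4999, "tags": ["手机", "小米"]}
    print(full_replace(stored, {"price": 5099}))        # title and tags are gone
    print(partial_update(stored, {"price": 5099, "stock": 100}))  # title and tags kept
```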
5.1-Script Update
POST /shop/_update/1001
{
  "script": {
    "source": "ctx._source.price -= params.discount",
    "params": { "discount": 200 }
  }
}
5.2-Upsert (insert if the document does not exist)
POST /shop/_update/1002
{
  "doc": { "price": 3999 },
  "doc_as_upsert": true
}
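6-Delete Document
DELETE /shop/_doc/1001
7-Bulk write (_bulk)
A single request can mix index, create, update, and delete actions. The body is newline-delimited JSON: one action line, followed by a source line for every action except delete, and the last line must end with a newline.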
POST /_bulk
{"index":{"_index":"shop","_id":"2001"}}
{"title":"华为 Mate 60","price":6499}
{"create":{"_index":"shop","_id":"2002"}}
{"title":"OPPO Find X6","price":4999}
{"update":{"_id":"1001","_index":"shop"}}
{"doc":{"stock":88}}
{"delete":{"_id":"2000","_index":"shop"}}
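Building the newline-delimited body by hand is error-prone, so the following sketch assembles it from a list of operations. build_bulk_body is a hypothetical helper, not a client library function; it only produces the request body string, which would then be sent to POST /_bulk with the Content-Type application/x-ndjson (or application/json).

```python
import json


def build_bulk_body(operations):
    """operations: list of (action, metadata, source) tuples; source is None for delete."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if action == "delete":
            continue  # delete has no source line
        if source is None:
            raise ValueError(f"{action} needs a source line")
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body([
        ("index", {"_index": "shop", "_id": "2001"}, {"title": "华为 Mate 60", "price": 6499}),
        ("create", {"_index": "shop", "_id": "2002"}, {"title": "OPPO Find X6", "price": 4999}),
        ("update", {"_index": "shop", "_id": "1001"}, {"doc": {"stock": 88}}),
        ("delete", {"_index": "shop", "_id": "2000"}, None),
    ])
    print(body, end="")
```

Note that the response to a bulk request is 200 even when individual items fail, so the per-item status in the response still has to be inspected.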
8-Multi get (_mget)
POST /shop/_mget
{
  "ids": ["1001","2001","9999"]
}
Returns a docs array in the same order as the request; IDs that are not found are marked with "found": false.
9-Update / delete by query (_update_by_query / _delete_by_query)
9.1-Update by query
POST /shop/_update_by_query
{
  "script": { "source": "ctx._source.price *= 0.9" },
  "query": {
    "range": { "price": { "gte": 5000 } }
  }
}
9.2-Delete by query
POST /shop/_delete_by_query
{
  "query": { "term": { "title.keyword": "iPhone 15" } }
}
10-Reindex (copy or transform documents across indices)
POST _reindex
{
  "source": {
    "index": "shop",
    "query": { "range": { "price": { "gte": 4000 } } }
  },
  "dest": { "index": "shop_v2" }
}
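11-Optimistic concurrency control (_seq_no/_primary_term)
GET /shop/_doc/1001 → returns, for example, _seq_no=12 and _primary_term=1
PUT /shop/_doc/1001?if_seq_no=12&if_primary_term=1
{ "price": 4799 }
If another client modified the document in the meantime, ES returns 409 and the client must re-read the document and retry. Note that this PUT replaces the whole document, so in practice send the full source (or use POST /shop/_update/1001 with the same two parameters).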
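The read, conditional write, retry-on-409 cycle can be sketched without a cluster. The code below is an in-memory simulation, not an Elasticsearch client: FakeIndex, VersionConflict, and update_price are hypothetical names, and _primary_term is left out for brevity (a real request passes both if_seq_no and if_primary_term).

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 returned on a failed conditional write."""


class FakeIndex:
    """Tiny in-memory stand-in that tracks a sequence number per document."""

    def __init__(self):
        self._docs = {}
        self._next_seq_no = 0

    def get(self, doc_id):
        source, seq_no = self._docs[doc_id]
        return dict(source), seq_no

    def index(self, doc_id, source, if_seq_no=None):
        current = self._docs.get(doc_id)
        if if_seq_no is not None and (current is None or current[1] != if_seq_no):
            raise VersionConflict(f"document {doc_id} changed since it was read")
        seq_no = self._next_seq_no
        self._docs[doc_id] = (dict(source), seq_no)
        self._next_seq_no += 1
        return seq_no


def update_price(index, doc_id, change, max_retries=3):
    """Re-read and retry until the conditional write succeeds."""
    for _ in range(max_retries):
        source, seq_no = index.get(doc_id)
        source["price"] = change(source["price"])
        try:
            return index.index(doc_id, source, if_seq_no=seq_no)
        except VersionConflict:
            continue  # somebody else wrote first: read again and retry
    raise VersionConflict(f"gave up on {doc_id} after {max_retries} attempts")


if __name__ == "__main__":
    shop = FakeIndex()
    shop.index("1001", {"title": "小米手机 14 Pro", "price": 4999})
    update_price(shop, "1001", lambda price: price - 200)
    print(shop.get("1001"))
```

The important design point is that the new value is recomputed from a fresh read on every attempt, so a retry never overwrites another writer's change.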
12-Count documents (_count)
GET /shop/_count
{ "query": { "match_all": {} } }
13-Term Vectors (inspect tokens and term statistics)
GET /shop/_termvectors/1001
{
  "fields": ["title"],
  "term_statistics": true
}
14-Explain
GET /shop/_explain/1001
{
  "query": { "match": { "title": "小米" } }
}
Returns the scoring details of this document for the given query.
15-Source filtering when fetching a document (_source)
GET /shop/_source/1001?_source=title,price&pretty=true
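16-Real-time get vs. refresh visibility
• GET by ID is real-time: it returns the latest version even before a refresh.
• Search is near real-time: with the default refresh_interval of 1s, a new document becomes searchable within about 1s of being written.
• To make a write visible to search right away, append ?refresh=wait_for (wait for the next refresh) or ?refresh=true (force a refresh) to the write URL.
POST /shop/_doc?refresh=wait_for
{"title":"一加 12"}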
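17-Version (_version) and external versioning
PUT /shop/_doc/3001?version=5&version_type=external
{"title":"vivo X100"}
The write succeeds only if the supplied external version is greater than the stored _version; otherwise ES returns 409.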
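18-Routing
A custom routing key keeps all documents of the same seller on the same shard:
PUT /shop/_doc/4001?routing=sellerA
{"title":"Redmi Note 13","seller":"sellerA"}
Later get, update, and delete requests for this document must pass the same routing value. Searches work without it, but passing routing=sellerA limits the search to that shard.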
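Why a by-ID request needs the same routing value becomes clearer with a toy model of shard selection. This sketch is an assumption-laden simplification: Elasticsearch hashes the routing value with Murmur3 and applies a routing-shards factor, whereas the code below substitutes zlib.crc32 and a plain modulo, so its shard numbers will not match a real cluster. The function names are hypothetical.

```python
import zlib


def pick_shard(routing_value: str, number_of_shards: int) -> int:
    """Map a routing value to a shard number (stand-in hash, not Murmur3)."""
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


def shard_for_document(doc_id: str, number_of_shards: int, routing=None) -> int:
    """Without an explicit routing value, the document ID is used for routing."""
    key = routing if routing is not None else doc_id
    return pick_shard(key, number_of_shards)


if __name__ == "__main__":
    shards = 3
    # Two documents of the same seller always land on the same shard.
    print(shard_for_document("4001", shards, routing="sellerA"))
    print(shard_for_document("4002", shards, routing="sellerA"))
    # A GET without routing looks where the ID alone hashes to,
    # which may be a different shard, so the document may not be found.
    print(shard_for_document("4001", shards))
```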
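Summary
Day-to-day document operations can be grouped as:
CRUD → bulk → update/delete by query → copy/reindex → concurrency control → routing
Working through the 18 examples above covers the large majority of everyday use cases.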
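5-Query operations
This section covers the core capabilities of the Elasticsearch query layer.
All examples can be pasted into Kibana Dev Tools or adapted for curl. The default index is shop, and the fields follow the earlier documents (title: text, price: double, tags: keyword, created: date, and so on). Unless noted otherwise, examples use the Query DSL (JSON request body); a body shown without a request line is sent to GET /shop/_search. If tags was instead mapped dynamically (text with a keyword sub-field), use tags.keyword for exact matching and aggregations.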
1. Basic query structure
GET /shop/_search
{
  "query": {...},     // query conditions
  "_source": [...],   // which fields to return
  "sort": [...],      // sort order
  "from": 0,          // pagination offset
  "size": 10          // page size
}
2. Match all and pagination
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 20
}
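3. Full-text search
- match (the text is analyzed and the resulting terms are combined with OR)
{
  "query": { "match": { "title": "小米手机" } }
}
- match_phrase (the terms must appear as a phrase, in order)
{
  "query": { "match_phrase": { "title": "小米 14" } }
}
- multi_match (the same keywords across several fields)
{
  "query": {
    "multi_match": {
      "query": "小米",
      "fields": ["title^2", "tags"]   // ^2 boosts the title field
    }
  }
}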
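4. Exact matching
- term / terms (exact values of keyword, numeric, and date fields)
{ "query": { "term": { "price": 4999 } } }
{ "query": { "terms": { "tags": ["小米","手机"] } } }
- range (a range query takes a single field, so several ranges are combined with bool)
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "price": { "gte": 3000, "lte": 6000 } } },
        { "range": { "created": { "gte": "2024-08-01", "lte": "now/d" } } }
      ]
    }
  }
}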
5. Boolean queries
{
  "query": {
    "bool": {
      "must": [ { "match": { "title": "小米" } } ],
      "filter": [ { "range": { "price": { "gte": 4000 } } } ],
      "should": [ { "term": { "tags": "新品" } } ],
      "must_not": [ { "term": { "tags": "下架" } } ]
    }
  }
}
must and should clauses contribute to the score; filter and must_not only include or exclude documents, do not affect the score, and can be cached.
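6. Exists / missing
{ "query": { "exists": { "field": "stock" } } }
To find documents where the field is missing, wrap the exists query in bool.must_not.
7. Prefix / wildcard / regexp
These queries match individual indexed terms and do not analyze the pattern, so the examples target the title.keyword sub-field, which holds the original, case-preserved value.
{ "query": { "prefix": { "title.keyword": "iPh" } } }
{ "query": { "wildcard": { "title.keyword": "iPhon*" } } }
{ "query": { "regexp": { "title.keyword": "iPhon.*" } } }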
8. Fuzzy matching (typo tolerance)
{
  "query": {
    "fuzzy": {
      "title": { "value": "xiami", "fuzziness": 2 }
    }
  }
}
9. IDs query
{
  "query": { "ids": { "values": ["1001","1002"] } }
}
10. Nested query (nested objects)
Assume the mapping contains:
"comments": {
  "type": "nested",
  "properties": {
    "user": { "type": "keyword" },
    "score": { "type": "byte" }
  }
}
Find documents that have a comment with user = "alice" and score = 5:
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "bool": {
          "must": [
            { "term": { "comments.user": "alice" } },
            { "term": { "comments.score": 5 } }
          ]
        }
      }
    }
  }
}
11. Highlighting
GET /shop/_search
{
  "query": { "match": { "title": "小米" } },
  "highlight": {
    "fields": { "title": {} }
  }
}
Each hit then carries a highlight field in which the matched terms are wrapped in <em> tags.
12. Sorting
- Sort by regular fields
"sort": [
  { "price": { "order": "desc" } },
  { "_score": { "order": "desc" } }
]
- Sort by distance (geo_point example)
"sort": [
  {
    "_geo_distance": {
      "location": { "lat": 31.2, "lon": 121.5 },
      "order": "asc",
      "unit": "km"
    }
  }
]
13. Aggregations
- Metric aggregations (max/min/avg/sum/stats)
GET /shop/_search
{
  "size": 0,
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "max_price": { "max": { "field": "price" } }
  }
}
- Bucket aggregation (terms grouping)
{
  "size": 0,
  "aggs": {
    "by_tag": {
      "terms": { "field": "tags", "size": 10 }
    }
  }
}
- Range buckets
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 3000 },
          { "from": 3000, "to": 5000 },
          { "from": 5000 }
        ]
      }
    }
  }
}
- Sub-aggregation (group first, then average within each group)
{
  "size": 0,
  "aggs": {
    "by_tag": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
14. Search Template
Create the template:
POST _scripts/price_template
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "range": {
          "price": { "gte": "{{min_price}}", "lte": "{{max_price}}" }
        }
      }
    }
  }
}
Use the template:
GET /shop/_search/template
{
  "id": "price_template",
  "params": { "min_price": 3000, "max_price": 6000 }
}
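15. Scroll / Search After (deep pagination)
- Scroll (works on a snapshot of the index; suited to offline exports)
# 1. Initial request
POST /shop/_search?scroll=2m
{
  "size": 100,
  "query": { "match_all": {} }
}
# 2. The response contains a _scroll_id; keep polling with it
POST _search/scroll
{
  "scroll": "2m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gB..."
}
- Search After (live cursor, recommended). search_after takes the sort values of the last hit on the previous page, so the sort needs a unique tiebreaker.
GET /shop/_search
{
  "size": 10,
  "sort": [ { "price": "desc" }, { "_id": "asc" } ],
  "search_after": [ 4999, "1001" ]
}
Sorting on _id depends on fielddata for _id, which recent versions may disable by default; opening a point in time (PIT) and relying on its built-in _shard_doc tiebreaker is the documented alternative.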
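The cursor logic behind search_after can be shown on an in-memory list. This is a simulation under stated assumptions, not a client call: search and iterate_all are hypothetical functions, documents are plain dictionaries with a numeric price and a string id, and the sort mirrors the example above (price descending, then id ascending as the tiebreaker).

```python
def _sort_key(doc):
    # price descending, id ascending
    return (-doc["price"], doc["id"])


def search(docs, size, search_after=None):
    """Return one page of hits that sort strictly after the given cursor."""
    ordered = sorted(docs, key=_sort_key)
    if search_after is not None:
        price, doc_id = search_after
        ordered = [d for d in ordered if _sort_key(d) > (-price, doc_id)]
    return ordered[:size]


def iterate_all(docs, size):
    """Walk through every hit page by page, carrying the last sort values forward."""
    cursor = None
    while True:
        page = search(docs, size, cursor)
        if not page:
            return
        yield from page
        last = page[-1]
        cursor = (last["price"], last["id"])


if __name__ == "__main__":
    catalog = [
        {"id": "1001", "price": 4999},
        {"id": "2002", "price": 4999},
        {"id": "2001", "price": 6499},
        {"id": "1002", "price": 3999},
    ]
    for hit in iterate_all(catalog, size=2):
        print(hit["id"], hit["price"])
```

Without the id tiebreaker, the two documents priced 4999 would share the same sort value and one of them could be skipped or repeated at a page boundary, which is why the sort must be unique.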
16. Profile & Explain (tuning)
GET /shop/_search
{
  "profile": true,
  "query": { "match": { "title": "小米" } }
}
GET /shop/_explain/1001
{
  "query": { "match": { "title": "小米" } }
}
17. Count API (count only)
GET /shop/_count
{
  "query": { "range": { "price": { "gte": 4000 } } }
}
18. Validate API (check query syntax)
GET /shop/_validate/query?explain=true
{
  "query": { "match": { "title": "小米" } }
}
19. Multi-Search (_msearch)
Runs several searches concurrently in a single request:
GET _msearch
{"index":"shop"}
{"query":{"match":{"title":"小米"}},"size":2}
{"index":"shop"}
{"query":{"range":{"price":{"gte":4000}}},"size":3}
20. Combined example (source filtering, highlighting, sorting, pagination)
GET /shop/_search
{
  "query": {
    "bool": {
      "must": { "multi_match": { "query": "手机", "fields": ["title","tags"] } },
      "filter": { "range": { "price": { "gte": 3000, "lte": 6000 } } }
    }
  },
  "_source": ["title","price","tags"],
  "sort": [
    { "created": { "order": "desc" } },
    { "_score": { "order": "desc" } }
  ],
  "from": 0,
  "size": 20,
  "highlight": { "fields": { "title": {} } }
}
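Summary
| Scenario | Keywords / DSL | Notes |
|---|---|---|
| All documents / pagination | match_all + from/size | Simple queries |
| Full text | match, match_phrase, multi_match | Uses the inverted index of analyzed text |
| Exact | term, terms, range | keyword / numeric / date fields |
| Combination | bool (must/filter/should/must_not) | Most commonly used |
| Nested | nested | Nested sub-documents |
| Aggregations | aggs | Metrics + buckets |
| Highlighting | highlight | Marks matches in search results |
| Deep pagination | scroll, search_after | Avoids the cost of large from/size offsets |
Working through the 20 groups of examples above covers the large majority of everyday ES query needs.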
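6-Aggregation operations
This section covers 10 groups of commonly used aggregations in Elasticsearch.
All examples can be run in Kibana Dev Tools or adapted for curl; a body shown without a request line is sent to GET /shop/_search. The index is still shop, with these example fields:
title: text, price: double, tags: keyword, created: date, seller_id: keyword, location: geo_point, comments: nested.
If tags was instead mapped dynamically (text with a keyword sub-field), aggregate on tags.keyword.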
1. Metric aggregations
1.1 Single-value: avg / max / min / sum / value_count
GET /shop/_search
{
  "size": 0,
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "max_price": { "max": { "field": "price" } },
    "price_count": { "value_count": { "field": "price" } }
  }
}
value_count counts the values of an ordinary field such as price; the _id metadata field is restricted from use in aggregations.
1.2 Multi-value: stats / extended_stats
{
  "size": 0,
  "aggs": {
    "price_stats": { "stats": { "field": "price" } }
  }
}
1.3 Percentiles / median
{
  "size": 0,
  "aggs": {
    "price_percentiles": {
      "percentiles": {
        "field": "price",
        "percents": [50, 95, 99]
      }
    }
  }
}
2. Bucket aggregations
2.1 terms grouping (on a keyword field)
{
  "size": 0,
  "aggs": {
    "by_tag": {
      "terms": {
        "field": "tags",
        "size": 10,
        "order": { "_count": "desc" }
      }
    }
  }
}
2.2 range buckets
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 2000, "key": "经济" },
          { "from": 2000, "to": 5000, "key": "中档" },
          { "from": 5000, "key": "高端" }
        ]
      }
    }
  }
}
2.3 date_histogram (time histogram)
{
  "size": 0,
  "aggs": {
    "sales_per_day": {
      "date_histogram": {
        "field": "created",
        "calendar_interval": "1d",
        "format": "yyyy-MM-dd"
      }
    }
  }
}
2.4 geo_distance (distance rings)
{
  "size": 0,
  "aggs": {
    "rings": {
      "geo_distance": {
        "field": "location",
        "origin": { "lat": 31.2, "lon": 121.5 },
        "unit": "km",
        "ranges": [
          { "to": 5 },
          { "from": 5, "to": 20 },
          { "from": 20 }
        ]
      }
    }
  }
}
3. Sub-aggregations (aggregations nested inside buckets)
3.1 Compute a metric inside each bucket
{
  "size": 0,
  "aggs": {
    "by_tag": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
3.2 Top Hits (the latest N documents per group)
{
  "size": 0,
  "aggs": {
    "top_sellers": {
      "terms": { "field": "seller_id" },
      "aggs": {
        "latest": {
          "top_hits": {
            "size": 3,
            "sort": [ { "created": { "order": "desc" } } ],
            "_source": ["title","price"]
          }
        }
      }
    }
  }
}
4. Nested aggregation
4.1 Step into the nested objects first, then aggregate
{
  "size": 0,
  "aggs": {
    "comments": {
      "nested": { "path": "comments" },
      "aggs": {
        "avg_score": { "avg": { "field": "comments.score" } },
        "by_user": {
          "terms": { "field": "comments.user" },
          "aggs": { "avg_score": { "avg": { "field": "comments.score" } } }
        }
      }
    }
  }
}
5. Pipeline aggregations
5.1 bucket_script: divide one metric by another within each bucket
{
  "size": 0,
  "aggs": {
    "by_tag": {
      "terms": { "field": "tags" },
      "aggs": {
        "total_sales": { "sum": { "field": "price" } },
        "avg_price": {
          "bucket_script": {
            "buckets_path": { "sales": "total_sales", "cnt": "_count" },
            "script": "params.sales / params.cnt"
          }
        }
      }
    }
  }
}
6. Sampling (Sampler / Diversified Sampler)
{
  "size": 0,
  "aggs": {
    "sample": {
      "diversified_sampler": {
        "shard_size": 200,
        "field": "seller_id"
      },
      "aggs": {
        "top_tags": { "terms": { "field": "tags" } }
      }
    }
  }
}
7. Cardinality (distinct count)
{
  "size": 0,
  "aggs": {
    "unique_sellers": { "cardinality": { "field": "seller_id" } }
  }
}
The result is approximate (it is based on the HyperLogLog++ algorithm).
8. Matrix stats (correlation between numeric fields)
{
  "size": 0,
  "aggs": {
    "correlations": {
      "matrix_stats": {
        "fields": ["price", "stock"]
      }
    }
  }
}
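9. Ordering and paginating aggregations
9.1 Order by a sub-aggregation result
{
  "size": 0,
  "aggs": {
    "by_seller": {
      "terms": {
        "field": "seller_id",
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
9.2 Composite aggregation (stream through all buckets)
{
  "size": 0,
  "aggs": {
    "seller_date": {
      "composite": {
        "size": 1000,
        "sources": [
          { "seller": { "terms": { "field": "seller_id" } } },
          { "day": { "date_histogram": { "field": "created", "calendar_interval": "1d" } } }
        ]
      }
    }
  }
}
Each response includes an after_key; pass it back as "after" inside the composite block to fetch the next page of buckets.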
10. Filtering hits after aggregation (post_filter)
{
  "size": 10,
  "query": { "range": { "price": { "gte": 1000 } } },
  "aggs": {
    "by_tag": { "terms": { "field": "tags" } }
  },
  "post_filter": { "term": { "seller_id": "sellerA" } }
}
post_filter only narrows the returned hits; it does not change the set of documents the aggregations run on.
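Summary (quick reference)
| Category | Keywords | Typical use |
|---|---|---|
| Metrics | avg/max/min/stats/percentiles/cardinality | Averages, maxima, distinct counts |
| Buckets | terms, range, date_histogram, geo_distance | Grouping, ranges, time histograms, distance rings |
| Sub-aggregations | "aggs": { ... } placed inside another aggregation | Metrics within buckets, Top Hits |
| Pipeline | bucket_script, moving_fn, cumulative_sum | Second-pass calculations across buckets |
| Special | nested, composite, sampler | Nested objects, streaming buckets, sampling for speed |
Working through the 10 groups of examples above covers the large majority of everyday ES aggregation needs.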