
Elasticsearch 2.x Upgrade Guide

news 2025/9/18 11:21:21

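Preface

Having worked with search engines for many years, I have watched Elasticsearch grow from its early releases into what it is today. Many companies still run Elasticsearch 2.x, but as the technology moves on and security requirements tighten, upgrading has become unavoidable. Drawing on real project experience, this guide explains how to move from Elasticsearch 2.x to 6.x or 7.x smoothly and how to avoid the pitfalls along the way.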

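Why upgrade?

Before the technical details, it is worth being clear about why the upgrade matters:

🚀 Performance

Query performance: 7.x can be 2-3 times faster than 2.x, depending on the workload
Indexing speed: the indexing path has been optimized, improving write throughput by 30-50%
Memory usage: more efficient memory management reduces GC pressure

🔒 Security

Built-in security: X-Pack ships with the default distribution from 6.3, and its core security features are free from 6.8 and 7.1
Encrypted communication: TLS/SSL is supported
User permissions: fine-grained access control

🛠️ Features

Machine learning: anomaly detection and forecasting
SQL support: data can be queried with SQL syntax
Cross-cluster replication: data synchronization and disaster recovery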

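Planning the upgrade path

Version upgrade path

Elasticsearch follows one important rule: you cannot jump across major versions in a single step. Each major version can only read indices created in the previous major version, so indices created in 2.x must be reindexed on 5.x before moving on to 6.x. Going from 2.x to 6.x/7.x therefore takes several hops:

2.x → 5.6.x → 6.8.x → 7.17.x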
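
The hop-by-hop rule can be sketched as a small planning helper. This is only an illustration: the function name and the `STOPOVERS` table are hypothetical, and the stop-over releases are simply the ones listed in the path above.

```python
from typing import List

# Hypothetical lookup table: the last release line of each major version,
# taken from the upgrade path quoted in this article.
STOPOVERS = {5: "5.6.x", 6: "6.8.x", 7: "7.17.x"}


def upgrade_path(current_version: str, target_major: int) -> List[str]:
    """Return the releases to pass through, one major version at a time."""
    current_major = int(current_version.split(".")[0])
    # Elasticsearch jumped from 2.x straight to 5.x; there is no 3.x or 4.x.
    next_major = 5 if current_major == 2 else current_major + 1
    path = []
    for major in range(next_major, target_major + 1):
        if major not in STOPOVERS:
            raise ValueError(f"no stop-over release known for {major}.x")
        path.append(STOPOVERS[major])
    return path


print(upgrade_path("2.4.6", 7))  # ['5.6.x', '6.8.x', '7.17.x']
```

Each element of the returned list is one upgrade step that should be completed and verified before starting the next.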

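Choosing a target version

A 2.x cluster can be upgraded to ES 6.x. If you can get to ES 7.x, that is better still.

Upgrading to ES 6.x

Why is ES 6.x recommended?

ES 6.x was first released in 2017. It is a mainstream version that has been proven in production, with a complete feature set and stable, reliable performance.

Upgrading to ES 7.x

Why is ES 7.x recommended?

ES 7.x was first released in 2019 and is likewise a mainstream version. It moves the underlying search library to Lucene 8.x and improves considerably on ES 6.x in memory management and performance.

What makes the move from ES 2.x to ES 7.x difficult?

ES 7.x no longer accommodates indices that contain multiple types. A 2.x index with several types is therefore incompatible and may have to be split or restructured, which can be expensive.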

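Differences between versions

Upgrading to ES 6.x

Only the key differences are listed here; see the official documentation for the full list.

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/breaking-changes-6.0.html

Client differences

Use the official Java High Level REST Client.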

<!-- Elasticsearch High Level REST Client dependency -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.8.23</version>
</dependency>

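Mapping differences

ES 6.x no longer supports multiple types: an index can hold only one. Multi-type indices carried over from an earlier version keep working, but newly created indices must be single-type.
When creating an index in ES 6.x you can use the conventional default type (_doc) or choose a custom type name, but an index can only ever have one type.
The "string" field type is no longer supported. It has been split into text (the former analyzed string) and keyword (the former not_analyzed string).

2.x	6.x	Notes
`"fields": {"type": "string"}`	split into `text` and `keyword`	split since 5.0
-	`"fields": {"type": "text"}`	analyzed; suited to full-text search; no aggregations or sorting by default
-	`"fields": {"type": "keyword"}`	not analyzed; suited to exact matching; supports aggregations and sorting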
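
The string split in the table can be applied mechanically to an exported 2.x mapping. The sketch below is a minimal illustration, not an official migration tool: `convert_field` is a hypothetical helper, and it assumes the only `index` values in use are `analyzed` (the 2.x default) and `not_analyzed`. Anything else is rejected so that it gets reviewed by hand.

```python
def convert_field(field: dict) -> dict:
    """Rewrite one 2.x field definition, turning `string` into text/keyword."""
    converted = dict(field)
    if field.get("type") == "string":
        # In 2.x a string field is analyzed unless told otherwise.
        index = converted.pop("index", "analyzed")
        if index == "not_analyzed":
            converted["type"] = "keyword"
        elif index == "analyzed":
            converted["type"] = "text"
        else:
            raise ValueError(f"review manually: index={index!r}")
    # Recurse into object sub-fields and multi-fields.
    for key in ("properties", "fields"):
        if key in converted:
            converted[key] = {
                name: convert_field(sub) for name, sub in converted[key].items()
            }
    return converted


old_mapping = {
    "properties": {
        "title": {"type": "string"},
        "status": {"type": "string", "index": "not_analyzed"},
    }
}
print(convert_field(old_mapping))
```

Analyzer settings and other per-field options are copied through unchanged, so the result should still be checked against the target version before it is used to create an index.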

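"_all" is no longer available for new indices; use "copy_to" instead.
Boolean fields accept only "true" and "false"; on, off, yes, no, 0, and 1 are no longer accepted. Older indices carried over from a previous version keep working, but newly created indices must follow this rule.
The _timestamp field has been removed.
Numeric fields (integer, double, and so on) that are never used in range queries should be mapped as keyword; otherwise they can put the cluster at risk. The underlying index structure for numeric types changed and is now optimized for range queries, so it is no longer well suited to term queries.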
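
Because new indices reject the legacy boolean spellings, documents usually need to be cleaned before they are reindexed. A minimal sketch, assuming the only legacy values in the data are the ones listed above (`normalize_boolean` is a hypothetical helper name):

```python
TRUE_VALUES = {"true", "on", "yes", "1"}
FALSE_VALUES = {"false", "off", "no", "0"}


def normalize_boolean(value):
    """Convert a legacy 2.x boolean spelling into a real bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    # Refuse to guess: unknown values should be inspected by a human.
    raise ValueError(f"not a recognised boolean value: {value!r}")


doc = {"title": "Elasticsearch guide", "published": "yes"}
doc["published"] = normalize_boolean(doc["published"])
print(doc)
```

The helper is deliberately strict. Values outside the two sets raise an error instead of silently becoming true or false.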

Query syntax differences

Queries on boolean fields accept only "true" and "false".
The in query has been removed; use terms instead.
In bool queries, minimum_number_should_match is no longer supported; it has been renamed minimum_should_match.
The ES 2.x filtered query must be replaced with a bool query in ES 6.x:

# 2.x
GET /my_index/user/_search
{
  "query": {
    "filtered": {
      "filter": { "term": { "status": "active" } },
      "query": { "match": { "title": "Elasticsearch" } }
    }
  }
}

# 6.x
GET /my_index/_search
{
  "query": {
    "bool": {
      "filter": { "term": { "status": "active" } },
      "must": { "match": { "title": "Elasticsearch" } }
    }
  }
}
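
When many stored queries need the same rewrite, the filtered-to-bool change shown above can be automated. The following is a rough sketch with a hypothetical helper name. It only handles a `filtered` query at the top level of the `query` object; nested occurrences would need a recursive walk.

```python
def filtered_to_bool(query: dict) -> dict:
    """Rewrite a top-level 2.x `filtered` query as an equivalent `bool` query."""
    if "filtered" not in query:
        return query
    filtered = query["filtered"]
    bool_query = {}
    if "query" in filtered:
        # The scoring part of `filtered` becomes a `must` clause.
        bool_query["must"] = filtered["query"]
    if "filter" in filtered:
        # The non-scoring part maps directly onto `filter`.
        bool_query["filter"] = filtered["filter"]
    return {"bool": bool_query}


old_query = {
    "filtered": {
        "filter": {"term": {"status": "active"}},
        "query": {"match": {"title": "Elasticsearch"}},
    }
}
print(filtered_to_bool(old_query))
```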

Scroll queries
- search_type=scroll has been removed. Pass the scroll parameter directly and do not specify search_type.
- The scroll_id can be quite long, so use a POST request rather than GET.
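
To make the two scroll points concrete, here is a minimal sketch of how the requests could be assembled. It only builds the method, path, and JSON body and does not send anything; the helper names, the index name, and the scroll id `abc123` are placeholders for illustration.

```python
import json


def initial_scroll_request(index: str, query: dict, keep_alive: str = "1m", size: int = 1000):
    """First page: pass `scroll` as a URL parameter, with no search_type."""
    path = f"/{index}/_search?scroll={keep_alive}"
    body = {"size": size, "query": query}
    return "POST", path, json.dumps(body)


def next_scroll_request(scroll_id: str, keep_alive: str = "1m"):
    """Following pages: send the (possibly long) scroll_id in a POST body."""
    body = {"scroll": keep_alive, "scroll_id": scroll_id}
    return "POST", "/_search/scroll", json.dumps(body)


print(initial_scroll_request("my_index", {"match_all": {}}))
print(next_scroll_request("abc123"))  # "abc123" is a placeholder scroll id
```

Putting the scroll_id in the body rather than the URL avoids URL length limits in proxies and HTTP clients.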
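About types
- Searching a specific type: ES 6.8.23 still accepts search requests that name a type. The include_type_name parameter governs the index creation, mapping, and template APIs rather than search, and it defaults to true throughout ES 6.x, so it does not need to be set explicitly.
- Searching without a type covers every type in the index (there may be one or several).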

Aggregation differences

date_histogram aggregation

# 2.x
{
  "aggs": {
    "my_date_histogram": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day"  // basic time units only
      }
    }
  }
}

# 6.x
{
  "aggregations": {
    "my_date_histogram": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day",  // calendar_interval / fixed_interval only replace interval from 7.2
        "format": "yyyy-MM-dd",  // controls the format of key_as_string
        "time_zone": "Asia/Shanghai"
      }
    }
  }
}

# 2.x
{
  "aggregations": {
    "my_date_histogram": {
      "buckets": [
        { "key": 1420070400000, "doc_count": 100 },
        { "key": 1420156800000, "doc_count": 80 }
      ]
    }
  }
}

# 6.x
{
  "aggregations": {  // the response key is always "aggregations"
    "my_date_histogram": {
      "buckets": [
        {
          "key": 1672531200000,  // timestamp (after the time zone is applied)
          "key_as_string": "2023-01-01",  // formatted according to "format"
          "doc_count": 50
        }
      ]
    }
  }
}
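
A note on interval parameters: `calendar_interval` and `fixed_interval` are accepted from 7.2 onward, while 6.x still expects `interval`. For clusters that continue on to 7.x, the rewrite can be sketched as below. The helper name and the rule of preferring a calendar unit when a value such as `1d` fits both forms are assumptions of this sketch.

```python
import re

# Values accepted as calendar-aware units (single unit only).
CALENDAR_UNITS = {
    "minute", "hour", "day", "week", "month", "quarter", "year",
    "1m", "1h", "1d", "1w", "1M", "1q", "1y",
}
# Fixed intervals: any multiple of ms, s, m, h or d.
FIXED_INTERVAL = re.compile(r"^\d+(ms|s|m|h|d)$")


def split_interval(date_histogram: dict) -> dict:
    """Replace the legacy `interval` key with its 7.2+ equivalent."""
    converted = dict(date_histogram)
    if "interval" not in converted:
        return converted
    interval = converted.pop("interval")
    if interval in CALENDAR_UNITS:
        converted["calendar_interval"] = interval
    elif interval == "second":
        converted["fixed_interval"] = "1s"
    elif FIXED_INTERVAL.match(interval):
        converted["fixed_interval"] = interval
    else:
        raise ValueError(f"cannot classify interval {interval!r}")
    return converted


print(split_interval({"field": "timestamp", "interval": "day"}))
print(split_interval({"field": "timestamp", "interval": "30m"}))
```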

terms aggregation

# 2.x
{
  "aggregations": {
    "top_categories": {
      "buckets": [
        { "key": "books", "doc_count": 100 },
        { "key": "electronics", "doc_count": 80 }
      ]
    }
  }
}

# 6.x
{
  "aggregations": {
    "top_categories": {
      "doc_count_error_upper_bound": 5,  // maximum possible error in the bucket counts (caused by per-shard computation)
      "sum_other_doc_count": 45,  // total documents in buckets that were not returned (cut off by size)
      "buckets": [
        { "key": "books", "doc_count": 100, "doc_count_error_upper_bound": 2 }
      ]
    }
  }
}

Index template differences

# 2.x
{
  "template": "logstash-*",
  "mappings": {
    "logs": {
      "properties": { "@timestamp": { "type": "date" } }
    }
  }
}

# 6.x
{
  "index_patterns": ["logstash-*"],
  "mappings": {
    "_doc": {
      "properties": { "@timestamp": { "type": "date" } }
    }
  }
}
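
The template change above is mechanical, so it can be scripted when there are many templates to migrate. A minimal sketch, assuming each template defines exactly one mapping type; `convert_template` is a hypothetical helper, and templates with several types (or a `_default_` mapping) are rejected for manual handling.

```python
def convert_template(template: dict, type_name: str = "_doc") -> dict:
    """Turn a 2.x index template into the 6.x shape shown above."""
    converted = {
        key: value
        for key, value in template.items()
        if key not in ("template", "mappings")
    }
    if "template" in template:
        # The single pattern string becomes a list under `index_patterns`.
        converted["index_patterns"] = [template["template"]]
    mappings = template.get("mappings", {})
    if len(mappings) > 1:
        raise ValueError("template has several mapping types; split it manually")
    for type_mapping in mappings.values():
        # Rename the only type to the target type name.
        converted["mappings"] = {type_name: type_mapping}
    return converted


old_template = {
    "template": "logstash-*",
    "mappings": {"logs": {"properties": {"@timestamp": {"type": "date"}}}},
}
print(convert_template(old_template))
```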

Index settings differences

In ES 6.x, settings are usually written without the index. prefix:

# 2.x
{
  "settings": {
    "index.number_of_shards": 5,
    "index.number_of_replicas": 1
  }
}

# 6.x
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}

Search response differences

# 2.x
{
  "took": 15,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 25,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 1.0,
        "_source": { "title": "Elasticsearch guide", "content": "This is a guide" }
      }
    ]
  }
}

# 6.x
{
  "took": 15,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 25,  // still a plain number in 6.x; it only becomes an object in 7.x
    "max_score": 1.0,
    "hits": [
      {
        "_index": "my_index",
        "_type": "_doc",  // the index's single type, _doc by convention
        "_id": "1",
        "_score": 1.0,
        "_source": { "title": "Elasticsearch guide", "content": "This is a guide" }
      }
    ]
  }
}

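Upgrading to ES 7.10.2

An upgrade to ES 7.x includes all of the ES 6.x differences above, plus the following:

Client differences

ES 7.x deprecates TransportClient (it is removed entirely in 8.0), so new code should use the REST client only.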

<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-high-level-client -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.10.2</version>
</dependency>

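Mapping differences

From ES 7.x the APIs are typeless by default. The type structure still exists internally, but it is hard-wired to "_doc".
The default number of shards changes from 5 to 1.

Request parameter differences

In ES 7.x the include_type_name parameter defaults to false. If an index creation, mapping, or template request still contains a type, add include_type_name=true.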
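
An alternative to passing include_type_name=true is to strip the type level from the request body so that it matches the typeless 7.x shape. The sketch below is one way to do that. `to_typeless` is a hypothetical helper, and it uses a simple heuristic: a mappings object that already has a top-level `properties` key is treated as typeless.

```python
def to_typeless(body: dict) -> dict:
    """Remove the single type level from a create-index or template body."""
    mappings = body.get("mappings", {})
    if not mappings or "properties" in mappings:
        return body  # nothing to do, or already typeless (heuristic)
    if len(mappings) != 1:
        raise ValueError("expected exactly one mapping type")
    converted = dict(body)
    # Lift the contents of the only type up one level.
    converted["mappings"] = next(iter(mappings.values()))
    return converted


typed_body = {
    "settings": {"number_of_shards": 1},
    "mappings": {"_doc": {"properties": {"@timestamp": {"type": "date"}}}},
}
print(to_typeless(typed_body))
```

The converted body can then be sent without the include_type_name parameter.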

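Differences in the returned total

From ES 7.x the total number of matching documents is no longer always exact, which is a performance optimization. The track_total_hits parameter was introduced for cases that need the real total:
- Default (track_total_hits is 10000, not false): the total is counted exactly up to 10,000 hits. Beyond that, a lower bound is returned (total.value = 10000 with relation: "gte").
- Set to true: the exact total is always returned (relation: "eq"), but queries may take longer.
- Set to false: the total is not tracked at all.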
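
Client code that has to work during the migration needs to cope with both shapes of hits.total: a plain number before 7.x and an object from 7.x on. A minimal sketch follows; the helper names are hypothetical and the responses are hand-written samples, not output captured from a cluster.

```python
def total_hits(response: dict):
    """Return (count, is_exact) for either response shape."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # 7.x shape: {"value": ..., "relation": "eq" | "gte"}
        return total["value"], total["relation"] == "eq"
    # Pre-7.x shape: a plain, exact number.
    return total, True


def with_exact_total(search_body: dict) -> dict:
    """Copy a 7.x search body and ask for an exact total."""
    exact = dict(search_body)
    exact["track_total_hits"] = True
    return exact


old_style = {"hits": {"total": 25, "hits": []}}
new_style = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
print(total_hits(old_style))
print(total_hits(new_style))
print(with_exact_total({"query": {"match_all": {}}}))
```

Callers that display a result count can use the second element of the tuple to decide whether to show "10,000+" instead of an exact figure.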