Elasticsearch Quick Start Guide

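1. Introduction to Elasticsearch

Elasticsearch is an open-source, distributed search and analytics engine built on Lucene and developed by Elastic. Its main characteristics are:

  • Distributed: scales out to hundreds of servers and handles petabytes of data
  • Near real-time: newly indexed documents become searchable after the next index refresh (once per second by default)
  • Full-text search: powerful full-text search capabilities
  • RESTful API: a simple JSON-over-HTTP API
  • Versatile: an analytics engine as well as a search engine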

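2. Core Concepts

Before going deeper into Elasticsearch, it helps to understand a few basic concepts. The table below maps each one to its rough counterpart in a relational database:

  Elasticsearch        Relational database
  Index                Database
  Type                 Table
  Document             Row
  Field                Column
  Mapping              Schema
  Shard                Data partition
  Replica              Data backup

Note: mapping types are deprecated in Elasticsearch 7.x and removed in 8.0. The examples in this guide use the typeless _doc endpoint, so in practice each index holds a single kind of document.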

3. Installation and Setup

Installing Elasticsearch

# Download Elasticsearch (7.x is used as the example)
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.0-linux-x86_64.tar.gz

# Extract the archive
tar -xzf elasticsearch-7.17.0-linux-x86_64.tar.gz

# Start Elasticsearch
cd elasticsearch-7.17.0/
./bin/elasticsearch

Verify that the installation succeeded:

curl http://localhost:9200/

Output:

{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "xyzABCdefGHI123456",
  "version" : {
    "number" : "7.17.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "abcd1234",
    "build_date" : "2022-01-01T12:34:56.789Z",
    "build_snapshot" : false,
    "lucene_version" : "8.11.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

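4. Basic Operations (CRUD)

Elasticsearch exposes its operations through a RESTful API. The requests in this guide are written in the console format used by Kibana Dev Tools: the HTTP method and path on the first line, followed by an optional JSON body. The commonly used HTTP methods are:

  • GET: retrieve a resource
  • POST: create a resource
  • PUT: create or update a resource
  • DELETE: delete a resource
  • HEAD: check whether a resource exists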
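
Each method-and-path line in this guide maps directly onto a plain HTTP request. As a minimal sketch, the Python below builds such requests with the standard library only. The helper names build_request and send are hypothetical, and BASE_URL assumes the local, unsecured node from the installation section.

```python
import json
import urllib.request

# Assumption: a local node without authentication, as in the installation section.
BASE_URL = "http://localhost:9200"


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = None
    headers = {}
    if body is not None:
        # Request bodies are JSON documents encoded as UTF-8.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Equivalent of:  PUT /blog  {"settings": {...}}
create_index = build_request(
    "PUT",
    "/blog",
    {"settings": {"number_of_shards": 3, "number_of_replicas": 1}},
)
# Equivalent of:  HEAD /blog  (check whether the index exists)
check_index = build_request("HEAD", "/blog")

print(create_index.get_method(), create_index.full_url)
print(check_index.get_method(), check_index.full_url)
```

Nothing is sent until send(...) is called, so the builders can be inspected or tested without a cluster.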

4.1 Creating an Index

# Syntax for creating an index
PUT /<index_name>
{
  "settings": {
    "number_of_shards": <shard_count>,
    "number_of_replicas": <replica_count>
  }
}

Example:

PUT /blog
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

Response:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "blog"
}

4.2 Adding a Document

# Syntax: index a document with an explicit ID
PUT /<index_name>/_doc/<document_id>
{
  "field1": "value1",
  "field2": "value2",
  ...
}

# Syntax: index a document with an auto-generated ID
POST /<index_name>/_doc
{
  "field1": "value1",
  "field2": "value2",
  ...
}

Example:

PUT /blog/_doc/1
{
  "title": "Elasticsearch入门",
  "author": "张三",
  "content": "这是一篇关于Elasticsearch的入门文章",
  "tags": ["搜索引擎", "Elasticsearch"],
  "created_at": "2023-01-01T10:00:00"
}

Response:

{
  "_index": "blog",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_seq_no": 0,
  "_primary_term": 1
}

4.3 Retrieving a Document

# Syntax: get a document by ID
GET /<index_name>/_doc/<document_id>

# Syntax: return all documents in an index
GET /<index_name>/_search

Example:

# Get a document by ID
GET /blog/_doc/1

Response:

{
  "_index": "blog",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "title": "Elasticsearch入门",
    "author": "张三",
    "content": "这是一篇关于Elasticsearch的入门文章",
    "tags": ["搜索引擎", "Elasticsearch"],
    "created_at": "2023-01-01T10:00:00"
  }
}

4.4 Updating a Document

# Syntax for a partial update
POST /<index_name>/_update/<document_id>
{
  "doc": {
    "field1": "new_value1",
    "field2": "new_value2"
  }
}

Example:

POST /blog/_update/1
{
  "doc": {
    "title": "Elasticsearch快速入门",
    "tags": ["搜索引擎", "Elasticsearch", "教程"]
  }
}

Response:

{
  "_index": "blog",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_seq_no": 1,
  "_primary_term": 1
}
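
A partial update merges the fields under "doc" into the stored _source: fields that are not mentioned stay as they were, and an array such as tags is replaced as a whole rather than appended to. The sketch below imitates that behaviour locally for flat documents. apply_partial_update is a hypothetical helper, not an Elasticsearch API, and it only handles top-level fields, whereas the server also merges nested objects recursively.

```python
def apply_partial_update(source, partial_doc):
    """Return a new _source with the partial document's top-level fields applied.

    Simplified local model: only top-level fields are considered. Arrays are
    replaced as a whole, which matches how a partial update treats them.
    """
    updated = dict(source)       # copy, so the stored document is not mutated
    updated.update(partial_doc)  # fields in the partial document win
    return updated


stored = {
    "title": "Elasticsearch入门",
    "author": "张三",
    "tags": ["搜索引擎", "Elasticsearch"],
}
partial = {
    "title": "Elasticsearch快速入门",
    "tags": ["搜索引擎", "Elasticsearch", "教程"],
}

result = apply_partial_update(stored, partial)
print(result["title"])   # the updated title
print(result["author"])  # untouched field is kept
print(result["tags"])    # the whole array was replaced
```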

4.5 Deleting a Document

# Syntax for deleting a document
DELETE /<index_name>/_doc/<document_id>

Example:

DELETE /blog/_doc/1

Response:

{
  "_index": "blog",
  "_type": "_doc",
  "_id": "1",
  "_version": 3,
  "result": "deleted",
  "_shards": { "total": 2, "successful": 2, "failed": 0 },
  "_seq_no": 2,
  "_primary_term": 1
}

4.6 Deleting an Index

# Syntax for deleting an index
DELETE /<index_name>

Example:

DELETE /blog

Response:

{
  "acknowledged": true
}

5. Search

Search is the core feature of Elasticsearch, and it comes with a rich query DSL.

5.1 Basic Queries

# Query syntax
GET /<index_name>/_search
{
  "query": {
    "<query_type>": {
      "<parameter>": "<value>"
    }
  }
}

Example:

# Find documents whose title contains "Elasticsearch"
GET /blog/_search
{
  "query": {
    "match": { "title": "Elasticsearch" }
  }
}

Response:

{
  "took": 5,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 2, "relation": "eq" },
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "blog",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.6931472,
        "_source": {
          "title": "Elasticsearch快速入门",
          "author": "张三",
          "content": "这是一篇关于Elasticsearch的入门文章",
          "tags": ["搜索引擎", "Elasticsearch", "教程"],
          "created_at": "2023-01-01T10:00:00"
        }
      },
      {
        "_index": "blog",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.5753642,
        "_source": {
          "title": "深入理解Elasticsearch",
          "author": "李四",
          "content": "本文详细介绍Elasticsearch的内部原理",
          "tags": ["Elasticsearch", "原理"],
          "created_at": "2023-01-02T15:30:00"
        }
      }
    ]
  }
}
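
In a search response, the matching documents sit under hits.hits, and in 7.x hits.total is an object whose relation is "eq" for an exact count or "gte" for a lower bound. As a rough sketch of how client code might read that structure, the hypothetical helper summarize_hits below pulls out the total and a few fields per hit. The sample data is a trimmed copy of the response above.

```python
def summarize_hits(response):
    """Return (total, [(id, score, title), ...]) from a search response body."""
    hits = response["hits"]
    total = hits["total"]["value"]
    rows = [
        (hit["_id"], hit["_score"], hit["_source"].get("title"))
        for hit in hits["hits"]
    ]
    return total, rows


# Trimmed copy of the search response shown above.
sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 0.6931472,
        "hits": [
            {"_id": "1", "_score": 0.6931472,
             "_source": {"title": "Elasticsearch快速入门"}},
            {"_id": "2", "_score": 0.5753642,
             "_source": {"title": "深入理解Elasticsearch"}},
        ],
    }
}

total, rows = summarize_hits(sample_response)
print("total hits:", total)
for doc_id, score, title in rows:
    print(doc_id, score, title)
```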

5.2 Boolean Queries

GET /<index_name>/_search
{
  "query": {
    "bool": {
      "must":     [{ "match": { "field1": "value1" } }],
      "should":   [{ "match": { "field2": "value2" } }],
      "must_not": [{ "match": { "field3": "value3" } }],
      "filter":   [{ "term":  { "field4": "value4" } }]
    }
  }
}

Example:

# Find documents whose title contains "Elasticsearch" and whose author is not "王五"
GET /blog/_search
{
  "query": {
    "bool": {
      "must":     [{ "match": { "title": "Elasticsearch" } }],
      "must_not": [{ "match": { "author": "王五" } }]
    }
  }
}
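
The four bool clauses behave differently: must clauses have to match and contribute to the score, should clauses are optional and raise the score when they match, must_not clauses exclude documents, and filter clauses have to match but do not affect scoring. When queries are assembled in application code, a small builder keeps empty clauses out of the request. The function bool_query below is a hypothetical helper written for illustration, and its filters argument maps to the "filter" clause.

```python
import json


def bool_query(must=(), should=(), must_not=(), filters=()):
    """Assemble a bool query body, leaving out clauses that were not supplied."""
    clauses = {
        "must": must,
        "should": should,
        "must_not": must_not,
        "filter": filters,
    }
    bool_body = {name: list(items) for name, items in clauses.items() if items}
    return {"query": {"bool": bool_body}}


# Same request as the example above: title matches, author is excluded.
body = bool_query(
    must=[{"match": {"title": "Elasticsearch"}}],
    must_not=[{"match": {"author": "王五"}}],
)
print(json.dumps(body, ensure_ascii=False, indent=2))
```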

5.3 Sorting Results

GET /<index_name>/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "field1": { "order": "desc" } },
    { "field2": { "order": "asc" } }
  ]
}

Example:

# Sort results by creation time, newest first
GET /blog/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "created_at": { "order": "desc" } }
  ]
}

5.4 Pagination

GET /<index_name>/_search
{
  "from": <offset>,
  "size": <page_size>,
  "query": { "match_all": {} }
}

Example:

# Return page 2 with 10 results per page (results 11 to 20)
GET /blog/_search
{
  "from": 10,
  "size": 10,
  "query": { "match_all": {} }
}
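
The from value is a zero-based offset, so for a 1-based page number it works out to (page - 1) * size. By default, from + size may not exceed 10,000 (the index.max_result_window setting). The sketch below wraps that arithmetic in a hypothetical helper called page_params. The limit is hard-coded to the default here, which is an assumption about the index settings.

```python
# Default value of the index.max_result_window setting (assumed unchanged).
MAX_RESULT_WINDOW = 10_000


def page_params(page, size=10):
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    if start + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the default result window")
    return {"from": start, "size": size}


# Page 2 with 10 results per page, as in the example above.
body = {**page_params(page=2, size=10), "query": {"match_all": {}}}
print(body)
```

For deep pagination beyond that window, the search_after parameter is the usual alternative to from/size.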

5.5 Aggregations

GET /<index_name>/_search
{
  "size": 0,
  "aggs": {
    "<aggregation_name>": {
      "<aggregation_type>": { "field": "<field_name>" }
    }
  }
}

Example:

# Count the number of posts per author
# author.keyword is the keyword sub-field that dynamic mapping adds to string fields
GET /blog/_search
{
  "size": 0,
  "aggs": {
    "authors": {
      "terms": { "field": "author.keyword", "size": 10 }
    }
  }
}

Response:

{
  "took": 10,
  "timed_out": false,
  "_shards": { "total": 3, "successful": 3, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 10, "relation": "eq" },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "authors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "张三", "doc_count": 3 },
        { "key": "李四", "doc_count": 2 },
        { "key": "王五", "doc_count": 1 }
      ]
    }
  }
}
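
A terms aggregation returns one bucket per distinct value, each with a key and a doc_count, plus sum_other_doc_count for documents that fell outside the requested number of buckets. As a small sketch of consuming that response, the hypothetical helper buckets_to_counts below flattens the buckets into a dictionary. The sample is a trimmed copy of the response above.

```python
def buckets_to_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its document count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the aggregation response shown above.
sample_response = {
    "aggregations": {
        "authors": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "张三", "doc_count": 3},
                {"key": "李四", "doc_count": 2},
                {"key": "王五", "doc_count": 1},
            ],
        }
    }
}

counts = buckets_to_counts(sample_response, "authors")
for author, posts in counts.items():
    print(author, posts)
```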

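6. Practical Use Cases

6.1 Site Search

Many websites build their search feature on Elasticsearch. Users can find relevant content quickly by keyword, with support for highlighting, search suggestions, and spelling correction.

Example scenario: product search on an e-commerce site

# Create the product index
# Note: the ik_max_word analyzer comes from the IK analysis plugin, which must be installed separately
PUT /products
{
  "mappings": {
    "properties": {
      "name":        { "type": "text", "analyzer": "ik_max_word" },
      "description": { "type": "text", "analyzer": "ik_max_word" },
      "price":       { "type": "float" },
      "category":    { "type": "keyword" },
      "tags":        { "type": "keyword" },
      "stock":       { "type": "integer" },
      "created_at":  { "type": "date" }
    }
  }
}

# Search for products whose name or description contains "手机" (mobile phone), sorted by price in descending order
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "手机",
      "fields": ["name", "description"]
    }
  },
  "sort": [
    { "price": { "order": "desc" } }
  ]
}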

6.2 Log Analysis

Elasticsearch is the core component of the ELK stack (Elasticsearch, Logstash, Kibana), which is widely used for log collection and analysis.

Example scenario: web server log analysis

# Find error logs within a given time range
GET /logs/_search
{
  "query": {
    "bool": {
      "must": [{ "match": { "level": "ERROR" } }],
      "filter": [
        {
          "range": {
            "timestamp": {
              "gte": "2023-01-01T00:00:00",
              "lte": "2023-01-31T23:59:59"
            }
          }
        }
      ]
    }
  },
  "sort": [
    { "timestamp": { "order": "desc" } }
  ]
}

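6.3 Data Visualization

Together with Kibana, data stored in Elasticsearch can be visualized as dashboards, line charts, pie charts, and more.

Example scenario: business monitoring dashboard

# Count API requests per hour
GET /api_logs/_search
{
  "size": 0,
  "aggs": {
    "requests_per_hour": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "hour"
      }
    }
  }
}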
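
Conceptually, a date_histogram with an hourly interval rounds each timestamp down to the start of its hour and counts the documents that land in each bucket. The sketch below reproduces that grouping locally in plain Python to make the idea concrete. It is only an illustration with made-up sample timestamps: it ignores time zones and does not emit empty buckets, and the function name requests_per_hour is borrowed from the aggregation name above.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(timestamps):
    """Count ISO-8601 timestamps per hour, keyed by the start of each hour."""
    counts = Counter()
    for value in timestamps:
        moment = datetime.fromisoformat(value)
        # Round down to the start of the hour, like an hourly bucket key.
        bucket_start = moment.replace(minute=0, second=0, microsecond=0)
        counts[bucket_start.isoformat()] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data standing in for the "timestamp" field.
sample = [
    "2023-01-01T10:05:00",
    "2023-01-01T10:59:59",
    "2023-01-01T11:00:00",
    "2023-01-01T13:30:00",
]

for bucket, count in requests_per_hour(sample).items():
    print(bucket, count)
```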

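6.4 Real-Time Analysis

Elasticsearch supports near real-time data analysis, which makes it suitable for live monitoring and alerting systems.

Example scenario: anomaly monitoring

# Find failed requests from the last 5 minutes
GET /system_logs/_search
{
  "query": {
    "bool": {
      "must": [{ "match": { "status": "error" } }],
      "filter": [
        {
          "range": {
            "timestamp": { "gte": "now-5m", "lte": "now" }
          }
        }
      ]
    }
  }
}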

7. Advanced Features

7.1 Mapping

A mapping defines how a document and its fields are stored and indexed.

# Create an index with an explicit mapping
PUT /users
{
  "mappings": {
    "properties": {
      "username":  { "type": "keyword" },
      "email":     { "type": "keyword" },
      "bio":       { "type": "text" },
      "age":       { "type": "integer" },
      "join_date": { "type": "date" },
      "location":  { "type": "geo_point" }
    }
  }
}

7.2 Analyzers

Analyzers process text fields: they split the text into tokens and then filter or normalize those tokens.

# Create a custom analyzer
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_custom_analyzer"
      }
    }
  }
}

7.3 Cluster Management

Check the health of the cluster:

GET /_cluster/health

Response:

{
  "cluster_name": "elasticsearch",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 15,
  "active_shards": 30,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}
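
The status field is the quickest signal in this response: green means every primary and replica shard is assigned, yellow means all primaries are assigned but at least one replica is not, and red means at least one primary is unassigned. As a minimal sketch of turning the response into a one-line summary, the hypothetical helper describe_health below uses a trimmed copy of the response above.

```python
STATUS_MEANING = {
    "green": "all primary and replica shards are assigned",
    "yellow": "all primary shards are assigned, but some replicas are not",
    "red": "at least one primary shard is unassigned",
}


def describe_health(health):
    """Summarize a cluster health response in one line."""
    status = health["status"]
    primaries = health["active_primary_shards"]
    # active_shards counts primaries and replicas together.
    replicas = health["active_shards"] - primaries
    return (
        f"{health['cluster_name']}: {status} ({STATUS_MEANING[status]}); "
        f"{primaries} primaries, {replicas} replica copies, "
        f"{health['unassigned_shards']} unassigned"
    )


# Trimmed copy of the health response shown above.
sample_health = {
    "cluster_name": "elasticsearch",
    "status": "green",
    "active_primary_shards": 15,
    "active_shards": 30,
    "unassigned_shards": 0,
}

print(describe_health(sample_health))
```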

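8. Summary

Elasticsearch is a powerful search and analytics engine with the following strengths:

  1. Powerful search: full-text search, structured search, and complex queries
  2. Near real-time analytics: indexed data becomes available for search and analysis within about a second by default
  3. Distributed architecture: easy to scale horizontally, with support for high availability
  4. RESTful API: a simple, easy-to-use interface
  5. Rich ecosystem: integrates with Logstash, Kibana, Beats, and other tools to form a complete solution

This guide covered the basic concepts and operations of Elasticsearch, including index management, document CRUD, the main query types, and practical use cases. With these fundamentals, you can start using Elasticsearch in your own projects to build search and analytics features.

As you learn more about Elasticsearch, you can explore further advanced features, such as more complex aggregations, geospatial search, and machine learning, to meet more demanding business requirements.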
