
Part 42: MySQL Indexes in Depth: B+Tree Internals, the Leftmost Prefix Rule, and Index Optimization

news 2025/11/16 12:28:03

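Introduction: From Index User to Index Designer

In database performance tuning, indexes are usually the single most effective tool. Yet many developers' understanding stops at "creating an index makes queries faster", which leaves them with few options when a complex query turns slow.

Real database expertise requires understanding how indexes work underneath. Once you know the internal structure of the B+Tree, the reasoning behind the leftmost prefix rule, and a systematic approach to index optimization, designing for high concurrency and large data volumes becomes far more manageable.

This installment covers the core principles of MySQL indexes through diagrams, internals analysis, and worked cases, with the goal of moving you from passively using indexes to deliberately designing them.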

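Part 1: B+Tree Indexes in Depth

1.1 Why B+Tree? How Index Structures Evolved

Before looking at the B+Tree itself, consider why databases chose it as the index data structure:

Comparing the candidate data structures

-- Question: why not one of the other data structures?
-- Hash table: O(1) point lookups, but no support for range queries
-- Binary search tree: can degenerate into a linked list with O(n) lookups
-- Balanced binary tree: at most two children per node, so the tree is tall and costs many disk I/Os
-- B-tree: internal nodes also store row data, so each node holds fewer keys

What sets the B+Tree apart:

Short, wide shape: fewer disk I/Os per lookup
Linked leaf nodes: efficient range queries
High fan-out: each node stores many keys
Ordered data: supports sorting and range scans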
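
To make the "short, wide" point concrete, here is a back-of-the-envelope estimate of how many rows a three-level B+Tree can address. The 16 KB page is InnoDB's default page size; the 8-byte key, 6-byte child pointer, and 1 KB average row are illustrative assumptions, and page headers and fill factor are ignored, so treat the result as an order of magnitude only.

```python
# Rough capacity estimate for a B+Tree of a given height.
# Assumed sizes (illustrative, not measured): 16 KB page, 8-byte BIGINT key,
# 6-byte child pointer, 1 KB per row. Page overhead is ignored.
PAGE_SIZE = 16 * 1024
KEY_SIZE = 8
POINTER_SIZE = 6
ROW_SIZE = 1024

fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)   # children per internal node
rows_per_leaf = PAGE_SIZE // ROW_SIZE             # rows per leaf page

def max_rows(height):
    """Rows addressable by a tree with `height` levels (leaf level included)."""
    return fanout ** (height - 1) * rows_per_leaf

for h in (1, 2, 3):
    print(f"height {h}: about {max_rows(h):,} rows")
```

Under these assumptions, three page reads are enough to reach any of roughly 21.9 million rows, while a balanced binary tree over the same rows would be around 25 levels deep.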

1.2 B+Tree Core Structure

B+Tree levels

┌─────────────────────────────────────────────────────────────┐
│                  B+Tree structure (sketch)                  │
├─────────────────────────────────────────────────────────────┤
│                          Root node                          │
│               ┌─────────┬─────────┬─────────┐               │
│               │  ptr 0  │  ptr 1  │  ptr 2  │               │
│               │  key 1  │  key 2  │   ...   │               │
│               └─────────┴─────────┴─────────┘               │
├─────────────────────────────────────────────────────────────┤
│                       Internal nodes                        │
│    ┌─────────┐          ┌─────────┐          ┌─────────┐    │
│    │  ptr 0  │          │  ptr 1  │          │  ptr 2  │    │
│    │  key 1  │          │  key 2  │          │   ...   │    │
│    └─────────┘          └─────────┘          └─────────┘    │
├─────────────────────────────────────────────────────────────┤
│                         Leaf nodes                          │
│ ┌───────┬──────────┐ ┌───────┬──────────┐ ┌───────┬─────┐   │
│ │ key 1 │ data ptr │ │ key 2 │ data ptr │ │ key 3 │ ... │   │
│ └───────┴──────────┘ └───────┴──────────┘ └───────┴─────┘   │
│               ↓                    ↓                        │
│            [data]               [data]                      │
└─────────────────────────────────────────────────────────────┘

Inside a B+Tree node

Internal node layout:

┌─────────────────────────────────────────────────────────────┐
│                        Internal node                        │
├───────┬─────────┬───────┬─────────┬───────┬─────────┬───────┤
│ P0    │ Key1    │ P1    │ Key2    │ P2    │ Key3    │ P3    │
├───────┴─────────┴───────┴─────────┴───────┴─────────┴───────┤
│ P = pointer to a child subtree, Key = separator key         │
└─────────────────────────────────────────────────────────────┘

Leaf node layout:

┌─────────────────────────────────────────────────────────────┐
│                          Leaf node                          │
├─────────┬─────────┬─────────┬─────────┬─────────┬───────────┤
│ Key1    │ DataPtr │ Key2    │ DataPtr │ Key3    │ DataPtr   │
├─────────┴─────────┴─────────┴─────────┴─────────┴───────────┤
│ Key = index key, DataPtr = pointer to the row data          │
└─────────────────────────────────────────────────────────────┘

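1.3 How B+Tree Operations Work

Search

# B+Tree search (Python-style pseudocode; the node fields is_leaf, keys, data, children are assumed)
from bisect import bisect_left

def bplus_tree_search(node, key):
    if node.is_leaf:
        # Binary search within the leaf node
        i = bisect_left(node.keys, key)
        found = i < len(node.keys) and node.keys[i] == key
        return node.data[i] if found else None
    # Internal node: pick the subtree whose key range contains key
    child_index = find_child_index(node.keys, key)
    return bplus_tree_search(node.children[child_index], key)

def find_child_index(keys, target_key):
    """Return the index of the child to descend into (the number of keys <= target_key)."""
    left, right = 0, len(keys)
    while left < right:
        mid = (left + right) // 2
        if target_key < keys[mid]:
            right = mid
        else:
            left = mid + 1
    return left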

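Search example:
Suppose we look up 25 in a B+Tree containing [10, 20, 30, 40]:

1. Compare at the root: 25 falls between 20 and 30
2. Descend into the corresponding subtree
3. In the leaf node, either find 25 or confirm that it does not exist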
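
The walk-through above can be traced with a small runnable sketch. The `Node` class and the three-leaf layout are assumptions made for illustration (a real B+Tree also enforces minimum node occupancy); the root separators 20 and 30 match the example.

```python
from bisect import bisect_left, bisect_right

class Node:
    """Minimal node: leaves have children=None (illustrative layout)."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children

def search(node, key):
    """Return (found, path); path lists the child index taken at each level."""
    path = []
    while node.children is not None:
        idx = bisect_right(node.keys, key)  # number of separators <= key
        path.append(idx)
        node = node.children[idx]
    i = bisect_left(node.keys, key)
    return (i < len(node.keys) and node.keys[i] == key), path

# Root separators 20 and 30 split the keys into three leaves.
root = Node([20, 30], [Node([10]), Node([20]), Node([30, 40])])
print(search(root, 25))  # 25 lies between 20 and 30, so child 1 is visited
print(search(root, 30))
```

Looking up 25 follows child 1 and ends in a leaf that does not contain it, which is how the tree confirms the key is absent without scanning other leaves.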

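Insertion

# B+Tree insertion (pseudocode)
def bplus_tree_insert(root, key, value):
    # 1. Find the leaf node that should hold the key
    leaf = find_leaf(root, key)
    # 2. Insert into the leaf, keeping its keys sorted
    leaf.insert(key, value)
    # 3. If the leaf now exceeds its capacity, split it
    if leaf.is_overfull():
        new_leaf = leaf_split(leaf)
        # 4. Publish the new leaf's first key to the parent level
        if leaf.is_root():
            root = create_new_root(leaf, new_leaf)
        else:
            # May split the parent in turn, possibly all the way up to the root
            insert_into_parent(leaf.parent, new_leaf.first_key(), new_leaf)
    return root

def leaf_split(leaf_node):
    """Split a leaf: move the upper half of its entries into a new right sibling."""
    mid_index = len(leaf_node.keys) // 2
    new_leaf = LeafNode()
    # Divide the keys and data
    new_leaf.keys = leaf_node.keys[mid_index:]
    new_leaf.data = leaf_node.data[mid_index:]
    leaf_node.keys = leaf_node.keys[:mid_index]
    leaf_node.data = leaf_node.data[:mid_index]
    # Maintain the leaf linked list
    new_leaf.next = leaf_node.next
    leaf_node.next = new_leaf
    return new_leaf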

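Insertion illustrated:

Before the insert:
Leaf node: [10, 20, 30, 40] (assume a maximum capacity of 4)

Insert 35:
Leaf is full → split
Original leaf: [10, 20]
New leaf: [30, 35, 40]
Update the parent: add separator key 30 and a pointer to the new leaf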
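
The split in this illustration can be reproduced with a few lines of runnable Python. This is a minimal sketch of the leaf-level step only, assuming a capacity of 4 keys and plain lists in place of real pages; `insert_into_leaf` is a hypothetical helper, not part of any library.

```python
from bisect import insort

MAX_KEYS = 4  # assumed leaf capacity, matching the illustration

def insert_into_leaf(keys, key):
    """Insert key into a sorted leaf; return (left, right, separator).

    right and separator are None when no split is needed.
    """
    keys = list(keys)
    insort(keys, key)
    if len(keys) <= MAX_KEYS:
        return keys, None, None
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]  # first key of the new leaf goes to the parent

print(insert_into_leaf([10, 20, 30, 40], 35))
print(insert_into_leaf([10, 20], 15))
```

The first call splits into [10, 20] and [30, 35, 40] with 30 as the separator; the second fits without a split.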

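Deletion

# B+Tree deletion (pseudocode)
def bplus_tree_delete(root, key):
    # 1. Find the leaf that contains the key
    leaf = find_leaf(root, key)
    # 2. Remove the key from the leaf
    if not leaf.delete(key):
        return root  # key not present
    # 3. Rebalance if the leaf dropped below its minimum occupancy
    if leaf.is_underflow() and not leaf.is_root():
        if can_borrow_from_sibling(leaf):
            # Borrow an entry from a sibling and fix the separator key in the parent
            redistribute_from_sibling(leaf)
        else:
            # Merge with a sibling; the parent loses a key and may underflow in turn
            merge_with_sibling(leaf)
    return root

def merge_with_sibling(node):
    """Merge node with an adjacent sibling, keeping the keys in sorted order."""
    left, right = get_left_sibling(node), node
    if left is None:
        left, right = node, get_right_sibling(node)
    if can_merge(left, right):
        if left.is_leaf:
            left.keys = left.keys + right.keys
            left.data = left.data + right.data
            left.next = right.next  # unlink right from the leaf list
        else:
            # Internal nodes pull the separator key down from the parent
            left.keys = left.keys + [get_separator_key(left, right)] + right.keys
            left.children = left.children + right.children
        # Remove the separator key and the pointer to right from the parent
        update_parent_after_merge(left, right)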

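1.4 B+Tree Implementation in InnoDB

Clustered Index

-- Clustered index example
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,  -- clustered index key
    name VARCHAR(100),
    email VARCHAR(100),
    created_at TIMESTAMP
) ENGINE=InnoDB;

-- B+Tree structure of the clustered index:
-- leaf nodes contain the complete row
-- rows are stored in primary-key order

Clustered index characteristics:

The data is the index, and the index is the data
Each table has exactly one clustered index
The primary key becomes the clustered index; without one, InnoDB uses the first UNIQUE index on NOT NULL columns, or else a hidden row ID
Leaf nodes contain the complete row

Secondary Index

-- Secondary index example
CREATE INDEX idx_email ON users(email);

-- B+Tree structure of the secondary index:
-- leaf nodes contain the indexed column value + the primary key value
-- fetching any other column requires a lookup back into the clustered index

Secondary index lookup flow:

Query: SELECT * FROM users WHERE email = 'test@example.com'

1. Search the idx_email index tree for 'test@example.com'
2. Read the matching primary key value (for example id=123)
3. Use that primary key to look up the full row in the clustered index
4. Return the full row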
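
The two-step lookup can be modeled with two dictionaries. This is only a sketch: Python dicts stand in for the two B+Trees, and the sample rows are made up for illustration.

```python
# Clustered index: primary key -> full row (stand-in for the clustered B+Tree)
clustered = {
    123: {"id": 123, "name": "Test User", "email": "test@example.com"},
    124: {"id": 124, "name": "Other User", "email": "other@example.com"},
}
# Secondary index idx_email: email -> primary key
idx_email = {row["email"]: pk for pk, row in clustered.items()}

def find_by_email(email):
    """SELECT * ... WHERE email = ?: secondary index first, then clustered index."""
    pk = idx_email.get(email)      # steps 1-2: locate the primary key
    if pk is None:
        return None
    return clustered[pk]           # step 3: fetch the full row by primary key

def find_id_by_email(email):
    """SELECT id ... WHERE email = ?: answered from the secondary index alone."""
    return idx_email.get(email)

print(find_by_email("test@example.com"))
print(find_id_by_email("other@example.com"))
```

The second function never touches `clustered`, which is the idea behind covering indexes discussed in Part 4.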

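Part 2: The Leftmost Prefix Rule in Depth

2.1 Internal Structure of a Composite Index

How composite index keys are built

-- Create a composite index
CREATE INDEX idx_composite ON employees(department, salary, hire_date);

-- Internal representation of the index keys:
-- the actual key is the combination (department, salary, hire_date)
-- for example: ('IT', 5000, '2020-01-15'), ('IT', 6000, '2019-03-20'), ...

B+Tree structure of the composite index:

Key ordering rules:
1. Sort by department first
2. For equal department, sort by salary
3. For equal department and salary, sort by hire_date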
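
These ordering rules are the same column-by-column comparison that Python applies to tuples, so sorting a list of tuples gives a quick feel for the index order. The entries below are hypothetical, and real string comparison in MySQL depends on the column's collation.

```python
from datetime import date

# Hypothetical index entries: (department, salary, hire_date)
entries = [
    ("Sales", 4000, date(2021, 6, 1)),
    ("IT", 6000, date(2019, 3, 20)),
    ("IT", 5000, date(2020, 1, 15)),
    ("IT", 5000, date(2018, 7, 9)),
    ("HR", 4500, date(2022, 2, 2)),
]

# Python compares tuples column by column, like a composite index key.
index_order = sorted(entries)
for entry in index_order:
    print(entry)
```

All 'IT' entries end up adjacent, ordered by salary, and the two entries with salary 5000 are ordered by hire_date.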

2.2 The Leftmost Prefix Rule Explained

What is a "leftmost prefix"?

The leftmost prefix rule says that, for a composite index to be used for lookups, the query conditions must start at the index's leftmost column and continue through consecutive columns without skipping any in between.

-- Index: (col1, col2, col3)

-- ✅ Queries that can use the index:
SELECT * FROM t WHERE col1 = 'A';
SELECT * FROM t WHERE col1 = 'A' AND col2 = 'B';
SELECT * FROM t WHERE col1 = 'A' AND col2 = 'B' AND col3 = 'C';

-- ⚠️ Partial use: col2 is skipped, so only col1 narrows the search
SELECT * FROM t WHERE col1 = 'A' AND col3 = 'C';

-- ❌ Queries that cannot use the index for lookups:
SELECT * FROM t WHERE col2 = 'B';                    -- col1 missing
SELECT * FROM t WHERE col2 = 'B' AND col3 = 'C';     -- col1 missing

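Why the Leftmost Prefix Rule Holds

Why must a query start from the leftmost column?

B+Tree for composite index (A, B, C), simplified to show the sort priority (real nodes store full (A, B, C) tuples):

Root node:      [A1, A2, A3...]
                  ↓    ↓    ↓
Internal nodes: [B1,B2..] [B3,B4..] [B5,B6..]
                  ↓  ↓      ↓  ↓      ↓  ↓
Leaf nodes:     [C1][C2]  [C3][C4]  [C5][C6]

Query WHERE B = 'value':
- There is no way to tell which A subtree to start from
- Every subtree has to be scanned, which amounts to a full scan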
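
The same point can be seen on a flat sorted list, which is what the leaf level of the index amounts to. In this sketch (made-up two-column entries, a Python list instead of linked leaf pages), matches on the first column form one contiguous run that a binary search can find, while matches on the second column alone are scattered.

```python
from bisect import bisect_left

# Hypothetical (A, B) index entries, kept in index order.
index = sorted([
    (1, "x"), (1, "y"), (2, "x"), (2, "z"), (3, "x"), (3, "y"),
])

def positions_where(pred):
    """Positions of all entries satisfying pred (a full scan of the list)."""
    return [i for i, entry in enumerate(index) if pred(entry)]

# A = 2: matches form one contiguous run, so a single seek finds the start.
start = bisect_left(index, (2,))
a_matches = positions_where(lambda e: e[0] == 2)

# B = "x": matches are scattered, so every entry has to be examined.
b_matches = positions_where(lambda e: e[1] == "x")

print(start, a_matches)
print(b_matches)
```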

2.3 Applying the Leftmost Prefix Rule

Case 1: E-commerce order queries

-- Composite index for the orders table
CREATE INDEX idx_order_query ON orders(user_id, status, created_at);

-- ✅ Effective queries:
SELECT * FROM orders WHERE user_id = 100;  -- uses the index
SELECT * FROM orders WHERE user_id = 100 AND status = 'paid';  -- uses the index
SELECT * FROM orders WHERE user_id = 100 AND status = 'paid' AND created_at > '2023-01-01';  -- uses the index
SELECT * FROM orders WHERE user_id = 100 AND created_at > '2023-01-01';  -- uses only the user_id part

-- ❌ Ineffective queries:
SELECT * FROM orders WHERE status = 'paid';  -- cannot seek on the index
SELECT * FROM orders WHERE created_at > '2023-01-01';  -- cannot seek on the index

Case 2: The effect of range conditions

-- Index: (A, B, C)

-- ✅ Range condition on the last column:
SELECT * FROM t WHERE A = 1 AND B = 2 AND C > 100;  -- uses the whole index

-- ⚠️ Range condition on a middle column:
SELECT * FROM t WHERE A = 1 AND B > 100 AND C = 2;
-- Only (A, B) bound the index range; the C condition is checked against the index entries (see ICP in 2.4)

-- ❌ Range condition on the first column:
SELECT * FROM t WHERE A > 100 AND B = 2 AND C = 3;
-- Only A bounds the index range; B and C cannot narrow the seek
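
The cases above follow one mechanical rule: walk the index columns from the left, stop at the first column without a predicate, and stop after the first range predicate. The function below is a deliberately simplified model of that rule, not the MySQL optimizer; it ignores features such as index skip scan and IN lists, and `usable_index_columns` is a name chosen for this sketch.

```python
def usable_index_columns(index_columns, predicates):
    """Return the index columns that can bound the B+Tree seek.

    predicates maps a column name to "=" (equality) or "range".
    Matching stops at the first missing column and after the first range column.
    """
    used = []
    for col in index_columns:
        op = predicates.get(col)
        if op is None:
            break
        used.append(col)
        if op == "range":
            break
    return used

idx = ["A", "B", "C"]
print(usable_index_columns(idx, {"A": "=", "B": "=", "C": "range"}))
print(usable_index_columns(idx, {"A": "=", "B": "range", "C": "="}))
print(usable_index_columns(idx, {"A": "range", "B": "=", "C": "="}))
print(usable_index_columns(idx, {"A": "=", "C": "="}))
print(usable_index_columns(idx, {"B": "=", "C": "="}))
```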

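2.4 Index Condition Pushdown (ICP)

How ICP works

Index Condition Pushdown, introduced in MySQL 5.6, lets the storage engine filter on columns stored in the index before it reads the full row, which reduces the number of lookups into the clustered index.

-- Example query (assume the index used starts with department and also contains salary):
SELECT * FROM employees
WHERE department = 'IT' AND salary > 5000 AND name LIKE 'John%';

Execution without ICP:
1. Use the index on department to find all entries for 'IT'
2. For each entry, read the full row from the clustered index
3. Filter salary > 5000 AND name LIKE 'John%' in the server layer

Execution with ICP:
1. Use the index on department to find all entries for 'IT'
2. Filter salary > 5000 inside the storage engine (possible only if salary is in the index)
3. Read the full row only for the entries that pass
4. Filter name LIKE 'John%' in the server layer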
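
The benefit is easiest to see by counting full-row reads. The following sketch simulates both flows over a handful of made-up index entries; it assumes, as in the steps above, that only `department` positions the scan and that `salary` is available in the index entry.

```python
# Hypothetical index entries (department, salary, primary key), in index order.
index_entries = [
    ("IT", 4000, 1), ("IT", 4800, 2), ("IT", 5200, 3), ("IT", 7000, 4),
    ("Sales", 6000, 5),
]

def count_row_lookups(use_icp):
    """Count clustered-index lookups for department = 'IT' AND salary > 5000."""
    lookups = 0
    for department, salary, pk in index_entries:
        if department != "IT":
            continue
        if use_icp and not salary > 5000:
            continue          # rejected inside the storage engine, no row read
        lookups += 1          # fetch the full row by primary key
    return lookups

print(count_row_lookups(use_icp=False))
print(count_row_lookups(use_icp=True))
```

With this sample data the row reads drop from four to two; the remaining `name LIKE 'John%'` check still happens after the row is fetched.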

Enabling and Monitoring ICP

-- Check whether ICP is enabled
SHOW VARIABLES LIKE 'optimizer_switch';
-- Look for index_condition_pushdown=on

-- Use EXPLAIN to see whether a query uses ICP
EXPLAIN SELECT * FROM employees
WHERE department = 'IT' AND salary > 5000;
-- "Using index condition" in the Extra column means ICP was applied

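Part 3: Index Optimization in Practice

3.1 Index Design Principles

Selectivity

-- Compute index selectivity
SELECT
    COUNT(DISTINCT column_name) AS distinct_values,
    COUNT(*) AS total_rows,
    ROUND(COUNT(DISTINCT column_name) * 100.0 / COUNT(*), 2) AS selectivity_percent
FROM table_name;

-- Rough rules of thumb for selectivity:
-- > 20%: high selectivity, a good index candidate
-- 5%-20%: medium selectivity, decide case by case
-- < 5%: low selectivity, usually not worth indexing on its own

-- Example: a gender column usually has very poor selectivity
SELECT
    COUNT(DISTINCT gender) AS distinct_genders,  -- usually 2-3
    COUNT(*) AS total_users,
    ROUND(COUNT(DISTINCT gender) * 100.0 / COUNT(*), 2) AS selectivity
FROM users;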
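
The same ratio is easy to compute on a sample outside the database. The sketch below mirrors the SQL formula; the two sample columns are synthetic and exist only to show the two extremes.

```python
def selectivity_percent(values):
    """Distinct values as a percentage of all rows (0 for an empty column)."""
    if not values:
        return 0.0
    return round(len(set(values)) * 100.0 / len(values), 2)

# Synthetic sample columns
genders = ["M", "F"] * 500                               # 2 distinct values in 1000 rows
emails = [f"user{i}@example.com" for i in range(1000)]   # all distinct

print(selectivity_percent(genders))
print(selectivity_percent(emails))
```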

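Index Design Checklist

-- 1. Index the columns used in WHERE clauses
SELECT * FROM products WHERE category_id = 5 AND price > 100;
-- → index: (category_id, price)

-- 2. Index the columns used in JOIN conditions
SELECT * FROM orders o JOIN users u ON o.user_id = u.id;
-- → orders.user_id and users.id should both be indexed

-- 3. Index the columns used in ORDER BY / GROUP BY
SELECT category, COUNT(*) FROM products GROUP BY category;
-- → index: (category)

-- 4. Consider covering indexes
-- If a query touches only columns contained in the index, the lookup back to the row can be skipped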

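3.2 Composite Index Design Strategy

Choosing the Column Order

The ESR rule:

Equality columns come first
Sort columns go in the middle
Range columns come last

-- Query pattern:
SELECT * FROM sales
WHERE region = 'North'                                      -- equality
  AND sale_date BETWEEN '2023-01-01' AND '2023-12-31'       -- range
ORDER BY amount DESC;                                       -- sort

-- Recommended index:
CREATE INDEX idx_sales_optimized ON sales(region, amount, sale_date);
-- region (equality) → amount (sort) → sale_date (range)
-- Rows come back already ordered by amount; sale_date is filtered inside the index instead of narrowing the range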
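
As a small aid, the ESR ordering can be written as a stable sort over column roles. This is a sketch of the rule of thumb only; the role labels are supplied by hand, and ESR is a starting point that should be checked with EXPLAIN, since selectivity can justify a different order.

```python
ESR_RANK = {"equality": 0, "sort": 1, "range": 2}

def esr_column_order(columns):
    """Order (column, role) pairs as Equality, Sort, Range.

    sorted() is stable, so columns sharing a role keep their given order.
    """
    return [name for name, role in sorted(columns, key=lambda c: ESR_RANK[c[1]])]

query_columns = [
    ("region", "equality"),     # WHERE region = 'North'
    ("sale_date", "range"),     # WHERE sale_date BETWEEN ...
    ("amount", "sort"),         # ORDER BY amount DESC
]
print(esr_column_order(query_columns))
```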

Worked Case: A Social Network

-- User posts table
CREATE TABLE user_posts (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    content TEXT,
    created_at DATETIME,
    visibility ENUM('public', 'private', 'friends'),
    like_count INT DEFAULT 0
);

-- Common query patterns:
-- 1. A user's latest posts
SELECT * FROM user_posts
WHERE user_id = 123 AND visibility = 'public'
ORDER BY created_at DESC
LIMIT 20;

-- 2. Popular posts
SELECT * FROM user_posts
WHERE visibility = 'public' AND created_at > '2023-01-01'
ORDER BY like_count DESC, created_at DESC
LIMIT 50;

-- Index design:
CREATE INDEX idx_user_visibility_created ON user_posts(user_id, visibility, created_at);
CREATE INDEX idx_visibility_likes_created ON user_posts(visibility, like_count, created_at);

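3.3 Index Performance Analysis and Tuning

Reading EXPLAIN

-- Use EXPLAIN to analyze the execution plan
EXPLAIN FORMAT=JSON
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01'
GROUP BY u.id
HAVING order_count > 5
ORDER BY order_count DESC;

-- Key fields (column names as shown in the default tabular EXPLAIN output):
-- type: access type, best to worst (const > eq_ref > ref > range > index > ALL)
-- key: the index actually used
-- rows: estimated number of rows examined
-- Extra: additional information (Using index, Using where, Using filesort, ...)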

Monitoring Index Usage

-- Index usage statistics
SELECT
    OBJECT_SCHEMA, OBJECT_NAME, INDEX_NAME,
    COUNT_READ, COUNT_FETCH, COUNT_INSERT, COUNT_UPDATE, COUNT_DELETE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = 'your_database'
ORDER BY COUNT_READ DESC;

-- Identify unused indexes (the counters reset when the server restarts)
SELECT DISTINCT TABLE_NAME, INDEX_NAME, INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
  AND INDEX_NAME != 'PRIMARY'
  AND (TABLE_NAME, INDEX_NAME) NOT IN (
      SELECT OBJECT_NAME, INDEX_NAME
      FROM performance_schema.table_io_waits_summary_by_index_usage
      WHERE OBJECT_SCHEMA = 'your_database'
        AND INDEX_NAME IS NOT NULL
        AND COUNT_READ > 0
  );

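3.4 Common Index Problems and Solutions

Problem 1: When an index goes unused

-- 1. Applying a function to the indexed column
SELECT * FROM users WHERE UPPER(name) = 'JOHN';  -- ❌ index not used
SELECT * FROM users WHERE name = 'John';         -- ✅ index used

-- 2. Implicit type conversion
SELECT * FROM users WHERE name = 123;    -- ❌ string column compared with a number: every value is cast, index not used
SELECT * FROM users WHERE name = '123';  -- ✅ index used
-- (an integer column compared with a string constant, e.g. id = '100', converts the constant once and can still use the index)

-- 3. Leading wildcard
SELECT * FROM users WHERE name LIKE '%John%';  -- ❌ index not used
SELECT * FROM users WHERE name LIKE 'John%';   -- ✅ index used

-- 4. OR across different columns
SELECT * FROM users WHERE id = 1 OR email = 'test@example.com';  -- ❌ may fall back to a full table scan
-- Rewrite as:
SELECT * FROM users WHERE id = 1
UNION ALL
SELECT * FROM users WHERE email = 'test@example.com' AND id != 1;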

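Problem 2: Too many indexes

-- Costs of having too many indexes:
-- 1. Slower writes (every INSERT/UPDATE/DELETE has to maintain each affected index)
-- 2. More disk space
-- 3. A harder choice for the optimizer, which may pick the wrong index

-- Count the indexes per table
-- (STATISTICS has one row per indexed column, so count distinct index names)
SELECT
    TABLE_NAME,
    COUNT(DISTINCT INDEX_NAME) AS index_count,
    GROUP_CONCAT(DISTINCT INDEX_NAME) AS index_names
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
GROUP BY TABLE_NAME
HAVING COUNT(DISTINCT INDEX_NAME) > 5;  -- tables with many indexes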

Problem 3: Index fragmentation

-- Check fragmentation (information_schema.TABLES reports DATA_FREE per table, not per index)
SELECT
    TABLE_NAME,
    ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS total_size_mb,
    ROUND(DATA_FREE / 1024 / 1024, 2) AS free_size_mb,
    ROUND(DATA_FREE * 100.0 / (DATA_LENGTH + INDEX_LENGTH), 2) AS frag_percent
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database'
  AND DATA_FREE > 1024 * 1024 * 100;  -- tables with more than 100 MB of free space

-- Rebuild to remove fragmentation
ALTER TABLE table_name ENGINE=InnoDB;  -- rebuilds the table and all of its indexes
-- or
ALTER TABLE table_name DROP INDEX index_name, ADD INDEX index_name(columns);

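Part 4: Advanced Index Techniques and Worked Cases

4.1 Covering Indexes

How covering indexes work

-- Create a covering index
CREATE INDEX idx_covering ON orders(user_id, status, amount, created_at);

-- Query covered by the index
EXPLAIN SELECT user_id, status, amount
FROM orders
WHERE user_id = 100 AND status = 'completed';
-- Extra shows "Using index" → no lookup back to the row

-- Query not covered by the index
EXPLAIN SELECT * FROM orders
WHERE user_id = 100 AND status = 'completed';
-- The full row has to be read from the clustered index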
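
Whether a query is covered comes down to a set comparison: every column the query touches must be present in the index entry. The helper below is a sketch of that check, assuming the orders table's primary key is `id`; it relies on the fact that InnoDB secondary index entries also carry the primary key columns.

```python
def is_covering(index_columns, primary_key, needed_columns):
    """True if every needed column is in the index or the primary key.

    InnoDB secondary index entries also carry the primary key columns.
    """
    available = set(index_columns) | set(primary_key)
    return set(needed_columns) <= available

idx_covering = ["user_id", "status", "amount", "created_at"]
pk = ["id"]  # assumed primary key of the orders table

print(is_covering(idx_covering, pk, ["user_id", "status", "amount"]))
print(is_covering(idx_covering, pk, ["user_id", "status", "id"]))
print(is_covering(idx_covering, pk, ["user_id", "status", "order_number"]))
```

A `SELECT *` needs every column of the table, so it is covered only if the index happens to contain them all.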

Covering index design patterns

-- Pattern 1: include every column the query touches
CREATE INDEX idx_covering_all ON t(col1, col2, col3, col4);

-- Pattern 2: WHERE columns followed by SELECT columns
CREATE INDEX idx_covering_where_select ON t(where_col1, where_col2, select_col1, select_col2);

-- Pattern 3: WHERE, then ORDER BY, then SELECT columns
CREATE INDEX idx_covering_complete ON t(where_col1, where_col2, order_col1, select_col1);

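4.2 Using Indexes for Sorting

Avoiding filesort with an index

-- Index: (department, salary, name)

-- ✅ Queries that can use the index order:
SELECT * FROM employees
WHERE department = 'IT'
ORDER BY salary, name;  -- same column order as the index

SELECT * FROM employees
WHERE department = 'IT' AND salary > 5000
ORDER BY salary DESC, name DESC;  -- same direction on every column (index read backward)

-- ❌ Queries that cannot use the index order:
SELECT * FROM employees
WHERE department = 'IT'
ORDER BY name, salary;  -- column order differs from the index

SELECT * FROM employees
WHERE department = 'IT'
ORDER BY salary ASC, name DESC;  -- mixed directions (would need a matching descending index, available since MySQL 8.0)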
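
The four examples can be checked against a simplified rule: after skipping leading index columns pinned by equality, the ORDER BY columns must match the next index columns in order, and all directions must agree. The function below encodes just that rule; it assumes an ordinary ascending index and is not a model of the full optimizer.

```python
def index_can_sort(index_columns, equality_columns, order_by):
    """Simplified check: can the index return rows already in ORDER BY order?

    order_by is a list of (column, direction) pairs. Assumes an ascending
    index, which can be read forward (all ASC) or backward (all DESC).
    """
    # Skip leading index columns fixed by equality predicates.
    remaining = list(index_columns)
    while remaining and remaining[0] in equality_columns:
        remaining.pop(0)
    columns = [col for col, _ in order_by]
    directions = {direction for _, direction in order_by}
    return columns == remaining[:len(columns)] and len(directions) <= 1

idx = ["department", "salary", "name"]
eq = {"department"}
print(index_can_sort(idx, eq, [("salary", "ASC"), ("name", "ASC")]))
print(index_can_sort(idx, eq, [("salary", "DESC"), ("name", "DESC")]))
print(index_can_sort(idx, eq, [("name", "ASC"), ("salary", "ASC")]))
print(index_can_sort(idx, eq, [("salary", "ASC"), ("name", "DESC")]))
```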

4.3 Indexes on Partitioned Tables

Index strategy for partitioned tables

-- Create a partitioned table
CREATE TABLE sales (
    id INT AUTO_INCREMENT,
    sale_date DATE,
    region VARCHAR(50),
    amount DECIMAL(10,2),
    PRIMARY KEY (id, sale_date)
) PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024)
);

-- Indexes on the partitioned table
CREATE INDEX idx_sales_region ON sales(region);
CREATE INDEX idx_sales_date ON sales(sale_date);
-- MySQL has no global indexes on partitioned tables: both indexes are local, with a separate B+Tree per partition

-- Partition pruning + index use
EXPLAIN SELECT * FROM sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-12-31'
AND region = 'North';
-- Only partition p2023 is accessed; within it the optimizer can use idx_sales_region
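
Partition pruning for this table can be mimicked in a few lines: map each year in the queried date range to the first partition whose upper bound exceeds it. This is an illustrative sketch of RANGE partitioning on YEAR(sale_date) using the bounds from the DDL; the function names are made up for the example.

```python
from datetime import date

# (partition name, exclusive upper bound on YEAR(sale_date)), as in the DDL
PARTITIONS = [("p2020", 2021), ("p2021", 2022), ("p2022", 2023), ("p2023", 2024)]

def partition_for_year(year):
    """RANGE partitioning: the first partition whose bound exceeds the value."""
    for name, less_than in PARTITIONS:
        if year < less_than:
            return name
    raise ValueError(f"no partition for year {year}")

def pruned_partitions(start, end):
    """Partitions that can hold rows with sale_date between start and end."""
    names = []
    for year in range(start.year, end.year + 1):
        name = partition_for_year(year)
        if name not in names:
            names.append(name)
    return names

print(pruned_partitions(date(2023, 1, 1), date(2023, 12, 31)))
print(pruned_partitions(date(2021, 11, 1), date(2022, 2, 1)))
```

A date range inside 2023 touches only p2023, while a range spanning the new year has to read two partitions and therefore two separate local index trees.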

4.4 Worked Case: Index Optimization for an E-commerce System

Original table structure

CREATE TABLE orders (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    order_number VARCHAR(50) NOT NULL,
    total_amount DECIMAL(10,2) NOT NULL,
    status ENUM('pending','paid','shipped','delivered','cancelled'),
    created_at DATETIME,
    updated_at DATETIME,
    INDEX idx_user_id (user_id),
    INDEX idx_created (created_at)
);

Query pattern analysis

-- Query 1: a user's order list
SELECT * FROM orders
WHERE user_id = 100
AND status IN ('paid','shipped','delivered')
ORDER BY created_at DESC
LIMIT 20;

-- Query 2: order statistics
SELECT DATE(created_at) AS order_date, COUNT(*) AS order_count
FROM orders
WHERE created_at BETWEEN '2023-01-01' AND '2023-12-31'
AND status = 'paid'
GROUP BY DATE(created_at);

-- Query 3: order search
SELECT * FROM orders
WHERE order_number LIKE 'ORD2023%'
AND status = 'paid'
ORDER BY created_at DESC;

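Optimized index design

-- Drop the original indexes
DROP INDEX idx_user_id ON orders;
DROP INDEX idx_created ON orders;

-- Create the optimized composite indexes
CREATE INDEX idx_user_status_created ON orders(user_id, status, created_at);
CREATE INDEX idx_created_status ON orders(created_at, status);
CREATE INDEX idx_order_number_status ON orders(order_number, status);

-- Add a covering index for the statistics query
-- (InnoDB secondary indexes already include the primary key, so this duplicates idx_created_status; in practice keep only one of the two)
CREATE INDEX idx_covering_stats ON orders(created_at, status, id);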

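Before/after comparison

Before optimization:

-- Query 1 execution plan:
-- type: ref (using idx_user_id)
-- Extra: Using where; Using filesort (extra sort required)

-- Query 2 execution plan:
-- type: range (using idx_created)
-- Extra: Using where; Using temporary (temporary table required)

After optimization:

-- Query 1 execution plan:
-- type: range (using idx_user_status_created; status IN (...) turns the lookup into several ranges)
-- Extra: Using index condition, with far fewer rows read
-- Note: because IN matches several status values, the rows are not globally ordered by created_at,
-- so a filesort is likely to remain; an index on (user_id, created_at, status) would avoid it

-- Query 2 execution plan:
-- type: range (using idx_created_status)
-- Extra: Using where; Using index (answered from the index alone; grouping by DATE(created_at) may still need a temporary table)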

Part 5: Index Monitoring and Maintenance

5.1 Index Monitoring

Key metrics to monitor

-- 1. Index usage frequency
SELECT
    OBJECT_NAME, INDEX_NAME,
    COUNT_FETCH, COUNT_READ, COUNT_INSERT, COUNT_UPDATE, COUNT_DELETE
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE OBJECT_SCHEMA = 'your_database'
ORDER BY COUNT_READ DESC;

-- 2. Index selectivity (CARDINALITY is an estimate)
SELECT TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
AND CARDINALITY > 0;

-- 3. Index size (information_schema.TABLES reports the total size of all secondary indexes per table;
--    per-index sizes are available from mysql.innodb_index_stats)
SELECT TABLE_NAME, ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS index_size_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database'
ORDER BY INDEX_LENGTH DESC;

5.2 Automated Maintenance Script

-- Fragmentation check and rebuild
-- (assumes a five-column log table index_maintenance_log already exists)
DELIMITER $$

CREATE PROCEDURE maintain_indexes()
BEGIN
    DECLARE done INT DEFAULT FALSE;
    DECLARE tbl_name VARCHAR(64);
    DECLARE frag_rate DECIMAL(5,2);

    -- Cursor: InnoDB tables whose free space exceeds 30% of their size
    DECLARE cur CURSOR FOR
        SELECT TABLE_NAME,
               ROUND(DATA_FREE * 100.0 / (DATA_LENGTH + INDEX_LENGTH), 2) AS fragmentation
        FROM information_schema.TABLES
        WHERE TABLE_SCHEMA = DATABASE()
          AND ENGINE = 'InnoDB'
          AND DATA_FREE > 1024 * 1024 * 10  -- more than 10 MB of free space
          AND ROUND(DATA_FREE * 100.0 / (DATA_LENGTH + INDEX_LENGTH), 2) > 30;

    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO tbl_name, frag_rate;
        IF done THEN
            LEAVE read_loop;
        END IF;

        -- Rebuild the table, which also rebuilds all of its indexes
        SET @sql = CONCAT('ALTER TABLE `', tbl_name, '` ENGINE=InnoDB');
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;

        -- Record the maintenance run (index name is NULL because DATA_FREE is per table)
        INSERT INTO index_maintenance_log VALUES (NOW(), tbl_name, NULL, frag_rate, 'REBUILT');
    END LOOP;
    CLOSE cur;
END$$

DELIMITER ;