
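[Common Linux Kernel Data Structures and Design Patterns] [Data Structures 1] [How to Express One-to-One, One-to-Many, and Many-to-Many Relationships Between Data]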

news 2025/9/8 8:49:44

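I. Introduction

Programs = data structures + algorithms. A data structure describes the logical relationships among pieces of data; an algorithm is the sequence of steps that solves a particular problem. Those logical relationships reflect and abstract the real world, so understanding them is the key to understanding a program. This section covers the data structures and techniques commonly used in user space and in the kernel to implement one-to-one, one-to-many, and many-to-many relationships.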

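II. One-to-One

2.1 Embedding

struct test_A {
    struct test_B b;
};

Characteristics:

A contains a B object: a containment relationship. It fits cases where a B must exist whenever an A exists.
- For example, a person necessarily has eyes, a nose, a mouth, arms, and legs.
It expresses the one-to-one relationship clearly and is easy to manage: the container_of macro computes the address of a from the address of b, so no pointer has to be stored.
What it cannot express is an optional one-to-one relationship.
- For example, the relationship between a person and a driver's license cannot be expressed with an embedded struct.
  - A driver's license belongs to exactly one person.
  - Not every person has a driver's license.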

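2.2 Pointers

struct test_A {
    struct test_B *b;
};

struct test_B {
    struct test_A *a;
};

Characteristics:

A -> b or B -> a is also a one-to-one representation, and unlike embedding it can express an optional relationship.
Pointers express the person/driver's license relationship naturally:
- If a person has no driver's license, the pointer is NULL.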

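2.3 Summary

"One-to-one" means that if the relationship exists, it must be unique.
"Optional" means the relationship is not required when the main entity is created. It may be added later, or never.

If the design requires that B exists whenever A exists, embedding is recommended.
If A and B are one-to-one but the relationship is weak and optional, a pointer is recommended.

Safety is the first requirement of a program, and sound logic is an important precondition for safety, so a design must state its logic clearly. A program that violates its own logic may be safe today, but as it is maintained and updated and as people come and go, it will eventually blow up.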

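III. One-to-Many

In C, a one-to-many relationship can be expressed with arrays, singly linked lists, doubly linked lists, binary trees, red-black trees, and radix trees.

This section looks at how each is implemented and used, first in user space and then in the kernel.

3.1 User Space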

1. Array

#include <stdio.h>
#include <stdlib.h>

#define MAX_CHILDREN 10

// Parent node
struct Parent {
    int id;
    int children[MAX_CHILDREN];  // array of child IDs
    int child_count;
};

void test_array() {
    printf("=== One-to-many with an array ===\n");
    struct Parent p = {1, {0}, 0};

    // Add children
    p.children[p.child_count++] = 101;
    p.children[p.child_count++] = 102;
    p.children[p.child_count++] = 103;

    printf("Children of parent %d: ", p.id);
    for (int i = 0; i < p.child_count; i++) {
        printf("%d ", p.children[i]);
    }
    printf("\n\n");
}

int main() {
    test_array();
    return 0;
}

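Advantages:

Contiguous memory and fast access (O(1) random access)
Simple to implement and easy to understand
Low memory overhead (no per-element pointers)

Disadvantages:

Fixed size: the maximum number of children must be known in advance
Inserting or deleting in the middle is slow (O(n))
Either wastes memory (over-allocation) or runs out of room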

2. Singly Linked List

#include <stdio.h>
#include <stdlib.h>

// Child node
struct Child {
    int id;
    struct Child* next;
};

// Parent node
struct Parent {
    int id;
    struct Child* first_child;
};

void add_child(struct Parent* parent, int child_id) {
    struct Child* new_child = malloc(sizeof(struct Child));
    if (new_child == NULL)
        return;  // allocation failed; leave the list unchanged
    new_child->id = child_id;
    new_child->next = NULL;

    if (parent->first_child == NULL) {
        parent->first_child = new_child;
    } else {
        // Walk to the tail and append there
        struct Child* current = parent->first_child;
        while (current->next != NULL) {
            current = current->next;
        }
        current->next = new_child;
    }
}

void test_singly_linked_list() {
    printf("=== One-to-many with a singly linked list ===\n");
    struct Parent p = {1, NULL};

    add_child(&p, 201);
    add_child(&p, 202);
    add_child(&p, 203);

    printf("Children of parent %d: ", p.id);
    struct Child* current = p.first_child;
    while (current != NULL) {
        printf("%d ", current->id);
        current = current->next;
    }
    printf("\n\n");
}

int main() {
    test_singly_linked_list();
    return 0;
}

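Advantages:

Dynamic size, no up-front allocation
Reasonably efficient insertion and deletion (O(1) at the head; O(n) at the tail unless a tail pointer is kept)
Flexible memory use

Disadvantages:

Can only be traversed in one direction
Needs one extra pointer per node
Reaching a given position takes O(n) time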

3. Doubly Linked List

#include <stdio.h>
#include <stdlib.h>

// Child node
struct Child {
    int id;
    struct Child* prev;
    struct Child* next;
};

// Parent node
struct Parent {
    int id;
    struct Child* first_child;
    struct Child* last_child;
};

void add_child(struct Parent* parent, int child_id) {
    struct Child* new_child = malloc(sizeof(struct Child));
    if (new_child == NULL)
        return;  // allocation failed; leave the list unchanged
    new_child->id = child_id;
    new_child->prev = parent->last_child;
    new_child->next = NULL;

    if (parent->first_child == NULL) {
        parent->first_child = new_child;
    } else {
        parent->last_child->next = new_child;
    }
    parent->last_child = new_child;
}

void test_doubly_linked_list() {
    printf("=== One-to-many with a doubly linked list ===\n");
    struct Parent p = {1, NULL, NULL};

    add_child(&p, 301);
    add_child(&p, 302);
    add_child(&p, 303);

    // Forward traversal
    printf("Forward: ");
    struct Child* current = p.first_child;
    while (current != NULL) {
        printf("%d ", current->id);
        current = current->next;
    }

    // Backward traversal
    printf("\nBackward: ");
    current = p.last_child;
    while (current != NULL) {
        printf("%d ", current->id);
        current = current->prev;
    }
    printf("\n\n");
}

int main() {
    test_doubly_linked_list();
    return 0;
}

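Advantages:

Can be traversed in both directions
Deleting a node is cheaper (O(1) when the node is already known)
Operations such as appending at the tail or inserting before a node need no traversal

Disadvantages:

Two pointers per node, so more memory overhead
More complex to implement
More pointer updates are needed to keep the links consistent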

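4. Trees

1. Binary Tree

1. How a tree works: the family tree

Imagine a large family:

Root node: like your great-grandfather, the origin of the whole family
Parent node: your father
Child nodes: you and your siblings
Leaf node: someone with no children (you, for example, if you have none yet)

Why use a tree?

Fast lookup: instead of asking everyone one by one "Are you XXX?", you ask "Are you a child of the eldest son?" and then "Are you his second child?"
Clear hierarchy: who is whose father and who is whose child is obvious at a glance

        Grandpa
        /     \
     Dad      Uncle
    /   \         \
  You  Brother   Cousin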

2. Example:

#include <stdio.h>
#include <stdlib.h>

struct TreeNode {
    int id;
    struct TreeNode* left;
    struct TreeNode* right;
};

struct Parent {
    int id;
    struct TreeNode* child_tree;
};

// Insert id into the binary search tree rooted at node; duplicates are ignored
struct TreeNode* insert(struct TreeNode* node, int id) {
    if (node == NULL) {
        struct TreeNode* new_node = malloc(sizeof(struct TreeNode));
        if (new_node == NULL)
            return NULL;  // allocation failed; the tree is unchanged
        new_node->id = id;
        new_node->left = new_node->right = NULL;
        return new_node;
    }
    if (id < node->id) {
        node->left = insert(node->left, id);
    } else if (id > node->id) {
        node->right = insert(node->right, id);
    }
    return node;
}

void inorder_traversal(struct TreeNode* node) {
    if (node != NULL) {
        inorder_traversal(node->left);
        printf("%d ", node->id);
        inorder_traversal(node->right);
    }
}

void test_binary_tree() {
    printf("=== One-to-many with a binary tree ===\n");
    struct Parent p = {1, NULL};

    p.child_tree = insert(p.child_tree, 401);
    p.child_tree = insert(p.child_tree, 402);
    p.child_tree = insert(p.child_tree, 403);
    p.child_tree = insert(p.child_tree, 400);

    printf("In-order traversal: ");
    inorder_traversal(p.child_tree);
    printf("\n\n");
}

int main() {
    test_binary_tree();
    return 0;
}

$ gcc test.c ; ./a.out 
=== One-to-many with a binary tree ===
In-order traversal: 400 401 402 403

Binary tree vs. doubly linked list: the data structures side by side

//---------------------- Doubly linked list
// Child node
struct Child {
    int id;
    struct Child* prev;
    struct Child* next;
};

// Parent node
struct Parent {
    int id;
    struct Child* first_child;
    struct Child* last_child;
};

//--------------------- Binary tree
struct TreeNode {
    int id;
    struct TreeNode* left;
    struct TreeNode* right;
};

struct Parent {
    int id;
    struct TreeNode* child_tree;
};

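Advantages:

Fast search (O(log n) on average)
Elements are kept in sorted order
Memory use comparable to a doubly linked list (two pointers per node)

Disadvantages:

Can degenerate into a linked list (O(n) worst case), for example when keys arrive already sorted
Relatively complex to implement
Needs a balancing mechanism to guarantee performance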

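2. Red-Black Tree

1. How a red-black tree works: strict but fair family rules

Imagine a very particular family with these house rules:

The root must be black: the head of the family must be a respected elder (black)
Red may not touch red: people in red may not sit next to each other (a red node never has a red child)
Every path has the same number of blacks: from any family member down to every empty leaf position below them, the number of people in black along the way must be the same

Why these rules?
To keep the family from becoming lopsided:

Nobody crowds onto one side (the tree cannot degenerate into a linked list)
Finding anyone is fast (O(log n) in both the average and the worst case)

How are the rules maintained?
When a new member joins (a node is inserted):

The newcomer wears red first (a new node is always red)
If that breaks "red may not touch red", then:
- Change clothes (recolor nodes)
- Rearrange the seats (rotations)

Example:

Insertion order: 7, 3, 18, 10, 22, 8, 11

Final tree:

          7(black)
         /        \
   3(black)      18(red)
                /       \
          10(black)    22(black)
           /     \
       8(red)   11(red)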

2. Example:

#include <stdio.h>
#include <stdlib.h>

typedef enum { RED, BLACK } Color;

typedef struct RBNode {
    int key;
    Color color;
    struct RBNode *left, *right, *parent;
} RBNode;

RBNode* create_node(int key) {
    RBNode* node = (RBNode*)malloc(sizeof(RBNode));
    if (node == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->key = key;
    node->color = RED; // a new node is always red
    node->left = node->right = node->parent = NULL;
    return node;
}

// Left rotation
void left_rotate(RBNode** root, RBNode* x) {
    RBNode* y = x->right;
    x->right = y->left;
    if (y->left != NULL)
        y->left->parent = x;
    y->parent = x->parent;
    if (x->parent == NULL)
        *root = y;
    else if (x == x->parent->left)
        x->parent->left = y;
    else
        x->parent->right = y;
    y->left = x;
    x->parent = y;
}

// Right rotation
void right_rotate(RBNode** root, RBNode* y) {
    RBNode* x = y->left;
    y->left = x->right;
    if (x->right != NULL)
        x->right->parent = y;
    x->parent = y->parent;
    if (y->parent == NULL)
        *root = x;
    else if (y == y->parent->left)
        y->parent->left = x;
    else
        y->parent->right = x;
    x->right = y;
    y->parent = x;
}

// Restore the red-black properties after inserting z
void fix_violation(RBNode** root, RBNode* z) {
    while (z != *root && z->parent->color == RED) {
        RBNode* grandparent = z->parent->parent;
        if (z->parent == grandparent->left) {
            RBNode* uncle = grandparent->right;
            if (uncle != NULL && uncle->color == RED) {
                // Case 1: the uncle is red, so recolor and move up
                z->parent->color = BLACK;
                uncle->color = BLACK;
                grandparent->color = RED;
                z = grandparent;
            } else {
                if (z == z->parent->right) {
                    // Case 2: z is a right child, rotate it into case 3
                    z = z->parent;
                    left_rotate(root, z);
                }
                // Case 3: z is a left child
                z->parent->color = BLACK;
                grandparent->color = RED;
                right_rotate(root, grandparent);
            }
        } else {
            // Mirror image of the cases above
            RBNode* uncle = grandparent->left;
            if (uncle != NULL && uncle->color == RED) {
                z->parent->color = BLACK;
                uncle->color = BLACK;
                grandparent->color = RED;
                z = grandparent;
            } else {
                if (z == z->parent->left) {
                    z = z->parent;
                    right_rotate(root, z);
                }
                z->parent->color = BLACK;
                grandparent->color = RED;
                left_rotate(root, grandparent);
            }
        }
    }
    (*root)->color = BLACK; // the root is always black
}

// Insert a key
void rb_insert(RBNode** root, int key) {
    RBNode* z = create_node(key);
    RBNode* y = NULL;
    RBNode* x = *root;

    // Standard BST insertion
    while (x != NULL) {
        y = x;
        if (z->key < x->key)
            x = x->left;
        else
            x = x->right;
    }
    z->parent = y;
    if (y == NULL)
        *root = z;
    else if (z->key < y->key)
        y->left = z;
    else
        y->right = z;

    fix_violation(root, z);
}

// In-order traversal
void inorder(RBNode* root) {
    if (root == NULL) return;
    inorder(root->left);
    printf("%d(%s) ", root->key, root->color == RED ? "RED" : "BLACK");
    inorder(root->right);
}

void test_red_black_tree() {
    printf("=== One-to-many with a red-black tree ===\n");
    RBNode* root = NULL;
    int keys[] = {7, 3, 18, 10, 22, 8, 11, 26};
    int n = sizeof(keys)/sizeof(keys[0]);

    for (int i = 0; i < n; i++) {
        rb_insert(&root, keys[i]);
    }

    printf("In-order traversal: ");
    inorder(root);
    printf("\n\n");
}

int main() {
    test_red_black_tree();
    return 0;
}

$ gcc test.c ; ./a.out 
=== One-to-many with a red-black tree ===
In-order traversal: 3(BLACK) 7(BLACK) 8(RED) 10(BLACK) 11(RED) 18(RED) 22(BLACK) 26(RED)

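Red-black tree advantages:

Height is guaranteed to be O(log n), so search performance is stable
Insert, delete, and search are all O(log n)
Fewer rotations on insert and delete than an AVL tree

Red-black tree disadvantages:

Complex to implement and costly to maintain
Less strictly balanced than an AVL tree, so lookups can be slightly slower
Color information must be stored (one bit per node)

Typical uses:

The map and set containers in common C++ standard library implementations
The Linux kernel scheduler
Database indexes
Anywhere worst-case performance must be guaranteed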

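3. Radix Tree

1. How a radix tree works: a word list that shares prefixes

Imagine a very clever dictionary:

Ordinary dictionary: every word is written out in full

apple
application
apply
banana
band

Radix-tree dictionary: words share their common beginnings

appl
|- e
|- ication
|- y
ban
|- ana
|- d

How it works:

Find the common prefix: when a new word arrives, see how much of its beginning matches an existing entry
Share what can be shared: if the beginnings match, that part is stored once
Branch where they differ: a different beginning starts a new branch

Insertion example:
Already stored: "apple"
Insert: "application"

Common prefix found: "appl"
Split "apple" into: "appl" + "e"
New word: "appl" + "ication"
Structure now:

appl
|- e (the rest of apple)
|- ication (the rest of application)

Then insert "apply":

Common prefix: "appl", which matches the whole existing edge
Remaining part of the new word: "y"
Neither "e" nor "ication" shares a prefix with "y"
So "y" becomes a new branch under "appl"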

2. Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct RadixNode {
    char* edge; // edge label
    int is_end; // whether a key ends at this node
    struct RadixNode** children;
    int child_count;
} RadixNode;

RadixNode* create_radix_node(const char* edge, int is_end) {
    RadixNode* node = (RadixNode*)malloc(sizeof(RadixNode));
    node->edge = strdup(edge);
    node->is_end = is_end;
    node->children = NULL;
    node->child_count = 0;
    return node;
}

// Length of the longest common prefix of s1 and s2
size_t find_common_prefix(const char* s1, const char* s2) {
    size_t i = 0;
    while (s1[i] && s2[i] && s1[i] == s2[i]) {
        i++;
    }
    return i;
}

// Insert a key into the radix tree
void radix_insert(RadixNode** root, const char* key) {
    if (*root == NULL) {
        // The root carries an empty edge so that every key, including the
        // first one, is stored as a child and takes part in later splits.
        *root = create_radix_node("", 0);
    }

    RadixNode* current = *root;
    const char* remaining = key;

    while (1) {
        int found = 0;

        // Check every child of the current node
        for (int i = 0; i < current->child_count; i++) {
            RadixNode* child = current->children[i];
            size_t common = find_common_prefix(remaining, child->edge);
            if (common == 0)
                continue;

            if (common == strlen(child->edge)) {
                // The whole edge matches
                if (common == strlen(remaining)) {
                    // The whole key matches as well
                    child->is_end = 1;
                    return;
                }
                // Descend into this child with the rest of the key
                remaining += common;
                current = child;
                found = 1;
                break;
            }

            // Partial match: split the edge at the common prefix
            char* common_part = strndup(child->edge, common);
            char* rest_child = strdup(child->edge + common);
            const char* rest_key = remaining + common;

            RadixNode* new_node = create_radix_node(common_part, 0);
            free(common_part);

            // The old child keeps only the part after the common prefix
            free(child->edge);
            child->edge = rest_child;

            if (*rest_key == '\0') {
                // The key ends exactly at the split point
                new_node->is_end = 1;
                new_node->children = (RadixNode**)malloc(sizeof(RadixNode*));
                new_node->children[0] = child;
                new_node->child_count = 1;
            } else {
                // The rest of the key becomes a second child
                RadixNode* new_child = create_radix_node(rest_key, 1);
                new_node->children = (RadixNode**)malloc(2 * sizeof(RadixNode*));
                new_node->children[0] = child;
                new_node->children[1] = new_child;
                new_node->child_count = 2;
            }

            // Put the new node where the old child used to be
            current->children[i] = new_node;
            return;
        }

        if (!found) {
            // No child shares a prefix: add a new child
            current->children = (RadixNode**)realloc(current->children,
                (current->child_count + 1) * sizeof(RadixNode*));
            current->children[current->child_count] = create_radix_node(remaining, 1);
            current->child_count++;
            return;
        }
    }
}

// Search for a key
int radix_search(RadixNode* root, const char* key) {
    if (root == NULL) return 0;

    RadixNode* current = root;
    const char* remaining = key;

    while (1) {
        int found = 0;
        for (int i = 0; i < current->child_count; i++) {
            RadixNode* child = current->children[i];
            size_t common = find_common_prefix(remaining, child->edge);
            if (common == strlen(child->edge)) {
                if (common == strlen(remaining)) {
                    return child->is_end;
                }
                remaining += common;
                current = child;
                found = 1;
                break;
            }
        }
        if (!found) return 0;
    }
}

// Print the radix tree, one edge per line, indented by depth
void print_radix_tree(RadixNode* node, int depth) {
    if (node == NULL) return;
    for (int i = 0; i < depth; i++) printf("  ");
    printf("'%s'%s\n", node->edge, node->is_end ? " ✓" : "");
    for (int i = 0; i < node->child_count; i++) {
        print_radix_tree(node->children[i], depth + 1);
    }
}

void test_radix_tree() {
    printf("=== One-to-many with a radix tree ===\n");
    RadixNode* root = NULL;
    char* keys[] = {"romane", "romanus", "romulus", "rubens", "ruber", "rubicon", "rubicundus"};
    int n = sizeof(keys)/sizeof(keys[0]);

    for (int i = 0; i < n; i++) {
        radix_insert(&root, keys[i]);
    }

    printf("Radix tree structure:\n");
    print_radix_tree(root, 0);

    printf("\nSearch tests:\n");
    printf("Search 'romane': %s\n", radix_search(root, "romane") ? "found" : "not found");
    printf("Search 'roman': %s\n", radix_search(root, "roman") ? "found" : "not found");
    printf("Search 'rubicon': %s\n", radix_search(root, "rubicon") ? "found" : "not found");
    printf("\n");
}

int main() {
    test_radix_tree();
    return 0;
}

$ gcc test.c ; ./a.out 
=== One-to-many with a radix tree ===
Radix tree structure:
''
  'r'
    'om'
      'an'
        'e' ✓
        'us' ✓
      'ulus' ✓
    'ub'
      'e'
        'ns' ✓
        'r' ✓
      'ic'
        'on' ✓
        'undus' ✓

Search tests:
Search 'romane': found
Search 'roman': not found
Search 'rubicon': found

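Radix tree advantages:

Space-efficient, because common prefixes are shared
Search is O(k), where k is the length of the key
Well suited to storing and searching string keys
Supports prefix search and range queries

Radix tree disadvantages:

Complex to implement, insertion in particular
The string-edge form shown here does not suit non-string keys directly (integer-keyed variants, like the kernel's, split the key into fixed-size bit chunks instead)
Memory management is complex, with frequent allocation and freeing

Typical uses:

IP routing tables (longest-prefix match)
Dictionary implementations
Autocomplete systems
String search and matching
Genome sequence analysis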

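4. Summary and Comparison

1. Plain tree

Like an ordinary family: no particular rules, so it may grow crooked
Advantage: simple and unconstrained
Disadvantage: can become very inefficient

2. Red-black tree

Like an aristocratic family: strict house rules keep it elegantly balanced
Advantage: stays efficient and never degenerates
Disadvantage: many rules, complex to implement
Used in: C++ map/set, database indexes

3. Radix tree

Like a clever dictionary: shared beginnings save space
Advantage: space-efficient, especially for strings
Disadvantage: very complex to implement
Used in: IP routing tables, input-method suggestions, search engines

4. Which one to use when?

General needs: a plain tree or a red-black tree (from an existing library)
String keys: consider a radix tree
Prefix search (for example, typing "app" lists every word starting with app): a radix tree is the natural fit
Guaranteed worst-case performance: a red-black tree

Remember: the red-black tree is the master of balance, and the radix tree is the expert at saving space!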

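3.2 Kernel Space

The Linux kernel uses the following structures to express one-to-many relationships:

list_head: a linked list; one object can sit on several lists at once (task_struct, for example, has several list_head fields).
hlist: suited to hash buckets (one key → many objects).
rbtree: many nodes hang in one tree (an index structure).
xarray: one index (key) stores a pointer, which can point to a list or another collection, giving one-to-many.

For example:

One process (task_struct) → many child processes (task_struct->children) → list_head
One hash key → many sockets → hlist
One set of expiry times → many timers → rbtree
One file inode → many page cache pages → xarray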

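1. `list_head`: circular doubly linked list

1. Where it is used

The most common list structure in the kernel, used to maintain ordered collections such as:
- the process list (task_struct->tasks)
- page lists in memory management (struct page->lru)
- timer lists (struct timer_list in older kernels; recent kernels link timers with hlist_node)
Characteristics: doubly linked and circular, so insertion and deletion are easy.

2. How it works

Node definition:

struct list_head {
    struct list_head *next, *prev;
};

Any struct that embeds a list_head field can be put on a list.
The kernel provides macros and inline functions such as INIT_LIST_HEAD(), list_add(), list_del(), and list_for_each_entry().

How data is stored

Every node embeds a struct list_head (two pointers, next and prev), and the list is joined end to end (circular).
The list stores only the links between nodes; the data lives in the host struct and is reached with container_of, the typical kernel idiom.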

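Is there an order?

Yes, a definite one. Traversal returns elements in the order of the list pointers. The insertion point (head or tail) determines the order; the list never sorts itself (unless you explicitly insert by key).

Suitable kernel scenarios and why

Scenarios that need an ordered collection, traversed in insertion order or a custom order:
- process queues (task_struct->tasks), device lists, LRU lists, driver-internal queues, and so on.
Reasons:
- Insertion and deletion at a known node are O(1) (only pointers change), so they are very cheap.
- The ordering semantics are intuitive and the code is readable.
- Allocation and freeing are straightforward (the node is usually part of the host struct).

Performance / complexity / concurrency / memory

Insert/delete (at a known position): O(1); lookup or search by key: O(n).
Memory: two pointers per node (16 bytes on 64-bit).
Concurrency: not safe by default; it usually needs a spinlock, a mutex, or RCU. The kernel also provides RCU variants of the access macros (list_for_each_entry_rcu and others) for read-mostly workloads that meet their requirements.

Typical trade-offs

Advantages: simple, low overhead, clear ordering semantics.
Disadvantages: unsuitable for random lookup or for locating an element quickly by key.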

3. Demo

#include <linux/module.h>
#include <linux/init.h>
#include <linux/list.h>
#include <linux/slab.h>

struct mynode {
    int value;
    struct list_head list;
};

static LIST_HEAD(mylist);

static int __init list_demo_init(void)
{
    int i;
    struct mynode *node, *tmp;

    pr_info("list_demo: init\n");
    for (i = 0; i < 5; i++) {
        node = kmalloc(sizeof(*node), GFP_KERNEL);
        if (!node)
            break;  /* stop adding on allocation failure */
        node->value = i;
        list_add_tail(&node->list, &mylist);
    }

    list_for_each_entry(node, &mylist, list)
        pr_info("node->value=%d\n", node->value);

    /* Unlink and free; the _safe variant allows deletion while iterating */
    list_for_each_entry_safe(node, tmp, &mylist, list) {
        list_del(&node->list);
        kfree(node);
    }
    return 0;
}

static void __exit list_demo_exit(void)
{
    pr_info("list_demo: exit\n");
}

module_init(list_demo_init);
module_exit(list_demo_exit);
MODULE_LICENSE("GPL");

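2. `hlist`: singly headed hash list

1. Where it is used

Used for hash table buckets, where it halves the size of each bucket head.
Applications:
- the kernel hash table API (linux/hashtable.h)
- the networking subsystem (conntrack, ARP cache)

2. How it works

Node definition:

struct hlist_node {
    struct hlist_node *next, **pprev;
};
struct hlist_head {
    struct hlist_node *first;
};

The head holds a single pointer, yet pprev still gives O(1) deletion.

How data is stored

Every node contains a struct hlist_node { struct hlist_node *next; struct hlist_node **pprev; }, and the head is a struct hlist_head { struct hlist_node *first; }.
pprev points at the previous node's next pointer, or at the head's first pointer for the first node (a pointer to a pointer), which allows O(1) deletion without a full prev pointer.

Is there an order?

There is an order along the chain, and traversal follows the link order (hlist_add_head, the usual choice, inserts at the head, so traversal is normally last-in-first-out unless nodes are appended at the tail).
In a hash table, however, the order within a bucket is usually not treated as meaningful.

Suitable kernel scenarios and why

Typically used for hash buckets (DEFINE_HASHTABLE, hash_add, and related wrappers use hlist underneath), because:
- The bucket head is one pointer instead of the two a list_head needs, which matters when a table has many buckets. The node itself still holds two pointers (next and pprev).
- A single bucket chain is usually short, lookup is a walk of the bucket, and insertion and deletion are fast.
Common users: the ARP table, the inode hash, socket/connection hashes, and many other hash table implementations.

Performance / complexity / concurrency / memory

Insert/delete: O(1). Lookup: O(bucket_size) (close to O(1) on average if the hash distributes keys evenly).
Memory: two pointers per node (next + pprev) and one pointer per bucket head.
Concurrency: usually combined with a spinlock or RCU. The kernel provides macros such as hash_for_each_rcu for lock-free read paths (the write path still needs protection).

Typical trade-offs

Advantages: a natural fit for hashing, compact bucket heads, O(1) deletion.
Disadvantages: no ordering within a bucket (ordering semantics must be maintained separately if needed).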

3. Demo (hash table)

#include <linux/module.h>
#include <linux/init.h>
#include <linux/hashtable.h>
#include <linux/slab.h>

struct mynode {
    int key;
    int value;
    struct hlist_node node;
};

DEFINE_HASHTABLE(myhash, 4); /* 2^4 = 16 buckets */

static int __init hlist_demo_init(void)
{
    int i, bkt;
    struct mynode *n;
    struct hlist_node *tmp;

    pr_info("hlist_demo: init\n");
    for (i = 0; i < 5; i++) {
        n = kmalloc(sizeof(*n), GFP_KERNEL);
        if (!n)
            break;  /* stop adding on allocation failure */
        n->key = i;
        n->value = i * 10;
        hash_add(myhash, &n->node, n->key);
    }

    /* Read-only walk over every bucket */
    hash_for_each(myhash, bkt, n, node)
        pr_info("key=%d, value=%d\n", n->key, n->value);

    /* The _safe variant allows deletion while iterating */
    hash_for_each_safe(myhash, bkt, tmp, n, node) {
        hash_del(&n->node);
        kfree(n);
    }
    return 0;
}

static void __exit hlist_demo_exit(void)
{
    pr_info("hlist_demo: exit\n");
}

module_init(hlist_demo_init);
module_exit(hlist_demo_exit);
MODULE_LICENSE("GPL");

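3. `rbtree`: red-black tree

1. Where it is used

A balanced binary search tree with O(log N) lookup.
Applications:
- virtual memory area management (mm->mm_rb in kernels before 6.1; newer kernels use a maple tree)
- the CFS scheduler's red-black tree (sched_entity->run_node)
- the high-resolution timer red-black tree

2. How it works

Kernel definition:

struct rb_node {
    unsigned long __rb_parent_color;
    struct rb_node *rb_right;
    struct rb_node *rb_left;
};
struct rb_root {
    struct rb_node *rb_node;
};

rb_link_node() followed by rb_insert_color() inserts and rebalances; rb_erase() deletes.

How data is stored

Nodes are organized by key as a binary search tree that maintains the red-black balance properties (rotations and recoloring after insert and delete).
Each node has left and right child pointers plus a combined parent and color word (the kernel packs parent and color into __rb_parent_color).

Is there an order?

Strictly ordered by key. An in-order traversal returns nodes in ascending key order.
This suits scenarios that need sorting by key or range queries (by address, by timestamp, and so on).

Suitable kernel scenarios and why

Scenarios that need sorted access or range/interval queries:
- virtual memory areas (VMAs sorted by start address, in kernels before 6.1)
- structures sorted by key or needing fast predecessor/successor lookup (the scheduler's run queue, timer queues)
Reasons:
- Lookup, insert, and delete are all O(log n), and predecessor/successor lookup makes range searches fast.
- A good fit for dynamic collections that must stay sorted.

Performance / complexity / concurrency / memory

Lookup/insert/delete: O(log n).
Memory: three words per node, namely two child pointers and the parent word, with the color bit packed into the parent field.
Concurrency: rbtree has no built-in concurrency control; the caller normally protects it with a lock (the mmap read-write lock, a spinlock, and so on). RCU-based variants and tricks exist but are not the default.

Typical trade-offs

Advantages: keeps a global order and supports range operations and predecessor/successor queries.
Disadvantages: more complex than a list or hash table; access by integer index is less direct than with xarray.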

3. Demo

#include <linux/module.h>
#include <linux/init.h>
#include <linux/rbtree.h>
#include <linux/slab.h>

struct mynode {
    int key;
    struct rb_node node;
};

static struct rb_root mytree = RB_ROOT;

static int __init rbtree_demo_init(void)
{
    int i;
    struct mynode *n, *cur;

    pr_info("rbtree_demo: init\n");
    for (i = 0; i < 5; i++) {
        struct rb_node **link = &mytree.rb_node, *parent = NULL;

        n = kmalloc(sizeof(*n), GFP_KERNEL);
        if (!n)
            break;  /* stop adding on allocation failure */
        n->key = i;

        /* Walk down to the empty slot where the new key belongs */
        while (*link) {
            parent = *link;
            cur = rb_entry(parent, struct mynode, node);
            if (n->key < cur->key)
                link = &(*link)->rb_left;
            else
                link = &(*link)->rb_right;
        }
        rb_link_node(&n->node, parent, link);
        rb_insert_color(&n->node, &mytree);  /* rebalance and recolor */
    }

    /* In-order walk: keys come out in ascending order */
    for (cur = rb_entry_safe(rb_first(&mytree), struct mynode, node);
         cur;
         cur = rb_entry_safe(rb_next(&cur->node), struct mynode, node))
        pr_info("key=%d\n", cur->key);

    /* Repeatedly remove the smallest node until the tree is empty */
    while ((cur = rb_entry_safe(rb_first(&mytree), struct mynode, node))) {
        rb_erase(&cur->node, &mytree);
        kfree(cur);
    }
    return 0;
}

static void __exit rbtree_demo_exit(void)
{
    pr_info("rbtree_demo: exit\n");
}

module_init(rbtree_demo_init);
module_exit(rbtree_demo_exit);
MODULE_LICENSE("GPL");

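4. `xarray`: the successor to the kernel radix tree

1. Where it is used

Efficient storage for sparse indexes, suited to very large numbers of objects.
Applications:
- the page cache (mapping->i_pages)
- the IDR/IDA implementation (allocating object IDs)

2. How it works

xarray is built on the radix tree and supports concurrency and RCU.
Common API: xa_store(), xa_load(), xa_erase(), xa_for_each().

How data is stored

Objects are stored by integer index (xa_store(&xa, index, ptr, GFP_KERNEL)). Internally the nodes form a multi-way radix tree, which supports a very large, sparse index space.
The design takes concurrency (RCU and locking working together) and memory efficiency into account; it replaces many radix tree uses and offers a safer interface.

Is there an order?

Yes, by index (xa_for_each walks indexes in ascending order). The index is the implicit sort key: unlike a list's insertion order, this is index order.

Suitable kernel scenarios and why

Scenarios that map an integer index → object where the index space is large but sparse:
- the page cache (mapping->i_pages): a large number of pages stored by page index.
- ID-to-object mappings (as a replacement for, or the basis of, IDR/IDA).
Reasons:
- Efficient random lookup by index and efficient in-order or range traversal by index (far faster than a list).
- Internally a tree, but with tightly bounded height, so lookups and traversals are very fast, and concurrent reads are supported (RCU and atomic-load-safe APIs are provided).
- Very economical for large sparse spaces (no memory is allocated for unused index ranges).

Performance / complexity / concurrency / memory

Lookup/insert/delete cost depends on the tree height, which is usually small, so behavior falls between O(1) and O(log n); it performs very well for the page cache and similar workloads.
Memory: the overhead per entry depends on how tree nodes are allocated (it is not just a few pointers), but overall it uses far less memory than a naive large array.
Concurrency: xarray provides a concurrency-friendly API; the read path can load entries under RCU, and writes are protected by an internal lock, which suits highly concurrent workloads.

Typical trade-offs

Advantages: fast random access by index, in-order traversal by index, built-in concurrency support.
Disadvantages: more complex implementation; unsuitable for ordering by arbitrary keys (only integer indexes are natural).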

3. Demo

#include <linux/module.h>
#include <linux/init.h>
#include <linux/xarray.h>

DEFINE_XARRAY(myxa);

#define XARRAY_SIZE 100

static int __init xarray_demo_init(void)
{
    unsigned long i;
    void *p;

    pr_info("xarray_demo: init\n");

    /*
     * Store small integers as value entries. xa_mk_value() tags the
     * integer so it is not mistaken for a pointer, and so that storing
     * 0 at index 0 is not treated as storing NULL (which erases).
     */
    for (i = 0; i < XARRAY_SIZE; i++)
        xa_store(&myxa, i, xa_mk_value(i * 456), GFP_KERNEL);

    /* Walk the present entries in ascending index order */
    xa_for_each(&myxa, i, p)
        pr_info("index=%lu, value=%lu\n", i, xa_to_value(p));

    for (i = 0; i < XARRAY_SIZE; i++)
        xa_erase(&myxa, i);
    return 0;
}

static void __exit xarray_demo_exit(void)
{
    pr_info("xarray_demo: exit\n");
}

module_init(xarray_demo_init);
module_exit(xarray_demo_exit);
MODULE_LICENSE("GPL");

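5. Summary

1. Quick reference table (for fast selection)

Data structure	Ordering	Lookup (by key)	Insert/delete	Memory overhead (per node)	Concurrency	Typical uses (examples)
`list_head`	Insertion/link order	O(n)	O(1) (known node)	2 pointers	Needs a lock, or RCU for readers (RCU macros exist)	Device lists, LRU, process queues
`hlist`	Link order within a bucket	O(bucket_size)	O(1)	2 pointers (`next` + `pprev`); 1-pointer head	Usually hash + spinlock/RCU	Hash buckets (ARP, conntrack, inode hash)
`rbtree`	Strictly sorted by key	O(log n)	O(log n)	2 child pointers + parent+color	Needs an external lock (read/write protection)	VMAs (before 6.1), indexes sorted by address/time
`xarray`	Integer index order	Near O(1)/O(log n) (shallow tree)	Near O(1), depends on tree height	Node-based structure (sparse-friendly)	Built-in locking, RCU-friendly API	Page cache (i_pages), ID → object mapping

2. How to choose (practical advice)

Need traversal in insertion order, or fast insert/delete in both directions → use list_head (simple, intuitive).
Need a hash index, where lookup hashes the key to a bucket → use hlist (with the kernel hashtable API).
Need sorting by key or range/predecessor/successor queries → use rbtree (the classic VMA tree, for example).
Need a large, sparse integer index with read-mostly or highly concurrent access → use xarray (page cache, ID mapping).

3. Expressing a one-to-many relationship

Simplest: put struct list_head children; in the parent and embed a list_head in every child; the parent maintains the one-to-many relationship through the list (common and intuitive).
Hashed one-to-many: key → multiple entries in a hash bucket (hlist).
Indexed one-to-many: an xarray index stores a pointer to a list (or another container), for example xa_store(&xa, idx, children, GFP_KERNEL), so that several elements hang under each index.
Sorted collection with duplicate keys: keep a list in each rbtree key node, or map (key, subkey) into the tree (depending on performance needs).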

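IV. Many-to-Many

How can a many-to-many relationship be expressed in C?

For example:
Students and courses are many-to-many: one student can take several courses, and one course can have many students.

4.1 User Space

1. Relation table (two-dimensional array / matrix)

The most direct approach is a two-dimensional array.

#include <stdio.h>

#define MAX_STUDENTS 5
#define MAX_COURSES 4

int main() {
    // Relation matrix: 1 means related (enrolled), 0 means unrelated
    int student_course[MAX_STUDENTS][MAX_COURSES] = {0};

    // Student 0 takes courses 1 and 2
    student_course[0][1] = 1;
    student_course[0][2] = 1;

    // Student 2 takes course 0
    student_course[2][0] = 1;

    // Print the student-course relations
    for (int i = 0; i < MAX_STUDENTS; i++) {
        printf("Student %d -> ", i);
        for (int j = 0; j < MAX_COURSES; j++) {
            if (student_course[i][j]) {
                printf("Course %d ", j);
            }
        }
        printf("\n");
    }
    return 0;
}

This approach is intuitive and fits dense relationships over a small, bounded number of objects.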

2. Intermediate struct (association table)

When there are many students and courses, the matrix wastes space, so use an association table (like the junction table of a many-to-many relationship in a database).

#include <stdio.h>

#define MAX_RELATIONS 20

typedef struct {
    int student_id;
    int course_id;
} Relation;

int main() {
    Relation relations[MAX_RELATIONS];
    int count = 0;

    // Add relations
    relations[count++] = (Relation){.student_id = 0, .course_id = 1};
    relations[count++] = (Relation){.student_id = 0, .course_id = 2};
    relations[count++] = (Relation){.student_id = 2, .course_id = 0};

    // Query: which courses does student 0 take?
    printf("Student 0 -> ");
    for (int i = 0; i < count; i++) {
        if (relations[i].student_id == 0) {
            printf("Course %d ", relations[i].course_id);
        }
    }
    printf("\n");
    return 0;
}

This approach is more flexible and fits sparse relationships over large data sets.

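3. Linked list / dynamic structure

To avoid a preset fixed size, store the relations in a linked list or other dynamically allocated memory.

#include <stdio.h>
#include <stdlib.h>

// Relation node
typedef struct RelationNode {
    int course_id;
    struct RelationNode* next;
} RelationNode;

// Each student keeps a linked list of the courses taken
#define MAX_STUDENTS 5
RelationNode* student_courses[MAX_STUDENTS] = {NULL};

// Add a relation (inserted at the head of the student's list)
void add_relation(int student_id, int course_id) {
    RelationNode* newNode = malloc(sizeof(RelationNode));
    if (newNode == NULL)
        return;  // allocation failed; nothing is added
    newNode->course_id = course_id;
    newNode->next = student_courses[student_id];
    student_courses[student_id] = newNode;
}

void print_relations(int student_id) {
    printf("Student %d -> ", student_id);
    RelationNode* cur = student_courses[student_id];
    while (cur) {
        printf("Course %d ", cur->course_id);
        cur = cur->next;
    }
    printf("\n");
}

int main() {
    add_relation(0, 1);
    add_relation(0, 2);
    add_relation(2, 0);

    print_relations(0);
    print_relations(2);
    return 0;
}

This approach scales best and fits relationships that change at run time.

4. Summary

Two-dimensional array (matrix): fits dense relationships over a small number of objects.
Array of intermediate structs (relation table): fits sparse relationships and is easy to traverse and search.
Linked list / dynamic structure: fits dynamic relationships whose number is not fixed.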

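4.2 Kernel Space

Common ways of expressing many-to-many relationships in the kernel:

Bitmap / array matrix
- For dense relationships over a bounded number of objects.
- Example: cpumask_t (which CPUs a process is allowed to run on).
Intermediate table (relation pairs)
- An array of structs records the pairing between two kinds of objects.
- Example: fdtable (task ↔ file), loosely: an array of struct file pointers indexed by file descriptor.
Linked list
- Each object keeps a list whose nodes record the relationship with the other side.
- Example: struct inode and struct dentry (an inode keeps a list of the dentries that refer to it).
Hash table / Radix Tree / XArray
- For large-scale mappings with fast lookup.
- Example: the page cache (inode ↔ page).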

1. Bitmap / two-dimensional array

// demo_bitmap.c
#include <linux/module.h>
#include <linux/kernel.h>

#define MAX_TASKS 4
#define MAX_CPUS  4

/* Static storage, so the matrix starts out zeroed */
static int task_cpu_map[MAX_TASKS][MAX_CPUS];

static int __init demo_init(void)
{
    int i, j;

    /* Simulate task-to-CPU bindings */
    task_cpu_map[0][0] = 1;
    task_cpu_map[0][1] = 1;
    task_cpu_map[2][3] = 1;

    pr_info("bitmap demo start:\n");
    for (i = 0; i < MAX_TASKS; i++) {
        pr_info("Task %d -> ", i);
        for (j = 0; j < MAX_CPUS; j++) {
            if (task_cpu_map[i][j])
                pr_cont("CPU%d ", j);
        }
        pr_cont("\n");
    }
    return 0;
}

static void __exit demo_exit(void)
{
    pr_info("bitmap demo exit\n");
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

Scenarios:

The kernel's cpumask_t follows the same idea, using one bit per CPU instead of one int.
Used for dense mappings over a bounded number of elements.

2. Intermediate table (array of relation pairs)

// demo_table.c
#include <linux/module.h>
#include <linux/kernel.h>

#define MAX_RELATIONS 10

struct relation {
    int pid;
    int fd;
};

static struct relation relation_table[MAX_RELATIONS];
static int relation_count;

static int __init demo_init(void)
{
    int i;

    relation_count = 0;
    relation_table[relation_count++] = (struct relation){100, 3};
    relation_table[relation_count++] = (struct relation){100, 4};
    relation_table[relation_count++] = (struct relation){200, 5};

    pr_info("relation table demo:\n");
    for (i = 0; i < relation_count; i++) {
        pr_info("PID %d <-> FD %d\n",
                relation_table[i].pid, relation_table[i].fd);
    }
    return 0;
}

static void __exit demo_exit(void)
{
    pr_info("relation table demo exit\n");
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

Scenarios:

fdtable (task ↔ file).
Fits sparse relationships that can be enumerated and traversed.

3. Linked list

// demo_list.c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/list.h>

struct course {
    int cid;
    struct list_head list;
};

static LIST_HEAD(student_courses);

static int __init demo_init(void)
{
    struct course *c;
    int i;

    /* Simulate adding courses */
    for (i = 101; i < 104; i++) {
        c = kmalloc(sizeof(*c), GFP_KERNEL);
        if (!c)
            break;  /* stop adding on allocation failure */
        c->cid = i;
        list_add(&c->list, &student_courses);  /* insert at the head */
    }

    pr_info("list demo:\n");
    list_for_each_entry(c, &student_courses, list) {
        pr_info("Course %d\n", c->cid);
    }
    return 0;
}

static void __exit demo_exit(void)
{
    struct course *c, *tmp;

    list_for_each_entry_safe(c, tmp, &student_courses, list) {
        list_del(&c->list);
        kfree(c);
    }
    pr_info("list demo exit\n");
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

Scenarios:

inode ↔ dentry.
Fits a dynamic number of relations that change frequently.

4. Hash table

// demo_hash.c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/hashtable.h>
#include <linux/slab.h>

struct relation {
    int key1;
    int key2;
    struct hlist_node node;
};

DEFINE_HASHTABLE(rel_table, 4); /* 2^4 = 16 buckets */

static int __init demo_init(void)
{
    struct relation *r;
    int i;

    /* Insert relations: key1 = 10 is related to five key2 values */
    for (i = 0; i < 5; i++) {
        r = kmalloc(sizeof(*r), GFP_KERNEL);
        if (!r)
            break;  /* stop adding on allocation failure */
        r->key1 = 10;
        r->key2 = 100 + i;
        hash_add(rel_table, &r->node, r->key1);
    }

    pr_info("hash demo: relations of key1=10\n");
    /* Visits only the bucket key 10 hashes to; other keys can share it */
    hash_for_each_possible(rel_table, r, node, 10) {
        if (r->key1 == 10)
            pr_info("%d -> %d\n", r->key1, r->key2);
    }
    return 0;
}

static void __exit demo_exit(void)
{
    struct relation *r;
    struct hlist_node *tmp;
    int bkt;

    hash_for_each_safe(rel_table, bkt, tmp, r, node) {
        hash_del(&r->node);
        kfree(r);
    }
    pr_info("hash demo exit\n");
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

Scenarios:

The page cache (inode ↔ page), which current kernels index with an XArray rather than a hash table.
cgroup ↔ task.
Fits large-scale relationships that need efficient lookup.

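5. The many-to-many relationship in the input subsystem

The Linux input subsystem contains a textbook many-to-many relationship:

struct input_dev (input device)
- For example a keyboard, touchscreen, or gamepad.
struct input_handler (input handler)
- For example evdev (which provides /dev/input/eventX) and mousedev (the legacy mouse interface).

The relationship:

One input_dev can be served by several input_handlers (the same keyboard can be used by evdev and kbd at once).
One input_handler can bind to several input_devs (evdev can handle every input device).

How the kernel implements it:

Doubly linked lists plus an intermediate object:
- The intermediate object is struct input_handle, which represents one binding.
- input_dev has a struct list_head h_list (holding all of its handles).
- input_handler has a struct list_head h_list (holding all of its handles).
- Each input_handle is linked into both the device's list and the handler's list.

struct input_handle {
    void *private;
    int open;
    const char *name;
    struct input_dev *dev;
    struct input_handler *handler;
    struct list_head d_node; /* linked into dev->h_list */
    struct list_head h_node; /* linked into handler->h_list */
};

The result is a many-to-many relationship built from doubly linked lists.

Analogy:

input_dev = student
input_handler = course
input_handle = enrollment record (a student enrolled in a course)

Benefits:

Starting from a device, you can find all of its handlers.
Starting from a handler, you can find all of its devices.
The relationship stays maintainable as devices and handlers are added and removed dynamically.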

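6. Summary

Method	Kernel demo	Kernel use cases	Characteristics
Bitmap/array	demo_bitmap	cpumask, irq affinity	Simple and efficient, bounded number of elements
Intermediate table	demo_table	fdtable (task ↔ file)	Sparse relationships, easy to traverse
Linked list	demo_list	inode ↔ dentry, input_dev ↔ input_handler	Dynamic relationships, flexible
Hash table	demo_hash	page cache, cgroup ↔ task	Large-scale relationships, fast lookup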