
An Introduction to the Types in Python's collections Module (Python Counterparts of map, unordered_map, struct, Counters, and ChainMap)

news 2025/8/21 12:13:04

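A Guide to Python's `collections` Module (with C++ STL Analogies)

collections is the Python standard-library module that provides a set of specialized, high-performance container data types.
For readers who know C++, many of these types play the role of STL containers or variants of them, with the closest parallels among dictionaries, lists, and queues.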

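I. Overview (Python vs C++)

Python type	Summary	Closest C++ STL structure
`namedtuple`	Immutable tuple with named fields	`struct` (or `std::tuple` plus names)
`deque`	Double-ended queue, O(1) operations at both ends	`std::deque`
`Counter`	Frequency counter (a `dict` subclass)	`std::unordered_map<T,int>`
`OrderedDict`	Dictionary that preserves insertion order	No direct equivalent; `std::vector<pair>` kept in insertion order (`std::map` orders by key instead)
`defaultdict`	Dictionary with default values	`std::unordered_map` with automatic initialization
`ChainMap`	View over several mappings (searched in priority order)	No direct equivalent; several `map`s plus a lookup chain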

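II. Core types in detail (with C++ counterparts)

1. `namedtuple`: named tuples

A named tuple is an immutable data structure: it is as memory-efficient as a plain tuple, but its fields can also be accessed by name.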

from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x, p.y)  # 3 4
print(p[0], p[1])  # 3 4

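C++ analogy:

Closest to struct { int x; int y; };
It can also be seen as std::tuple<int,int> with an extra interface layer for access by name.

When to use:

Data objects that need named fields and never need to be modified; immutability makes them safer to share, and they stay as lightweight as tuples.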
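To make the immutability point concrete, here is a minimal sketch that reuses the illustrative `Point` type from above. The variable names (`mutated`, `q`, `as_dict`) are assumptions introduced only for this example.

```python
from collections import namedtuple

# Point is the same illustrative type used in the example above.
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)

# Assigning to a field fails: namedtuple instances are immutable.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a new instance and leaves the original untouched.
q = p._replace(x=42)

# _asdict converts to a mapping; unpacking works as with any tuple.
as_dict = p._asdict()
x, y = p

print(mutated, q, as_dict)
```

Because a named tuple is still a tuple, it can be used as a dictionary key or a set member, which a mutable C++-style struct wrapper would not give you for free.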

2. `deque`: double-ended queue

deque supports O(1) insertion and removal at both ends, whereas list.insert(0, x) and list.pop(0) are O(n).

from collections import deque

dq = deque([1, 2, 3])
dq.appendleft(0)
dq.append(4)
print(dq)  # deque([0, 1, 2, 3, 4])
dq.popleft()  # removes and returns 0

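C++ analogy:

std::deque<int>: typically implemented as a sequence of fixed-size blocks, so insertion at either end is efficient.

When to use:

Queues, stacks, sliding windows, BFS.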
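Two of the use cases listed above, BFS and sliding windows, can be sketched as follows. The function `bfs_order` and the sample `graph` are hypothetical names made up for this illustration; only `deque`, `popleft`, `append`, and the `maxlen` argument come from the standard library.

```python
from collections import deque

def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order.

    graph is assumed to be a dict mapping a node to a list of neighbours.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()  # O(1), unlike list.pop(0)
        order.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

# Hypothetical sample graph.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = bfs_order(graph, "a")  # ['a', 'b', 'c', 'd']

# A bounded deque keeps only the most recent maxlen items,
# which gives a simple sliding window.
window = deque(maxlen=3)
for value in [1, 2, 3, 4, 5]:
    window.append(value)  # the oldest item is dropped automatically

print(order, list(window))
```

The `maxlen` behaviour has no direct counterpart in `std::deque`; in C++ you would pop from the front by hand once the size limit is exceeded.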

3. `Counter`: counting elements

Counts how many times each element occurs in an iterable.

from collections import Counter

cnt = Counter("abracadabra")
print(cnt)  # Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
print(cnt.most_common(2))  # [('a', 5), ('b', 2)]

C++ analogy:

std::unordered_map<char, int> used to count frequencies.
std::map<char,int> can do the same job, but insertion and lookup cost O(log n).
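One difference from the C++ maps above is worth a small sketch: reading a missing key from a `Counter` returns 0 without inserting it, whereas `operator[]` on a C++ map inserts an entry. The variable names below are assumptions chosen for the example.

```python
from collections import Counter

cnt = Counter("abracadabra")

# update() adds further observations to the existing counts.
cnt.update("aaa")          # 'a' goes from 5 to 8

# Reading a missing key returns 0 and does not create the key.
missing = cnt["z"]         # 0
has_z = "z" in cnt         # False

total = sum(cnt.values())  # 11 original letters + 3 added = 14
top = cnt.most_common(1)   # [('a', 8)]

print(missing, has_z, total, top)
```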

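4. `OrderedDict`: ordered dictionary

A mapping that maintains insertion order.
It is often compared to std::map, but the two differ: std::map keeps its entries sorted by key, whereas OrderedDict keeps them in insertion order.
Since Python 3.7 a plain dict also preserves insertion order, but OrderedDict offers extra methods such as move_to_end.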

from collections import OrderedDict

od = OrderedDict()
od["b"] = 2
od["a"] = 1
od.move_to_end("b")  # order is now a, b

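C++ analogy:

std::map: stores entries sorted by key (typically a balanced tree), so it is not an insertion-order container.
To keep insertion order, use std::vector<pair> or Boost's multi_index_container.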
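The following sketch shows what OrderedDict still adds on top of a plain dict: reordering with `move_to_end`, removing from either end with `popitem`, and order-sensitive equality. The keys and variable names are illustrative assumptions.

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

od.move_to_end("a")              # 'a' moves to the back:  b, c, a
od.move_to_end("c", last=False)  # 'c' moves to the front: c, b, a
keys = list(od)                  # ['c', 'b', 'a']

# popitem(last=False) removes the first (oldest) entry.
oldest = od.popitem(last=False)  # ('c', 3)

# Equality between two OrderedDicts takes order into account;
# plain dicts compare equal regardless of order.
same_as_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)  # False
same_as_dict = dict(a=1, b=2) == dict(b=2, a=1)                   # True

print(keys, oldest, same_as_ordered, same_as_dict)
```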

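5. `defaultdict`: dictionary with default values

When a missing key is accessed, a factory function is called automatically to produce its default value.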

from collections import defaultdict

d = defaultdict(int)  # missing keys default to 0
d["x"] += 1   # creates d["x"] = 0 automatically, then adds 1

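C++ analogy:

std::unordered_map<K,V>, whose operator[] already value-initializes a missing key. Written out by hand, the logic is:

if (m.find(k) == m.end()) m[k] = V(); // explicit form of what m[k] does for a missing key

defaultdict(list) is like unordered_map<K, vector<V>>, where accessing a missing key creates an empty vector.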
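As a sketch of the `defaultdict(list)` pattern described above, the snippet below groups words by their first letter. The word list and variable names are made up for illustration. It also shows that, much like `operator[]` in C++, indexing a missing key inserts it, while `in` and `.get()` do not.

```python
from collections import defaultdict

# Hypothetical input data.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = defaultdict(list)
for w in words:
    groups[w[0]].append(w)  # the empty list is created on first access

# Membership tests and .get() do not create entries...
has_z = "z" in groups       # False
missing = groups.get("z")   # None
size_before = len(groups)   # 3

# ...but indexing does, just like m[k] on a C++ map.
_ = groups["z"]
size_after = len(groups)    # 4

print(dict(groups))
```

If a lookup must not grow the dictionary, prefer `in` or `.get()` over indexing.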

6. `ChainMap`: a view over multiple mappings

Treats several dicts as one logical mapping; a lookup searches them in order until a key is found.

from collections import ChainMap

defaults = {"lang": "en", "theme": "light"}
user = {"theme": "dark"}
cm = ChainMap(user, defaults)
print(cm["theme"])  # dark

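C++ analogy:

There is no directly equivalent STL container.
It can be emulated by holding references to several maps and searching them in order on each lookup.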
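Since a ChainMap is a view rather than a copy, it helps to see where writes land. The sketch below reuses the `defaults` and `user` dictionaries from the example above; `session` and the `"font"` key are assumptions added for illustration.

```python
from collections import ChainMap

defaults = {"lang": "en", "theme": "light"}
user = {"theme": "dark"}
cm = ChainMap(user, defaults)

# Writes always go to the first mapping; the later ones are untouched.
cm["lang"] = "fr"
# user is now {"theme": "dark", "lang": "fr"}; defaults["lang"] is still "en".

# The chain holds references, so later changes to an underlying dict show through.
defaults["font"] = "mono"
font = cm["font"]  # 'mono'

# new_child() puts a fresh mapping in front, e.g. for a temporary override.
session = cm.new_child({"theme": "high-contrast"})
session_theme = session["theme"]         # 'high-contrast'
parent_theme = session.parents["theme"]  # 'dark'

print(user, defaults, session_theme, parent_theme)
```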

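III. Use cases side by side

Scenario	Python collections	C++ STL analogy
Word frequency counting	`Counter`	`unordered_map<string, int>`
Grouping into default sets/lists	`defaultdict(list)`	`unordered_map<K, vector<T>>`
Emitting a mapping in insertion order	`OrderedDict`	`vector<pair<K,V>>` or Boost
LRU cache	`OrderedDict` + `move_to_end`	`list` + `unordered_map` (doubly linked list approach)
BFS queue	`deque`	`std::deque`
Layered configuration lookup	`ChainMap`	Hand-written ordered lookup across several maps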
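The LRU cache row in the table above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production cache: the class name `LRUCache` and its `get`/`put` methods are hypothetical, and only `OrderedDict`, `move_to_end`, and `popitem` are standard-library calls.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)  # new or updated keys become most recent
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # touching "a" makes "b" the least recently used
cache.put("c", 3)  # over capacity, so "b" is evicted

print(list(cache._data))  # ['a', 'c']
```

For caching function results, the standard library's `functools.lru_cache` decorator is usually the simpler choice; the sketch is only meant to mirror the `list` + `unordered_map` technique from the C++ column.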

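IV. Caveats (from a C++ perspective)

Python's dict is backed by a hash table, so dict ≈ unordered_map (average O(1) lookup).
OrderedDict is ordered by insertion, not by key (unlike C++ std::map).
Indexing a deque is O(1) at either end but O(n) toward the middle, unlike C++ std::deque, whose random access is O(1); neither container stores its elements in one contiguous block.
namedtuple is immutable; to change a field, use _replace(), which returns a new tuple: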
```
p = p._replace(x=42)
```
Counter supports multiset arithmetic (+, -, &, |), which you would have to write by hand with the C++ STL.
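The Counter arithmetic mentioned in the last caveat can be sketched as follows. The two input strings and the variable names are arbitrary choices for illustration.

```python
from collections import Counter

a = Counter("aabbbc")  # a: 2, b: 3, c: 1
b = Counter("abbd")    # a: 1, b: 2, d: 1

added = a + b       # counts are summed:       a: 3, b: 5, c: 1, d: 1
subtracted = a - b  # non-positive dropped:    a: 1, b: 1, c: 1
common = a & b      # minimum of each count:   a: 1, b: 2
union = a | b       # maximum of each count:   a: 2, b: 3, c: 1, d: 1

print(added, subtracted, common, union)
```

Note that the binary `-` operator discards zero and negative results; the in-place `subtract()` method keeps them.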

V. Example: C++-style word frequency counting in Python

from collections import Counter

def word_freq(text):
    words = text.split()
    freq = Counter(words)
    for k, v in freq.items():
        print(f"{k}: {v}")

word_freq("a b a c b a")

The C++ counterpart would be:

unordered_map<string,int> freq;
for (auto &w : words) freq[w]++;

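✅ Summary

If you know the C++ STL, you can think of dict as unordered_map, OrderedDict as a dictionary that remembers insertion order, and deque as the closest counterpart of std::deque (though indexing into the middle of a Python deque is O(n)).
defaultdict gives you what operator[] does on a C++ map: a missing key gets a default value automatically, with no need to write if(m.count(key)==0) m[key]=V(); yourself.
Counter is essentially a wrapper around unordered_map<T,int> that adds ranking, multiset arithmetic, and similar features.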