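C++ STL: Reading the vector Source Code | Implementing a vector Class (22 Subsections) | Source Code Included | An Unsafe Bitwise Copy Example
Previous article: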
https://blog.csdn.net/2401_86123468/article/details/154180446?spm=1001.2014.3001.5501
Code for this article and the vector source (download to use):
https://gitee.com/jxxx404/cpp-language-learning/commit/b0af8cc4c0581a3a879f4cb152b652499f2f6157
To make reading the code easier, this tool is recommended:
https://www.sourceinsight.com/
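Recommended book: 《STL源码剖析》 (The Annotated STL Sources)
1. Reading the vector source code
This source code is quite old and therefore relatively easy to follow. At this early stage it is enough to skim the overall structure of the library and to learn how to read source code.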

stl_construct.h
destroy calls the destructor explicitly.

construct is defined using placement new.
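As a rough illustration of the two ideas above, the sketch below separates allocation from construction. The names construct_at_raw, destroy_at_raw and construct_then_destroy_demo are hypothetical and chosen for this example; they are not the identifiers in stl_construct.h.

```cpp
#include <cstddef>
#include <new>      // placement new, ::operator new
#include <string>

// Hypothetical name: build a T1 in memory that has already been allocated.
template <class T1, class T2>
void construct_at_raw(T1* p, const T2& value)
{
    new (p) T1(value);   // placement new: runs the constructor, allocates nothing
}

// Hypothetical name: run the destructor only; the memory itself is not freed.
template <class T>
void destroy_at_raw(T* p)
{
    p->~T();
}

// Allocates raw memory, constructs a std::string in it, reads its length,
// destroys the string, and finally releases the raw memory.
inline std::size_t construct_then_destroy_demo()
{
    void* raw = ::operator new(sizeof(std::string));
    std::string* s = static_cast<std::string*>(raw);

    construct_at_raw(s, "hello");
    std::size_t len = s->size();
    destroy_at_raw(s);

    ::operator delete(raw);
    return len;
}
```

Keeping construction apart from allocation is what lets a vector hold unused capacity without creating objects in it.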

vector.h
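Core member variables: three iterators

start is a pointer to the beginning of the data
finish is a pointer to the end of the data
end_of_storage marks the end of the allocated space
Iterators:
Here the iterator is simply defined as a raw pointer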

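Core member functions
1. Constructor
start points to the first element, finish points one past the last element, and end_of_storage points one past the end of the allocated space.
This reading can be verified through the iterators together with size and capacity: size is finish - start, and capacity is end_of_storage - start.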
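The relationship between the three pointers can be checked with a small stand-in. RawLayout and make_layout are hypothetical names used only for this sketch, which assumes a caller-provided buffer of at least 8 ints.

```cpp
#include <cstddef>

// Hypothetical stand-in for the three members of the historical vector.
struct RawLayout
{
    int* start;
    int* finish;
    int* end_of_storage;

    std::size_t size() const     { return static_cast<std::size_t>(finish - start); }
    std::size_t capacity() const { return static_cast<std::size_t>(end_of_storage - start); }
};

// Describes a buffer with room for 8 ints, 3 of which are in use.
inline RawLayout make_layout(int* buffer)
{
    buffer[0] = 10;
    buffer[1] = 20;
    buffer[2] = 30;
    return RawLayout{ buffer, buffer + 3, buffer + 8 };
}
```

Because finish is one past the last element, the last element is read as *(finish - 1), and an empty container is simply start == finish.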
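2. push_back

2.1 insert_aux
When there is spare capacity, push_back appends directly at the end; otherwise it calls insert_aux.

insert also calls insert_aux.

The following lines from insert_aux deserve a closer look:
construct(finish, *(finish - 1));                  // copy-construct the last element into the first unused slot
++finish;
T x_copy = x;                                      // copy x first: it may refer to an element that is about to shift
copy_backward(position, finish - 2, finish - 1);   // shift [position, finish - 2) one slot to the right
*position = x_copy;                                // store the saved value in the gap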
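To see why the value is copied before the shift, the same steps can be replayed on a plain int array. This is only a sketch: insert_in_place is a hypothetical helper, and it assumes the buffer already has one spare slot and that pos is smaller than size.

```cpp
#include <algorithm>   // std::copy_backward
#include <cassert>
#include <cstddef>

// Inserts x before index pos in buf[0..size) and returns the new size.
// Assumes buf has room for at least size + 1 ints.
inline std::size_t insert_in_place(int* buf, std::size_t size, std::size_t pos, const int& x)
{
    assert(size > 0 && pos < size);
    int* position = buf + pos;
    int* finish = buf + size;

    *finish = *(finish - 1);   // duplicate the last element into the free slot
    ++finish;
    int x_copy = x;            // x may alias an element that is about to move
    std::copy_backward(position, finish - 2, finish - 1);
    *position = x_copy;
    return static_cast<std::size_t>(finish - buf);
}
```

If x refers to buf[3] in {1, 2, 3, 4} and the insertion point is index 1, the shift overwrites buf[3] with 3 before the final assignment. Reading x after the shift would insert 3, while the saved copy still holds 4.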

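2. Implementing a vector class
2.1 Initialization
Keep a template's declarations and definitions together in the header. Splitting them between a .h file and a .cpp file typically causes link errors, because the member functions cannot be instantiated where they are used.
namespace xxx
{
    template<class T>
    class vector
    {
    public:
        // typedef T* iterator; the using form on the next line is more capable
        using iterator = T*;

        vector()
            : _start(nullptr)
            , _finish(nullptr)
            , _end_of_storage(nullptr)
        { }

    private:
        iterator _start;
        iterator _finish;
        iterator _end_of_storage;
    };
}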
2.2 capacity
size_t capacity() const
{
    return _end_of_storage - _start;
}
2.3 size
size_t size() const
{
    return _finish - _start;
}
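2.4 reserve
The last subsection of this chapter (2.22) improves on this version.
void reserve(size_t n)
{
    if (n > capacity())
    {
        size_t sz = size();   // save the size before _start changes
        T* tmp = new T[n];
        if (_start)
        {
            memcpy(tmp, _start, sizeof(T) * sz);   // bitwise copy, revisited in 2.22
            delete[] _start;
        }
        _start = tmp;
        _finish = _start + sz;
        _end_of_storage = _start + n;
    }
}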
2.5 push_back
void push_back(const T& x)
{
    // full: grow before appending
    if (_finish == _end_of_storage)
    {
        reserve(capacity() == 0 ? 4 : capacity() * 2);
    }
    *_finish = x;
    ++_finish;
}
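The growth rule in push_back (start at 4, then double) can be traced without allocating anything. The helpers next_capacity and capacities_while_pushing are hypothetical names for this sketch; they only simulate the size and capacity counters.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the growth rule used by push_back above.
inline std::size_t next_capacity(std::size_t capacity)
{
    return capacity == 0 ? 4 : capacity * 2;
}

// Returns each new capacity reached while appending `count` elements one by one.
inline std::vector<std::size_t> capacities_while_pushing(std::size_t count)
{
    std::vector<std::size_t> seen;
    std::size_t size = 0;
    std::size_t capacity = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (size == capacity)   // same "full" test as _finish == _end_of_storage
        {
            capacity = next_capacity(capacity);
            seen.push_back(capacity);
        }
        ++size;
    }
    return seen;
}
```

Appending 10 elements grows the storage three times (to 4, 8 and 16), which is why doubling keeps the number of reallocations small.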
2.6 Basic iterators
using iterator = T*;
using const_iterator = const T*;

iterator begin()
{
    return _start;
}

iterator end()
{
    return _finish;
}

const_iterator begin() const
{
    return _start;
}

const_iterator end() const
{
    return _finish;
}
2.7 Testing
namespace xxx
{
    void test_vector1()
    {
        xxx::vector<int> v;
        v.push_back(1);
        v.push_back(2);
        v.push_back(3);
        v.push_back(4);
        v.push_back(5);

        for (auto e : v)
        {
            cout << e << " ";
        }
        cout << endl;
    }
}

int main()
{
    xxx::test_vector1();
    return 0;
}
2.8 Print
void Print(const vector<int>& v)
{
    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;
}

void test_vector1()
{
    xxx::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);

    Print(v);
}
2.9 operator[]
vector.h
T& operator[](size_t i)
{
    assert(i < size());
    return _start[i];
}

const T& operator[](size_t i) const
{
    assert(i < size());
    return _start[i];
}
Test.cpp
void Print(const vector<int>& v)
{
    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;

    for (size_t i = 0; i < v.size(); i++)
    {
        cout << v[i] << " ";
    }
    cout << endl;
}

void test_vector1()
{
    xxx::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);

    v[0]++;
    Print(v);
}
2.10 Destructor
~vector()
{
    if (_start)
    {
        delete[] _start;
        _start = _finish = _end_of_storage = nullptr;
    }
}
2.11 empty
bool empty() const
{
    return _start == _finish;
}
2.12 pop_back
void pop_back()
{
    assert(!empty());
    --_finish;
}
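2.13 insert
Note: the fix for iterator invalidation here cannot be to take pos by reference (adding &), because the library's declaration does not do so and this implementation follows the interface in the C++ standard.
void insert(iterator pos, const T& x)
{
    assert(pos >= _start);
    assert(pos <= _finish);

    // grow if full
    if (_finish == _end_of_storage)
    {
        size_t len = pos - _start;   // remember the offset: reserve invalidates pos
        reserve(capacity() == 0 ? 4 : capacity() * 2);
        pos = _start + len;
    }

    // shift elements one slot to the right; end never moves before _start
    iterator end = _finish;
    while (end > pos)
    {
        *end = *(end - 1);
        --end;
    }
    *pos = x;
    ++_finish;
}
Test. There is a risk of iterator invalidation here: after any operation that may trigger reallocation, never keep using an iterator obtained before it.
xxx::vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
//v.push_back(5);
Print(v);

v.insert(v.begin(), 0);
Print(v);

auto it = v.begin() + 3;
// after insert, it is invalidated
v.insert(it, 6);
Print(v);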
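2.13.1 Why iterators are invalidated
1. The memory address changes completely
std::vector<int> v = {1, 2, 3, 4};
auto it = v.begin() + 2; // it points at the memory address of element 3

// Suppose the current memory layout is:
// address 0x1000: [1]  address 0x1004: [2]  address 0x1008: [3]  address 0x100C: [4]
// it points at address 0x1008

v.push_back(5); // triggers reallocation

// After reallocation, all elements have been copied to a new memory region
// new address 0x2000: [1]  0x2004: [2]  0x2008: [3]  0x200C: [4]  0x2010: [5]
// it still points at 0x1008, but that is no longer the vector's data!

2. An iterator is essentially an abstraction of a pointer
// vector iterators are usually implemented as raw pointers
typedef T* iterator;

// your it is essentially a pointer
int* it = &v[2]; // points at a concrete memory address

// after reallocation the old memory is released and it becomes a dangling pointer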
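Two common ways to stay clear of a stale iterator are to use the iterator that insert returns, or to keep an index and rebuild the iterator afterwards. The sketch below uses std::vector; insert_and_read and read_after_growth are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Inserts value before index pos and reads it back through the iterator
// returned by insert, never through an iterator obtained earlier.
inline int insert_and_read(std::vector<int>& v, std::size_t pos, int value)
{
    assert(pos <= v.size());
    auto it = v.insert(v.begin() + static_cast<std::ptrdiff_t>(pos), value);
    return *it;   // valid: refers to the newly inserted element
}

// Keeps an index, which survives reallocation, instead of an iterator.
inline int read_after_growth(std::vector<int>& v, std::size_t index)
{
    assert(index < v.size());
    v.push_back(0);   // may reallocate and invalidate older iterators
    auto it = v.begin() + static_cast<std::ptrdiff_t>(index);   // fresh iterator
    return *it;
}
```

The index approach mirrors what insert does internally when it saves len = pos - _start before calling reserve.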
2.14 erase
vector.h
void erase(iterator pos)
{
    assert(pos >= _start);
    assert(pos < _finish);

    // shift the elements after pos one slot to the left
    iterator it = pos + 1;
    while (it != _finish)
    {
        *(it - 1) = *it;
        ++it;
    }
    --_finish;
}
Test: it is invalidated here as well
xxx::vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
v.push_back(5);
Print(v);

v.erase(v.begin());
Print(v);

auto it = v.begin() + 3;
v.erase(it);
Print(v);
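2.14.1 Iterator invalidation in the standard library
Test.cpp
void test_vector4()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);

    // Print(v);
    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;

    // remove all even numbers
    auto it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
        {
            v.erase(it);   // it is invalidated by this call
        }
        ++it;              // so using it here is an error
    }

    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;
}
Visual Studio checks this strictly: once erase has invalidated an iterator object, that iterator can no longer be accessed.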
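2.14.2 Fixing iterator invalidation
The standard library documents erase as returning an iterator to the element that follows the erased one.

So the test code should be changed to:
void test_vector4()
{
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);

    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;

    auto it = v.begin();
    while (it != v.end())
    {
        if (*it % 2 == 0)
        {
            it = v.erase(it);   // continue from the element after the erased one
        }
        else
        {
            ++it;
        }
    }

    for (auto e : v)
    {
        cout << e << " ";
    }
    cout << endl;
}
Accordingly, our own erase is changed to return an iterator:
iterator erase(iterator pos)
{
    assert(pos >= _start);
    assert(pos < _finish);

    iterator it = pos + 1;
    while (it != _finish)
    {
        *(it - 1) = *it;
        ++it;
    }
    --_finish;

    return pos;   // pos now refers to the element that followed the erased one
}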
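For std::vector there is also a way to remove all matching elements without a hand-written iterator loop: the erase-remove idiom. This is a different technique from the loop above, shown only as a sketch. It relies on the range overload of erase, which the xxx::vector in this article does not provide, and erase_evens is a hypothetical name.

```cpp
#include <algorithm>   // std::remove_if
#include <vector>

// Removes every even value using the erase-remove idiom.
inline void erase_evens(std::vector<int>& v)
{
    // remove_if moves the elements to keep to the front and returns the new logical end
    auto new_end = std::remove_if(v.begin(), v.end(),
                                  [](int x) { return x % 2 == 0; });
    // a single range erase drops the leftover tail
    v.erase(new_end, v.end());
}
```

Each surviving element is moved at most once, whereas the loop shifts the tail again for every erased element.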
2.15 resize
The interface here differs slightly from the one in string: the fill value is a T, and it defaults to T().

vector.h
void resize(size_t n, T val = T())
{
    if (n < size())
    {
        // shrink: drop the trailing elements
        _finish = _start + n;
    }
    else
    {
        reserve(n);
        while (_finish < _start + n)
        {
            *_finish = val;
            ++_finish;
        }
    }
}
Test.cpp
xxx::vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(4);
v.push_back(5);
v.push_back(6);
for (auto e : v)
{
    cout << e << " ";
}
cout << endl;

v.resize(3);

for (auto e : v)
{
    cout << e << " ";
}
cout << endl;

// v.resize(20);
v.resize(20, 5);
for (auto e : v)
{
    cout << e << " ";
}
cout << endl;
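The default argument T() in resize also works for built-in types, because T() value-initializes them to zero. A small sketch, with default_fill as a hypothetical helper name:

```cpp
#include <string>

// Hypothetical helper: returns the same value that `T val = T()` would supply.
template <class T>
T default_fill()
{
    return T();   // value-initialization: zero for built-in types, default constructor for classes
}
```

So v.resize(20) on a vector<int> would fill the new slots with 0, and on a vector of std::string with empty strings.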
2.16 initializer_list constructor
vector.h
vector(initializer_list<T> il)
{
    reserve(il.size());
    for (const auto& e : il)
    {
        push_back(e);
    }
}
Test.cpp
//xxx::vector<int> v3({ 10,20,30,40 });
xxx::vector<int> v3 = { 10,20,30,40 };
v1 = v3;
for (auto e : v1)
{
    cout << e << " ";
}
cout << endl;

// traditional style
vector<int> v;
v.push_back(1);
v.push_back(2);
v.push_back(3);

// modern style (shorter and more direct); replaces the four lines above
vector<int> v = {1, 2, 3};
2.17 clear
void clear()
{
    _finish = _start;
}
2.18 Iterator-range constructor
// A function template: the iterators need not be vector iterators; iterators of other containers work too
template <class InputIterator>
vector(InputIterator first, InputIterator last)
{
    // reserve(last - first);
    while (first != last)
    {
        push_back(*first);   // does not compile if InputIterator is deduced as int: an int cannot be dereferenced
        ++first;
    }
}
2.19 swap
void swap(vector<T>& v)
{
    std::swap(_start, v._start);
    std::swap(_finish, v._finish);
    std::swap(_end_of_storage, v._end_of_storage);
}
2.20 Deep copy
2.20.1 Copy constructor
vector.h
vector()   // members take their default initializers below
{ }

// copy constructor, traditional style
// v2(v1)
vector(const vector<T>& v)
{
    reserve(v.capacity());
    for (const auto& e : v)
    {
        push_back(e);
    }
}

private:
    iterator _start = nullptr;
    iterator _finish = nullptr;
    iterator _end_of_storage = nullptr;
2.20.2 Copy assignment operator
vector.h
// copy assignment operator, traditional style
// v0 = v1 = v3
// v1 = v3
vector<T>& operator=(const vector<T>& v)
{
    if (this != &v)
    {
        clear();
        reserve(v.capacity());
        for (const auto& e : v)
        {
            push_back(e);
        }
    }
    return *this;
}
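2.20.3 Modern style
// copy constructor, modern style
vector(const vector<T>& v)
{
    vector<T> tmp(v.begin(), v.end());
    swap(tmp);
}

// assignment operator, modern style
// v1 = v3
vector<T>& operator=(vector<T> tmp)   // tmp is a copy of the right-hand side
{
    swap(tmp);   // take tmp's storage; tmp's destructor releases the old one
    return *this;
}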
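The by-value assignment above is often called copy-and-swap. The self-contained sketch below shows the same pattern on a much smaller class. IntBuffer is a hypothetical type invented for this example and is not part of the article's vector.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>
#include <utility>    // std::swap

// Hypothetical minimal owner of a heap array, used only to show copy-and-swap.
class IntBuffer
{
public:
    explicit IntBuffer(std::size_t n, int fill = 0)
        : _data(new int[n]), _size(n)
    {
        for (std::size_t i = 0; i < n; ++i) _data[i] = fill;
    }

    // Deep copy: allocate a separate array and copy the elements.
    IntBuffer(const IntBuffer& other)
        : _data(new int[other._size]), _size(other._size)
    {
        std::copy(other._data, other._data + other._size, _data);
    }

    // The parameter is taken by value, so the copy constructor has already run.
    IntBuffer& operator=(IntBuffer tmp)
    {
        swap(tmp);
        return *this;
    }   // tmp's destructor releases the storage this object used to own

    ~IntBuffer() { delete[] _data; }

    void swap(IntBuffer& other)
    {
        std::swap(_data, other._data);
        std::swap(_size, other._size);
    }

    int& operator[](std::size_t i) { assert(i < _size); return _data[i]; }
    std::size_t size() const { return _size; }

private:
    int* _data;
    std::size_t _size;
};
```

No self-assignment check is needed: assigning an object to itself just swaps with a temporary copy, at the cost of one extra allocation.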
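2.21 Constructing from n copies of val
vector(size_t n, const T& val = T())
{
    reserve(n);
    for (size_t i = 0; i < n; i++)
    {
        push_back(val);
    }
}

// int overload: keeps vector<int>(10, 1) from matching the iterator-range template in 2.18
vector(int n, const T& val = T())
{
    reserve(n);
    for (int i = 0; i < n; i++)
    {
        push_back(val);
    }
}
Test.cpp
// size_t(10) selects the size_t overload on every platform;
// a bare 10u is ambiguous between the two overloads where size_t is wider than unsigned int
xxx::vector<int> v4(size_t(10), 1);
for (auto e : v4)
{
    cout << e << " ";
}
cout << endl;

xxx::vector<int> v5(10, 1);
for (auto e : v5)
{
    cout << e << " ";
}
cout << endl;

xxx::vector<char> v6(10, 'x');
for (auto e : v6)
{
    cout << e << " ";
}
cout << endl;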
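Why the extra int overload is needed can be checked with probe types whose constructors only record which overload was selected. ProbeWithoutInt, ProbeWithInt and Chosen are hypothetical names for this sketch, and the constructors deliberately do nothing else.

```cpp
#include <cstddef>

// Records which constructor overload resolution picked.
enum class Chosen { SizeT, Int, Range };

// Mirrors a vector<int> that has only the size_t overload and the range template.
struct ProbeWithoutInt
{
    Chosen chosen;

    ProbeWithoutInt(std::size_t, const int&) : chosen(Chosen::SizeT) {}

    template <class InputIterator>
    ProbeWithoutInt(InputIterator, InputIterator) : chosen(Chosen::Range) {}
};

// Same as above, plus the int overload added in this subsection.
struct ProbeWithInt
{
    Chosen chosen;

    ProbeWithInt(std::size_t, const int&) : chosen(Chosen::SizeT) {}
    ProbeWithInt(int, const int&) : chosen(Chosen::Int) {}

    template <class InputIterator>
    ProbeWithInt(InputIterator, InputIterator) : chosen(Chosen::Range) {}
};
```

With arguments (10, 1) the template is an exact match with InputIterator deduced as int, while the size_t overload needs a conversion, so the template wins unless an equally exact non-template overload exists. In the real vector that template then fails to compile at *first.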
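2.22 Type-unsafe bitwise copy in reserve
The failure here is caused by memcpy.

The inserted elements are strings, and a string usually holds a pointer to heap memory. memcpy copies only the pointer value, not the data it points to. When delete[] destroys the old elements and releases their memory, the pointers in the new objects are left dangling, and accessing those strings is undefined behavior.

The elements should therefore be transferred with their own copy operations rather than memcpy (here copy assignment, since new T[n] has already default-constructed the new elements). Swapping is an even better choice:
void reserve(size_t n)
{
    if (n > capacity())
    {
        size_t sz = size();   // save the size up front
        T* tmp = new T[n];
        if (_start)
        {
            for (size_t i = 0; i < sz; i++)
            {
                //tmp[i] = _start[i];           // for string, calls string's deep-copy assignment
                std::swap(tmp[i], _start[i]);   // for string, calls string's swap and exchanges the owned resources
            }
            //memcpy(tmp, _start, sizeof(T) * sz);
            delete[] _start;
        }
        _start = tmp;
        _finish = _start + sz;
        _end_of_storage = _start + n;
    }
}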
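Whether a bitwise copy is valid for a given element type can be queried at compile time with std::is_trivially_copyable. The sketch below is an assumption-laden variation, not the article's reserve: relocate and relocate_impl are hypothetical helpers, and the element-wise path uses move assignment in place of the swap shown above.

```cpp
#include <cstddef>
#include <cstring>      // std::memcpy
#include <string>
#include <type_traits>  // std::is_trivially_copyable
#include <utility>      // std::move

// Bitwise path: valid only for trivially copyable element types such as int.
template <class T>
void relocate_impl(T* dst, T* src, std::size_t n, std::true_type)
{
    if (n > 0) std::memcpy(dst, src, sizeof(T) * n);
}

// Element-wise path: each element hands over its own resources.
template <class T>
void relocate_impl(T* dst, T* src, std::size_t n, std::false_type)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = std::move(src[i]);
}

// Hypothetical helper: picks the path from the element type at compile time.
template <class T>
void relocate(T* dst, T* src, std::size_t n)
{
    relocate_impl(dst, src, n, std::is_trivially_copyable<T>());
}
```

For int the call resolves to the memcpy path; for std::string it resolves to the element-wise path, so no two objects ever share one heap buffer.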
End of this chapter.
