
KuiperInfer Follow-Along, Lesson 2: Building and Implementing Tensors

news Source: original 2025/6/7 15:54:48


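1. Row-major and column-major order

Start with how a tensor is defined and stored. Our Tensor is constructed as (channels, rows, cols), and its data is held in an fcube from the Armadillo library, which is a typedef of Cube<float>. A cube is a three-dimensional container constructed as (rows, cols, slices), where the slices play the role of channels, and it is stored in memory in column-major order. What does column-major mean here? Although a cube is three-dimensional (the three dimensions are really an abstraction over how memory is addressed), it lives in memory as one linear array, not as a grid. Column-major means that within each slice the elements of a column are contiguous: all of column 0 comes first, then column 1, and so on. Take a cube with rows=3, cols=3, channels=1 as an example: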

[Figure: column-major memory layout of a 3×3×1 cube]

A PyTorch Tensor, on the other hand, is stored in row-major order:

[Figure: row-major memory layout of a PyTorch Tensor]
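To make the two layouts concrete, here is a minimal sketch of how a (row, col, channel) index maps to a linear offset under each convention. The function names `col_major_offset` and `row_major_offset` are made up for this illustration and are not part of Armadillo or PyTorch.

```cpp
#include <cassert>
#include <cstddef>

// Column-major (Armadillo cube): rows vary fastest, then columns, then channels.
std::size_t col_major_offset(std::size_t row, std::size_t col, std::size_t channel,
                             std::size_t rows, std::size_t cols) {
  assert(row < rows && col < cols);
  return channel * rows * cols + col * rows + row;
}

// Row-major within each channel (contiguous PyTorch tensor of shape (C, H, W)):
// columns vary fastest, then rows, then channels.
std::size_t row_major_offset(std::size_t row, std::size_t col, std::size_t channel,
                             std::size_t rows, std::size_t cols) {
  assert(row < rows && col < cols);
  return channel * rows * cols + row * cols + col;
}
```

For the 3×3×1 example, element (row 1, col 0) sits at offset 1 in the column-major cube but at offset 3 in a row-major tensor.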

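2. Implementing a custom Tensor class

Approach: following object-oriented design, the data members of Tensor should cover the data and the shape. Here a cube holds the data and a vector holds the shape. As for constructors, a tensor can be built from a shape, from values, or from both. Tensor should also cover a full set of basic operations, such as reshaping, value transformation, element access, slicing, assignment, and border padding.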

//
// tensor.hpp
// Tensor header file
#ifndef KUIPER_COURSE_DATA_BLOB_HPP_
#define KUIPER_COURSE_DATA_BLOB_HPP_
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>
#include "armadillo"

namespace kuiper_infer {
template <typename T>
class Tensor {};

template <>
class Tensor<uint8_t> {
  // To be implemented
};

template <>
class Tensor<float> {
 public:
  explicit Tensor() = default;

  /**
   * Creates a tensor
   * @param channels number of channels
   * @param rows number of rows
   * @param cols number of columns
   */
  explicit Tensor(uint32_t channels, uint32_t rows, uint32_t cols);

  explicit Tensor(const std::vector<uint32_t> &shapes);

  static std::shared_ptr<Tensor<float>> Create(uint32_t channels, uint32_t rows, uint32_t cols);

  Tensor(const Tensor &tensor);

  Tensor(Tensor &&tensor) noexcept;

  Tensor<float> &operator=(Tensor &&tensor) noexcept;

  Tensor<float> &operator=(const Tensor &tensor);

  /**
   * Returns the number of rows
   * @return number of rows
   */
  uint32_t rows() const;

  /**
   * Returns the number of columns
   * @return number of columns
   */
  uint32_t cols() const;

  /**
   * Returns the number of channels
   * @return number of channels
   */
  uint32_t channels() const;

  /**
   * Returns the number of elements in the tensor
   * @return number of elements
   */
  uint32_t size() const;

  /**
   * Sets the data of the tensor
   * @param data the data
   */
  void set_data(const arma::fcube &data);

  /**
   * Returns whether the tensor is empty
   * @return whether the tensor is empty
   */
  bool empty() const;

  /**
   * Returns the element at position offset
   * @param offset the position to access
   * @return the element at offset
   */
  float index(uint32_t offset) const;

  /**
   * Returns the element at position offset
   * @param offset the position to access
   * @return the element at offset
   */
  float &index(uint32_t offset);

  /**
   * Shape of the tensor
   * @return shape of the tensor
   */
  std::vector<uint32_t> shapes() const;

  /**
   * Raw (actual) shape of the tensor
   * @return raw shape of the tensor
   */
  const std::vector<uint32_t> &raw_shapes() const;

  /**
   * Returns the data of the tensor
   * @return the data of the tensor
   */
  arma::fcube &data();

  /**
   * Returns the data of the tensor
   * @return the data of the tensor
   */
  const arma::fcube &data() const;

  /**
   * Returns the data of the given channel
   * @param channel the channel to return
   * @return the requested channel
   */
  arma::fmat &at(uint32_t channel);

  /**
   * Returns the data of the given channel
   * @param channel the channel to return
   * @return the requested channel
   */
  const arma::fmat &at(uint32_t channel) const;

  /**
   * Returns the element at a given position
   * @param channel channel index
   * @param row row index
   * @param col column index
   * @return the element at that position
   */
  float at(uint32_t channel, uint32_t row, uint32_t col) const;

  /**
   * Returns the element at a given position
   * @param channel channel index
   * @param row row index
   * @param col column index
   * @return the element at that position
   */
  float &at(uint32_t channel, uint32_t row, uint32_t col);

  /**
   * Pads the tensor
   * @param pads padding sizes
   * @param padding_value value used for the padding
   */
  void Padding(const std::vector<uint32_t> &pads, float padding_value);

  /**
   * Initializes the tensor with value
   * @param value
   */
  void Fill(float value);

  /**
   * Initializes the tensor with the data in values
   * @param values data used to initialize the tensor
   */
  void Fill(const std::vector<float> &values);

  /**
   * Initializes the tensor with the constant 1
   */
  void Ones();

  /**
   * Initializes the tensor with random values
   */
  void Rand();

  /**
   * Prints the tensor
   */
  void Show();

  /**
   * Reshapes the tensor to the given raw shape
   * @param shapes the new raw shape
   */
  void ReRawshape(const std::vector<uint32_t> &shapes);

  /**
   * Reshapes the tensor to the given raw shape, PyTorch-compatible
   * @param shapes the new raw shape
   */
  void ReRawView(const std::vector<uint32_t> &shapes);

  /**
   * Element-wise addition
   * @param tensor1 input tensor 1
   * @param tensor2 input tensor 2
   * @return the element-wise sum
   */
  static std::shared_ptr<Tensor<float>> ElementAdd(const std::shared_ptr<Tensor<float>> &tensor1,
                                                   const std::shared_ptr<Tensor<float>> &tensor2);

  /**
   * Element-wise multiplication
   * @param tensor1 input tensor 1
   * @param tensor2 input tensor 2
   * @return the element-wise product
   */
  static std::shared_ptr<Tensor<float>> ElementMultiply(const std::shared_ptr<Tensor<float>> &tensor1,
                                                        const std::shared_ptr<Tensor<float>> &tensor2);

  /**
   * Flattens the tensor
   */
  void Flatten();

  /**
   * Applies a function to every element of the tensor
   * @param filter the function to apply
   */
  void Transform(const std::function<float(float)> &filter);

  /**
   * Returns a deep copy of the tensor
   * @return the new tensor
   */
  std::shared_ptr<Tensor> Clone();

  const float *raw_ptr() const;

 private:
  void ReView(const std::vector<uint32_t> &shapes);
  std::vector<uint32_t> raw_shapes_;  // raw (actual) shape of the tensor data
  arma::fcube data_;                  // tensor data
};

using ftensor = Tensor<float>;
using sftensor = std::shared_ptr<Tensor<float>>;

}  // namespace kuiper_infer

#endif  // KUIPER_COURSE_DATA_BLOB_HPP_

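Although Cube is column-major and the Tensor we want is row-major, we do not need to map Tensor rows onto cube columns. It is enough to keep the two aligned at the level of abstract dimensions: a Tensor row is a cube row, and a Tensor column is a cube column. What needs care is writing data into the Tensor. Because the cube stores elements in column-major order by default, values meant to run along a Tensor row can accidentally end up running down a column. Member functions that assign data have to account for this.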
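The idea of aligning the abstract dimensions while leaving the storage order alone can be sketched without Armadillo. `MiniTensor` below is a hypothetical stand-in written for this note, not the course's Tensor class: its public interface speaks (channel, row, col), while the private buffer is column-major like a cube.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Tensor/fcube pair.
class MiniTensor {
 public:
  MiniTensor(uint32_t channels, uint32_t rows, uint32_t cols)
      : channels_(channels), rows_(rows), cols_(cols),
        data_(static_cast<std::size_t>(channels) * rows * cols, 0.f) {}

  // Logical access in (channel, row, col) order, regardless of storage order.
  float& at(uint32_t channel, uint32_t row, uint32_t col) {
    assert(channel < channels_ && row < rows_ && col < cols_);
    return data_[offset(channel, row, col)];
  }

  std::vector<uint32_t> shapes() const { return {channels_, rows_, cols_}; }
  const std::vector<float>& raw() const { return data_; }

 private:
  // Column-major inside each channel: rows vary fastest.
  std::size_t offset(uint32_t channel, uint32_t row, uint32_t col) const {
    return (static_cast<std::size_t>(channel) * cols_ + col) * rows_ + row;
  }

  uint32_t channels_, rows_, cols_;
  std::vector<float> data_;
};
```

Callers only ever see logical indices, so the storage order stays an implementation detail until raw data is copied in or out in bulk.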

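The Fill member function deserves particular attention (the version below also takes a row_major flag, which the declaration in the header above does not list):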

void Tensor<float>::Fill(const std::vector<float>& values, bool row_major) {
  CHECK(!this->data_.empty());
  const uint32_t total_elems = this->data_.size();
  CHECK_EQ(values.size(), total_elems);
  if (row_major) {
    const uint32_t rows = this->rows();
    const uint32_t cols = this->cols();
    const uint32_t planes = rows * cols;
    const uint32_t channels = this->data_.n_slices;
    for (uint32_t i = 0; i < channels; ++i) {
      auto& channel_data = this->data_.slice(i);
      const arma::fmat& channel_data_t =
          arma::fmat(values.data() + i * planes, this->cols(), this->rows());
      channel_data = channel_data_t.t();
    }
  } else {
    std::copy(values.begin(), values.end(), this->data_.memptr());
  }
}

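This function copies the contiguous values of a vector into a Tensor. The row_major flag says whether values is laid out in row-major or column-major order.

When row_major=true, constructing an fmat of shape (rows, cols) directly from values would fill it in column-major order, running down each column before moving on to the next, so the values would land in the wrong positions. Instead, the code builds a temporary fmat of shape (cols, rows) from the values and transposes it, which puts every value at the correct position in the Tensor (position here means the logical index in the tensor, not the memory address).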
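The effect of the transpose trick can be reproduced with plain index arithmetic. The following sketch assumes nothing beyond the standard library; `row_major_to_col_major` is a name invented here, and it only mirrors what the row_major=true branch achieves, not how Armadillo does it internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Takes values laid out row-major inside each channel and returns the
// column-major buffer that represents the same logical tensor.
std::vector<float> row_major_to_col_major(const std::vector<float>& values,
                                          std::size_t channels,
                                          std::size_t rows, std::size_t cols) {
  assert(values.size() == channels * rows * cols);
  std::vector<float> out(values.size());
  const std::size_t plane = rows * cols;
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t r = 0; r < rows; ++r) {
      for (std::size_t k = 0; k < cols; ++k) {
        // Same logical element (c, r, k), two different linear offsets.
        out[c * plane + k * rows + r] = values[c * plane + r * cols + k];
      }
    }
  }
  return out;
}
```

For a 2×3 channel given as 1, 2, 3, 4, 5, 6 in row-major order, the column-major buffer becomes 1, 4, 2, 5, 3, 6.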

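Shape-changing operations such as reshape and Flatten do not, in the basic case, change the data in memory; they only change how the Tensor interprets that data.

A tensor is backed by one contiguous block of memory (an array), and these operations only update the dimension information without modifying or moving the underlying data. The exception is when the requested element order differs from the storage order, as with a row-major Flatten on the column-major cube, where the elements do have to be rearranged.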
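A small sketch of the "reinterpretation only" case, using a hypothetical `ColMajorView` struct (not an Armadillo type): reshaping changes the recorded dimensions, the buffer itself is left where it is.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view over a flat column-major buffer.
struct ColMajorView {
  std::vector<float> data;
  std::size_t rows;
  std::size_t cols;

  float at(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return data[c * rows + r];
  }

  // Only the dimension information changes; no element is moved or copied.
  void reshape(std::size_t new_rows, std::size_t new_cols) {
    assert(new_rows * new_cols == data.size());
    rows = new_rows;
    cols = new_cols;
  }
};
```

After reshaping a 2×3 view to 3×2, the same six floats are still at the same addresses, but a given (row, col) index now resolves to a different one of them.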

3. A closer look at some member functions

Here are my notes on a few of the member functions.

void Tensor<float>::Transform(const std::function<float(float)> &filter) {
  CHECK(!this->data_.empty());
  uint32_t channels = this->channels();
  for (uint32_t c = 0; c < channels; ++c) {
    this->data_.slice(c).transform(filter);
  }
}

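Transform applies a function to every element of the Tensor, which requires knowing how callables are passed around in C++. The parameter type std::function<float(float)> states that the callable takes exactly one argument of type float and returns a float; it accepts plain function pointers, lambdas, and function objects alike.

The implementation calls the transform member function of fmat. When designing functions like this, it is worth checking what the underlying data type already provides and reusing it where possible, which cuts down on hand-written code and keeps the class well encapsulated.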
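As a rough illustration of the std::function<float(float)> parameter, here is a standard-library-only sketch. `transform_in_place` and `relu` are names chosen for this example; the real Transform delegates to Armadillo instead of looping by hand.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Applies filter to every element, replacing each value with the result.
void transform_in_place(std::vector<float>& data,
                        const std::function<float(float)>& filter) {
  assert(static_cast<bool>(filter));  // an empty std::function cannot be called
  for (float& x : data) {
    x = filter(x);
  }
}

// A plain function with the matching signature: one float in, one float out.
float relu(float x) { return x > 0.f ? x : 0.f; }
```

Both `transform_in_place(v, relu)` and `transform_in_place(v, [](float x) { return x * 2.f; })` compile, because std::function wraps either kind of callable.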


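Fill takes a vector and the desired fill order. With row_major=true the values are placed in row-major order: each row is filled from left to right, then the next row, and finally the next channel. With row_major=false they are placed in column-major order: each column is filled from top to bottom, then the next column, and finally the next channel. Because fmat fills in column-major order by default, the row-major branch first builds a temporary matrix with the default column-major fill and then transposes it, which yields the row-major arrangement.

4. Homework

1. Implement the Flatten function of the Tensor class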

void Tensor<float>::Flatten(bool row_major) {
  CHECK(!this->data_.empty());
  // Fill in the code here
  // row_major=true: flatten the Tensor data into a vector in row-major order
  // row_major=false: flatten the Tensor data into a vector in column-major order
  const uint32_t size = this->size();
  if (row_major) {
    // data_ is a column-major cube, so transpose each channel and read it in
    // its native (column-major) order, which yields the row-major sequence.
    // This must happen before the reshape, while the slices still have their
    // original rows x cols shape.
    std::vector<float> values(size);
    const uint32_t plane = this->data_.n_rows * this->data_.n_cols;
    for (uint32_t i = 0; i < this->data_.n_slices; ++i) {
      const arma::fmat channel_t = this->data_.slice(i).t();  // transpose of each channel
      std::copy(channel_t.begin(), channel_t.end(), values.begin() + plane * i);
    }
    this->data_.reshape(1, size, 1);
    // values is already in its final order, so a plain copy is enough
    this->Fill(values, false);
  } else {
    // The cube's default reshape already flattens in column-major order
    this->data_.reshape(1, size, 1);
  }
  this->raw_shapes_ = {size};
}

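Writing this function deepened my understanding of column-major and row-major order, and of how to write code for different data layouts. In Flatten the user can choose the flattening order. With row_major=true the result should be in row-major order, but the built-in fcube flattens in column-major order, so calling fcube's reshape directly gives a column-major result, which is not what was asked for. The fix is to transpose the fmat of each slice first and read it in the default (column-major) order into a vector, which gives the elements in row-major order, and then to write that vector into the flattened fcube. The gathering has to happen while the slices still have their original shape, that is, before the cube is reshaped. Remember to update the member variable raw_shapes_ as well.

2. Write the Tensor::Padding function, which pads around the borders of a multi-dimensional tensor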
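The reordering that a row-major flatten has to perform can be written out with explicit indices. This is a sketch on a plain column-major buffer, with `flatten_row_major` as an invented helper name; it shows the element order the homework asks for, not the Armadillo-based implementation itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads a column-major buffer of shape (channels, rows, cols) and returns its
// elements in row-major order, channel by channel.
std::vector<float> flatten_row_major(const std::vector<float>& col_major,
                                     std::size_t channels,
                                     std::size_t rows, std::size_t cols) {
  assert(col_major.size() == channels * rows * cols);
  std::vector<float> out(col_major.size());
  const std::size_t plane = rows * cols;
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t r = 0; r < rows; ++r) {
      for (std::size_t k = 0; k < cols; ++k) {
        out[c * plane + r * cols + k] = col_major[c * plane + k * rows + r];
      }
    }
  }
  return out;
}
```

A 2×3 channel stored column-major as 1, 4, 2, 5, 3, 6 flattens to 1, 2, 3, 4, 5, 6, whereas a plain reshape would leave the order 1, 4, 2, 5, 3, 6 untouched.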

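void Tensor<float>::Padding(const std::vector<uint32_t>& pads, float padding_value) {
  CHECK(!this->data_.empty());
  CHECK_EQ(pads.size(), 4);
  // Padding sizes on the four sides
  uint32_t pad_rows1 = pads.at(0);  // up
  uint32_t pad_rows2 = pads.at(1);  // bottom
  uint32_t pad_cols1 = pads.at(2);  // left
  uint32_t pad_cols2 = pads.at(3);  // right

  // Fill in the code here
  // Allocate the enlarged cube and pre-fill it with the padding value
  arma::fcube new_data(pad_rows1 + pad_rows2 + this->data_.n_rows,
                       pad_cols1 + pad_cols2 + this->data_.n_cols,
                       this->data_.n_slices);
  new_data.fill(padding_value);
  // Copy the original data into the interior region
  new_data.subcube(pad_rows1, pad_cols1, 0,
                   pad_rows1 + this->data_.n_rows - 1,
                   pad_cols1 + this->data_.n_cols - 1,
                   this->data_.n_slices - 1) = this->data_;
  this->data_ = std::move(new_data);
  // Keep the recorded shape in sync with the padded data
  // (assumption: the raw shape is stored as {channels, rows, cols})
  this->raw_shapes_ = std::vector<uint32_t>{this->channels(), this->rows(), this->cols()};
}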
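The same padding logic can be sketched on a flat column-major buffer, which makes the offset arithmetic hidden behind subcube visible. `pad_col_major` and its parameter names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pads every channel of a column-major (channels, rows, cols) buffer with
// `value`: `up`/`bottom` extra rows and `left`/`right` extra columns.
std::vector<float> pad_col_major(const std::vector<float>& src,
                                 std::size_t channels, std::size_t rows, std::size_t cols,
                                 std::size_t up, std::size_t bottom,
                                 std::size_t left, std::size_t right,
                                 float value) {
  assert(src.size() == channels * rows * cols);
  const std::size_t new_rows = rows + up + bottom;
  const std::size_t new_cols = cols + left + right;
  // Start from a buffer that is entirely padding, then copy the interior.
  std::vector<float> out(channels * new_rows * new_cols, value);
  for (std::size_t c = 0; c < channels; ++c) {
    for (std::size_t k = 0; k < cols; ++k) {
      for (std::size_t r = 0; r < rows; ++r) {
        out[(c * new_cols + k + left) * new_rows + r + up] =
            src[(c * cols + k) * rows + r];
      }
    }
  }
  return out;
}
```

Padding the 2×2 matrix [[1, 3], [2, 4]] with one row on top and one column on the right gives a 3×3 matrix whose first row and last column hold the padding value.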

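The std::move in the Padding implementation is there for performance. It is the standard library's entry point to move semantics: it casts its argument (normally an lvalue such as new_data) to an rvalue, so the assignment selects the move assignment operator, and this->data_ can take over the memory that new_data owned instead of copying it. With an ordinary copy assignment, this->data_ = new_data, the two cubes have different shapes, so the existing memory cannot be reused: the old memory of this->data_ is released, new memory is allocated, and the contents of new_data are then copied into it.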
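The difference between the two assignments can be observed with std::vector, used here as a stand-in for the cube since both own a heap buffer. The two helper functions are invented for this sketch; whether Armadillo hands over the buffer in every case (for example for very small cubes) is not something this example demonstrates.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Move assignment: dst takes over the buffer that src owned.
bool move_reuses_buffer() {
  std::vector<float> dst(4, 0.f);
  std::vector<float> src(9, 1.f);
  const float* src_buffer = src.data();
  dst = std::move(src);
  return dst.size() == 9 && dst.data() == src_buffer;
}

// Copy assignment: dst ends up with its own buffer holding copied values.
bool copy_keeps_separate_buffers() {
  std::vector<float> dst(4, 0.f);
  std::vector<float> src(9, 1.f);
  dst = src;
  return dst.size() == 9 && dst.data() != src.data() && dst[8] == 1.f;
}
```

In the move case no element is copied: only the ownership of the buffer changes hands, and src is left in a valid but unspecified state.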