当前位置: 首页 > news >正文

轻量xlsx读取库xlsx_drone的编译与测试

这个库是在看其他网页时,作为和功能丰富的xlsxio库的对比来的,按照xlsx_drone github页面介绍,
特征

  • 不使用任何外部应用程序来解析它们。
  • 注重速度而不是功能。
  • 简单的接口。
  • UTF-8 支持。

安装
直接将 src 和 ext 文件夹复制并粘贴到项目根文件夹和源代码中。您可能希望以不同的方式容纳文件是可以理解的,但请注意 xlsx_drone.h 使用相对路径调用其名称:#include “xlsx_drone.h”

// external libraries
#include "../ext/zip.h"
#include "../ext/sxmlc.h"
#include "../ext/sxmlsearch.h"

除此以外,连一个安装步骤都没有写,只给出了几个代码片段,没法直接使用,文档也就让你参考 src/xlsx_drone.h。

还好源代码包中有CMakelists.txt,那就表示能用cmake & make来生成。实际结果确实生成了动态库, 而且几乎瞬间生成。

/# cd par/xlsx_drone
/par/xlsx_drone# mkdir build
/par/xlsx_drone# cd build
/par/xlsx_drone/build# cmake ..
-- The C compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /par/xlsx_drone/build
/par/xlsx_drone/build# make
[ 20%] Building C object CMakeFiles/xlsx_drone.dir/ext/zip.c.o
In file included from /par/xlsx_drone/ext/zip.c:37:
/par/xlsx_drone/ext/miniz.h:5106:9: note: '#pragma message: Using fopen, ftello, fseeko, stat() etc. path for file I/O - this path may not support large files.'5106 | #pragma message(                                                               \|         ^~~~~~~
[ 40%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlc.c.o
[ 60%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlsearch.c.o
[ 80%] Building C object CMakeFiles/xlsx_drone.dir/src/xlsx_drone.c.o
[100%] Linking C shared library libxlsx_drone.so
[100%] Built target xlsx_drone

浏览源码目录树,有个test目录下面有个xlsx_drone.test.c源文件,其中还有个helpers目录,装了几个看上去用来测试的文件。
上手编译它,很多链接找不到引用错误,其中一些像单元测试函数,一些像xlsx_drone中的函数。

/par/xlsx_drone/test# ls
helpers  xlsx_drone.test.c
/par/xlsx_drone/test# gcc  xlsx_drone.test.c -o xlsx_drone
/usr/bin/ld: /tmp/ccqiKV5x.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x4c): undefined reference to `UnityAssertEqualNumber'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x51): undefined reference to `xlsx_get_xlsx_errno'

注意到,此文件开头包含了#include "…/ext/unity.h"语句,而ext目录下除了有unity的头文件,还有c源代码,在Makefile中没有用到,那就加上unity.c一起编译,一下子,单元测试引用的错误全消失了,只留下xlsx类错误。

gcc  xlsx_drone.test.c ../ext/unity.c -o xlsx_drone/usr/bin/ld: /tmp/ccD6n0u6.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'

再加上-L 和-l参数,就链接成功了,把共享动态库目录加入LD_LIBRARY_PATH,就能执行了。执行报错,好像是打开文件失败,

/par/xlsx_drone/test# gcc  xlsx_drone.test.c ../ext/unity.c -o xlsx_drone -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone/test# ./xlsx_drone
./xlsx_drone: error while loading shared libraries: libxlsx_drone.so: cannot open shared object file: No such file or directory/par/xlsx_drone/test# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/par/xlsx_drone/build
/par/xlsx_drone/test# ./xlsx_drone
xlsx_drone.test.c:19:test_xlsx_open:FAIL: Expected 1 Was 0
xlsx_drone.test.c:370:test_xlsx_load_sheet:FAIL: Expected 0 Was -13
Segmentation fault (core dumped)

看xlsx_drone.test.c源码,里面的路径是"test\helpers\non_existent.xlsx"之类,于是再在父目录执行,结果还是报错。最后把Windows的\目录分隔符全都替换成Linux的/,重新编译,执行成功,如下所示;

/par/xlsx_drone# gcc  test/xlsx_drone.test2.c ext/unity.c -o xlsx_drone2 -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xlsx_drone2
test/xlsx_drone.test2.c:16:test_xlsx_open:FAIL: Expected 0 Was 1
test/xlsx_drone.test2.c:1231:test_xlsx_load_sheet:PASS
test/xlsx_drone.test2.c:1232:test_xlsx_unload_sheet:PASS
test/xlsx_drone.test2.c:1233:test_xlsx_get_last_column:PASS
test/xlsx_drone.test2.c:476:test_xlsx_read_cell:FAIL: Unity Double Precision Disabled
test/xlsx_drone.test2.c:1235:test_xlsx_close:PASS-----------------------
6 Tests 2 Failures 0 Ignored
FAIL

其中第一个测试是测不存在的文件,我故意放了一个non_existent.xlsx在helpers,导致期望0的测试执行结果为1,也报错了。

再把首页的代码片段放入一个main函数,并包含头文件,

#include "src/xlsx_drone.h"
int main()
{// open *.xlsxxlsx_workbook_t wb;xlsx_open("test/helpers/sample.xlsx", &wb);// be free to inspect some wb dataint number_of_sheets = wb.n_sheets;// load sheetxlsx_sheet_t *sheet_1 = xlsx_load_sheet(&wb, 1, NULL);// be free to inspect some sheet datachar *sheet_name = sheet_1->name;int last_row = sheet_1->last_row; // valued 0 if the sheet is empty// as of version 0.2.0 you can retrieve the last column (more info below)char *last_column = xlsx_get_last_column(sheet_1); // i.e.: "FB" or "R", etc.// read cellxlsx_cell_t cell_data_holder;    xlsx_read_cell(sheet_1, 4, "B", &cell_data_holder);// inspect resultswitch(cell_data_holder.value_type) {case XLSX_POINTER_TO_CHAR:printf("Cell 4B has value: %s", cell_data_holder.value.pointer_to_char_value);break;case XLSX_INT:printf("Cell 4B has value: %d", cell_data_holder.value.int_value);break;case XLSX_LONG_LONG:printf("Cell 4B has value: %lld", cell_data_holder.value.long_long_value);break;case XLSX_DOUBLE:printf("Cell 4B has value: %f", cell_data_holder.value.double_value);break;default:printf("Cell 4B has no value");}// you can also inspect the cell categoryint cell_category = cell_data_holder.style->related_category;	/*typedef enum xlsx_cell_category {XLSX_NUMBER, // int, long long, or doubleXLSX_TEXT, // stringXLSX_DATE, // intXLSX_TIME, // doubleXLSX_DATE_TIME, // doubleXLSX_UNKNOWN} xlsx_cell_category;*/char cat[7][10]={"NUMBER", "TEXT", "DATE", "TIME", "DATE_TIME", "UNKNOWN"};printf("Cell 4B has category XLSX_%s\n", cat[cell_category]);// when you're done reading the XLSX, close the workbook to properly free resourcesxlsx_close(&wb);	
}

编译运行结果如下

gcc example.c -o example -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./example
Cell 4B has value: -1000Cell 4B has category XLSX_NUMBER

再让DeepSeek照样编写一个把xlsx转为csv的程序。

#include "src/xlsx_drone.h"
#include <stdio.h> // 添加标准IO头文件用于CSV操作int main()
{// 打开 *.xlsx 文件xlsx_workbook_t wb;xlsx_open("test/helpers/sample.xlsx", &wb);// 加载第一个工作表xlsx_sheet_t *sheet = xlsx_load_sheet(&wb, 1, NULL);// 创建并打开CSV文件用于写入FILE *csv_file = fopen("output.csv", "w");if (csv_file == NULL) {printf("无法创建CSV文件\n");xlsx_close(&wb);return 1;}// 获取工作表尺寸信息int last_row = sheet->last_row;char *last_column = xlsx_get_last_column(sheet);// 遍历所有行和列for (int row = 1; row <= last_row; row++) {for (char col_char = 'A'; col_char <= last_column[0]; col_char++) {// 处理多字母列名(如AA, AB等)char col_str[3] = {col_char, '\0'};if (last_column[1] != '\0') {// 如果需要处理两字母列名,这里需要扩展逻辑}// 读取单元格数据xlsx_cell_t cell;xlsx_read_cell(sheet, row, col_str, &cell);// 根据数据类型写入CSVswitch(cell.value_type) {case XLSX_POINTER_TO_CHAR:fprintf(csv_file, "\"%s\"", cell.value.pointer_to_char_value);break;case XLSX_INT:fprintf(csv_file, "%d", cell.value.int_value);break;case XLSX_LONG_LONG:fprintf(csv_file, "%lld", cell.value.long_long_value);break;case XLSX_DOUBLE:fprintf(csv_file, "%f", cell.value.double_value);break;default:// 空单元格留空break;}// 添加逗号分隔符(最后一列后不加逗号)if (col_char < last_column[0]) {fprintf(csv_file, ",");}}// 换行表示新的一行fprintf(csv_file, "\n");}// 关闭CSV文件fclose(csv_file);// 关闭工作簿并释放资源xlsx_close(&wb);printf("XLSX文件已成功转换为CSV格式\n");return 0;
}

编译执行成功,前6行的内容如下,可见它支持UTF-8, 包括汉字希腊文。

gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xls2csv
bash: ./xls2csv: No such file or directory
/par/xlsx_drone# ./xlsx2csv
XLSX文件已成功转换为CSV格式head -6 output.csv
"General","Number","Currency","Accounting","Date","Time","Percentage","Fraction (1.5)","Scientific (0.001)","Text","Special","Custom"
"Foo",1000,1000,147,43458,0.104792,0.500000,1.500000,0.001000,"1875",2000,12
235,1000,-14562.740000,1200.874000,43458,0.104792,0.450000,1.500000,0.001000,"Just text",2000,40955
17.890000,-1000,584,,43458,0.104792,1.600000,1.500000,0.001000,"𐐀34",543415635644,
,-1000,,,43458,0.104792,,1.500000,,"foo你bar好qaz",34580585,
,1200.561000,,,43458,0.104792,,1.500000,,,,

再用这个代码读取8*1M行的xlsx文件,看它的性能。

gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone -O3
root@6ae32a5ffcde:/par/xlsx_drone# time ./xlsx2csv
XLSX文件已成功转换为CSV格式real    0m14.381s
user    0m10.321s
sys     0m3.695s

尽管用了-O3编译,还是不太快。

http://www.dtcms.com/a/358963.html

相关文章:

  • 关联容器(Associative containers)
  • Windows server 2012安装步骤
  • 为什么要用 Markdown?以及如何使用它
  • 软考备考(5)
  • vite Rendering 10 pagesReferenceError: document is not defined
  • 35.Ansible的yaml语法与playbook的写法
  • Java基础知识(十)
  • 【软考架构】面向服务的体系结构(SOA)深度解析
  • 记IsaacSim中抓取物体的质量物理属性的影响
  • 【智能体】零代码学习 Coze 智能体(2)创建智能体的完整步骤
  • 直线与椭圆相交弦长计算公式
  • K8s服务日志收集方案文档
  • LeetCode 92. 反转链表 II - 算法解析
  • 微服务-ruoyi-cloud部署
  • 《动手学深度学习v2》学习笔记 | 2.3 线性代数
  • 吴恩达机器学习(三)
  • 第7.5节:awk语言 switch 语句
  • 【C++】掌握string类操作:高效处理字符串
  • k230 按键拍照后,将摄像头拍照的1920*1080分辨率的图片以jpg文件格式,保存到板载TF存储卡的指定文件夹目录中
  • MinerU本地化部署
  • Java的Optional实现优雅判空新体验【最佳实践】
  • 做一个实用的节假日工具
  • MQTT 连接建立与断开流程详解(一)
  • sunset: decoy靶场渗透
  • 20250830_Oracle 19c CDB+PDB(QMS)默认表空间、临时表空间、归档日志、闪回恢复区巡检手册
  • day42-Ansible
  • 动态规划--Day05--最大子数组和--53. 最大子数组和,2606. 找到最大开销的子字符串,1749. 任意子数组和的绝对值的最大值
  • 微信小程序开发教程(三)
  • java如何保证线程安全
  • RLPD——利用离线数据实现高效的在线RL:不进行离线RL预训练,直接应用离策略方法SAC,在线学习时对称采样离线数据