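Compiling QuackML, a machine learning extension for DuckDB
Download the source code from the repository and extract it to /par.
The first attempt, compiling against the DuckDB 1.3 source tree, fails because a header file does not exist:
export LD_LIBRARY_PATH=/par/duck/build/src
g++ -fPIC -shared -o libtest2.so *.cpp -I /par/duck/src/include -lssl -lcrypto -I include -lduckdb -L /par/duck/build/src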
In file included from quackml_extension.cpp:12:
include/functions/sum_count.hpp:11:10: fatal error: duckdb/core_functions/aggregate/nested_functions.hpp: No such file or directory
   11 | #include "duckdb/core_functions/aggregate/nested_functions.hpp"
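Before switching versions, it can help to check whether the missing header was moved or removed in the newer tree. The following is a minimal sketch, assuming the source tree sits at /par/duck as in the command above; locate_header is a name made up for this note, not part of any tool.

```shell
#!/bin/sh
# locate_header is a hypothetical helper: list every file with the given
# name anywhere under a source tree.
# Usage: locate_header <tree> <file_name>
locate_header() {
    find "$1" -type f -name "$2"
}

# Path taken from the compile command above; skipped if the tree is absent.
if [ -d /par/duck ]; then
    locate_header /par/duck nested_functions.hpp
fi
```

No output means the file no longer exists under that name, which points to a version mismatch rather than a wrong -I path.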
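QuackML was released in April 2024, so I looked for the DuckDB version from around that time (0.10.3 is used below) and downloaded its source code and the libduckdb library. The header files go into /par/duckdb-0.10.3/include and the library files into /par/duckdb-0.10.3/lib.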
g++ -fPIC -shared -o libtest2.so *.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib
In file included from /par/duckdb-0.10.3/include/duckdb/common/multi_file_reader_options.hpp:13,
                 from /par/duckdb-0.10.3/include/duckdb/execution/operator/csv_scanner/csv_reader_options.hpp:19,
                 from /par/duckdb-0.10.3/include/duckdb/common/serializer/deserializer.hpp:18,
                 from /par/duckdb-0.10.3/include/duckdb/main/secret/secret.hpp:13,
                 from /par/duckdb-0.10.3/include/duckdb/main/extension_util.hpp:14,
                 from quackml_extension.cpp:8:
/par/duckdb-0.10.3/include/duckdb/common/hive_partitioning.hpp:28:82: error: 'duckdb_re2' has not been declared
   28 |         DUCKDB_API static std::map<string, string> Parse(const string &filename, duckdb_re2::RE2 &regex);
      |                                                                                  ^~~~~~~~~~
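A quick search shows that duckdb_re2 is defined in re2 under DuckDB's third-party directory. I extracted it to /par and built it on its own. CMake failed on the unrecognized command disable_target_warnings; after deleting that call, the makefile was generated and the build went through.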
cd /par/re2
root@6ae32a5ffcde:/par/re2# mkdir build
root@6ae32a5ffcde:/par/re2# cd build
root@6ae32a5ffcde:/par/re2/build# cmake ..
-- The CXX compiler identification is GNU 14.2.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/local/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:104 (disable_target_warnings):
  Unknown CMake command "disable_target_warnings".
-- Configuring incomplete, errors occurred!
See also "/par/re2/build/CMakeFiles/CMakeOutput.log".
root@6ae32a5ffcde:/par/re2/build# cmake ..
-- Configuring done
-- Generating done
-- Build files have been written to: /par/re2/build
root@6ae32a5ffcde:/par/re2/build# make
[ 4%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/bitmap256.cc.o
[ 8%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/compile.cc.o
[ 12%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/bitstate.cc.o
[ 16%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/dfa.cc.o
[ 20%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/filtered_re2.cc.o
[ 25%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/mimics_pcre.cc.o
[ 29%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/nfa.cc.o
[ 33%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/onepass.cc.o
[ 37%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/parse.cc.o
[ 41%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/perl_groups.cc.o
[ 45%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prefilter.cc.o
[ 50%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prefilter_tree.cc.o
[ 54%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/prog.cc.o
[ 58%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/re2.cc.o
[ 62%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/regexp.cc.o
[ 66%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/set.cc.o
[ 70%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/simplify.cc.o
[ 75%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/stringpiece.cc.o
[ 79%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/tostring.cc.o
[ 83%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/unicode_casefold.cc.o
[ 87%] Building CXX object CMakeFiles/duckdb_re2.dir/re2/unicode_groups.cc.o
[ 91%] Building CXX object CMakeFiles/duckdb_re2.dir/util/rune.cc.o
[ 95%] Building CXX object CMakeFiles/duckdb_re2.dir/util/strutil.cc.o
[100%] Linking CXX static library libduckdb_re2.a
[100%] Built target duckdb_re2
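The edit made between the two cmake runs above (removing the disable_target_warnings call) can be scripted. This is only a sketch: strip_cmake_call is a hypothetical helper, and it assumes the call sits on a single line of CMakeLists.txt.

```shell
#!/bin/sh
# strip_cmake_call is a hypothetical helper: drop every line that calls the
# given CMake command, keeping the original file as <file>.orig.
# Assumes each call fits on one line.
# Usage: strip_cmake_call <CMakeLists.txt> <command_name>
strip_cmake_call() {
    file=$1
    command_name=$2
    cp "$file" "$file.orig"
    # grep -v exits with 1 when it prints nothing; that is not an error here.
    grep -v "^[[:space:]]*$command_name[[:space:]]*(" "$file.orig" > "$file" || [ $? -eq 1 ]
}

# Path taken from the session above; skipped if the file is absent.
if [ -f /par/re2/CMakeLists.txt ]; then
    strip_cmake_call /par/re2/CMakeLists.txt disable_target_warnings
fi
```

The command is presumably defined by DuckDB's top-level build files, which is why it is unknown when re2 is configured on its own; removing it only affects warning flags.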
I added that path with -I and added #include "re2.h" to quackml_extension.cpp, but the build still fails:
g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/re2
In file included from quackml_extension.cpp:5:
/par/re2/re2/re2.h:279:13: error: 'StringPiece' does not name a type
...
/usr/include/re2/stringpiece.h:34:7: note: 're2::StringPiece' declared here
   34 | class StringPiece {
      |       ^~~~~~~~~~~
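Why does a header under /usr/include/re2/ show up here? Probably because re2 is also preinstalled on the system, and the compiler fell back to the system copy because it could not find the bundled re2/stringpiece.h.
Looking at /par/re2/re2/re2.h, it does include re2/stringpiece.h, so our -I directory was wrong. After changing #include "re2.h" to #include "re2/re2.h" and the -I path to /par/re2, the re2-related errors disappear. One error remains: a function called with the wrong number of arguments.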
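The lookup that went wrong can be modelled in a few lines. This is a simplified sketch of how a quoted #include is resolved (directory of the including file first, then each -I directory in order, then system directories); resolve_include is a made-up name, and /usr/include stands in for the full system search list.

```shell
#!/bin/sh
# resolve_include is a hypothetical helper that prints the first existing
# candidate for a quoted #include, in simplified search order.
# Usage: resolve_include <including_file> <header> <search_dir>...
resolve_include() {
    including_file=$1
    header=$2
    shift 2
    for dir in "$(dirname "$including_file")" "$@"; do
        if [ -f "$dir/$header" ]; then
            printf '%s\n' "$dir/$header"
            return 0
        fi
    done
    return 1
}

# Search directories from the failing command (-I /par/re2/re2).
resolve_include /par/re2/re2/re2.h re2/stringpiece.h \
    /par/duckdb-0.10.3/include include /par/re2/re2 /usr/include \
    || echo "re2/stringpiece.h: not found"
```

With -I /par/re2/re2 the only candidate that exists is the system copy, which declares re2::StringPiece rather than the duckdb_re2 version; with -I /par/re2 the bundled /par/re2/re2/stringpiece.h is found first.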
g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/
functions/linear_reg.cpp: In function 'void quackml::SlowLinearRegressionFinalize(duckdb::Vector&, duckdb::AggregateInputData&, duckdb::Vector&, idx_t, idx_t)':
functions/linear_reg.cpp:273:42: error: too many arguments to function 'std::vector<std::vector<double> > quackml::getGradientND(std::vector<std::vector<double> >&, std::vector<std::vector<double> >&, std::vector<std::vector<double> >&, double)'
  273 |             auto gradient = getGradientND(*state.sigma, *state.c, *state.theta, state.lambda, state.count);
      |                             ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from functions/linear_reg.cpp:5:
include/functions/linear_reg_utils.hpp:21:38: note: declared here
   21 |     std::vector<std::vector<double>> getGradientND(std::vector<std::vector<double>> &sigma, std::vector<std::vector<double>> &c, std::vector<std::vector<double>> &theta, double lambda);
After deleting the last argument (state.count) from the call in linear_reg.cpp, the build succeeds.
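Since the command line has changed several times by now, it may be convenient to generate it from the two install locations. A small sketch, with build_cmd as a hypothetical helper that only prints the command (the flags are the ones used above, grouped but in the same relative order):

```shell
#!/bin/sh
# build_cmd is a hypothetical helper: print the g++ command used in this post
# for a given DuckDB install directory and re2 source directory.
# Usage: build_cmd <duckdb_home> <re2_home>
build_cmd() {
    duckdb_home=$1   # e.g. /par/duckdb-0.10.3
    re2_home=$2      # e.g. /par/re2
    printf '%s\n' "g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp -I $duckdb_home/include -I include -I $re2_home -L $duckdb_home/lib -lssl -lcrypto -lduckdb"
}

build_cmd /par/duckdb-0.10.3 /par/re2
```

Run it from the QuackML src directory and pipe the output to sh to execute the build.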
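The build succeeds, but after converting the library into an extension with
python3 ./appendmetadata.py -l libtest2.so -n quackml -dv v1.3.0 --duckdb-platform linux_amd64 --extension-version 0.1 --abi-type ""
loading it always fails with an undefined-symbol error: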
/par/duckdb130 -unsigned
DuckDB v1.3.0 (Ossivalis) 71c5c07cdd
Enter ".help" for usage hints.
D load '/par/QuackML-main/src/quackml.duckdb_extension';
IO Error:
Extension "/par/QuackML-main/src/quackml.duckdb_extension" could not be loaded: /par/QuackML-main/src/quackml.duckdb_extension: undefined symbol: _ZN6duckdb18BaseScalarFunctionC2ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_6vectorINS_11LogicalTypeELb1EEES8_NS_17FunctionStabilityES8_NS_20FunctionNullHandlingE
export LD_LIBRARY_PATH=/par/duckdb-0.10.3/lib
root@6ae32a5ffcde:/par/QuackML-main/src# g++ -fPIC -shared -o libtest2.so *.cpp */*.cpp /par/re2/build/libduckdb_re2.a -I /par/duckdb-0.10.3/include -lssl -lcrypto -I include -lduckdb -L /par/duckdb-0.10.3/lib -I /par/re2/
D load '/par/QuackML-main/src/quackml.duckdb_extension';
IO Error:
Extension "/par/QuackML-main/src/quackml.duckdb_extension" could not be loaded: /par/QuackML-main/src/quackml.duckdb_extension: undefined symbol: _ZNK6duckdb18BaseScalarFunction8ToStringB5cxx11Ev
D .exit
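I tried various versions of the library and the CLI, including loading the extension from the 0.10.3 duckdb CLI, but none of them solved the problem. This needs further investigation.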
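One lead for that investigation, offered as a guess rather than a diagnosis: both missing symbols carry libstdc++ cxx11 ABI markers, so it may be worth checking whether the host binary exports the same mangled names at all. Note also that the extension was compiled against 0.10.3 headers but tagged v1.3.0 for loading, and class layouts and constructor signatures may differ between those versions. The sketch below only classifies a mangled name; uses_cxx11_abi is a made-up helper, and the library file name libduckdb.so in the comments is an assumption.

```shell
#!/bin/sh
# uses_cxx11_abi is a hypothetical helper: succeed if a mangled C++ symbol
# carries a libstdc++ cxx11 ABI marker. "__cxx11" is the std::__cxx11
# namespace and "B5cxx11" is the [abi:cxx11] tag.
uses_cxx11_abi() {
    case $1 in
        *__cxx11*|*B5cxx11*) return 0 ;;
        *) return 1 ;;
    esac
}

# Second undefined symbol from the session above.
sym=_ZNK6duckdb18BaseScalarFunction8ToStringB5cxx11Ev
if uses_cxx11_abi "$sym"; then
    echo "$sym: cxx11 ABI"
else
    echo "$sym: no cxx11 marker"
fi

# Follow-up checks to run by hand (binutils; libduckdb.so is an assumed name):
#   echo "$sym" | c++filt
#   nm -D --undefined-only libtest2.so | grep BaseScalarFunction
#   nm -D --defined-only /par/duckdb-0.10.3/lib/libduckdb.so | grep BaseScalarFunction
```

If the last nm command shows the same function under a differently mangled name, or not at all, the extension and the host were built with incompatible settings or versions.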