Fluent Bit Series: Character Set Transcoding Test (Part 2)

#Author: 程宏斌

Contents

  • fluent-bit 1.9.4 conversion test
  • Conclusion

Continued from the previous part: "Fluent Bit系列:字符集转码测试(上)" (Fluent Bit Series: Character Set Transcoding Test, Part 1) https://blog.csdn.net/qq_40477248/article/details/150776142?spm=1001.2014.3001.5501

fluent-bit 1.9.4 conversion test

1. Configuration file used for the test

[SERVICE]
    Flush        1
    Parsers_File parsers.conf
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_PORT    3194

[INPUT]
    Name           tail
    Tag            regex-fluent
    DB             ./db/regex-fluent.db
    Read_from_Head true
    Path           /var/log/pods/logtest/*.log
    Path_Key       pod_log_path

[FILTER]
    Name  modify
    Match *
    Add paas_log_belong         user
    Add paas_log_type           middleware
    Add paas_collection_type    userfile
    Add paas_account_id         123456789
    Add paas_region_id          lftst
    Add paas_product_id         ccc
    Add paas_instance_name      test10
    Add paas_host_ip            127.0.0.1
    Add paas_manager_ip         127.0.0.1
    Add pod_namespace           default
    Add pod_name                test-0
    Add pod_container_name      test

[FILTER]
    Name                  multiline
    Match                 *
    multiline.key_content log
    multiline.parser      multiline-regex-go
    emitter_mem_buf_limit 2048M

[FILTER]
    Name    lua
    Match   *
    script  gbk2utf8.lua
    call    convert_gbk_to_utf8

[OUTPUT]
    Name  file
    Match *
    Path  /vdata/logtest
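The Lua filter in the configuration calls convert_gbk_to_utf8 from gbk2utf8.lua, which is not reproduced in this part. As a rough illustration of what such a conversion does to the bytes of one log line, here is a minimal Python sketch. It is not the author's Lua script: the function name gbk_to_utf8 is hypothetical, and the sample string is simply the search keyword used later in this post.

```python
def gbk_to_utf8(raw: bytes) -> bytes:
    """Decode GBK bytes and re-encode them as UTF-8.

    Bytes that are not valid GBK are replaced with U+FFFD instead of
    raising, so one bad record does not stop the whole conversion.
    """
    return raw.decode("gbk", errors="replace").encode("utf-8")


# Hypothetical sample: the keyword from this test, as a GBK application would write it.
original = "匹配费率为0,不进行计价"
gbk_bytes = original.encode("gbk")

# Reading the GBK bytes as if they were UTF-8 does not give back the original text.
misread = gbk_bytes.decode("utf-8", errors="replace")

# After conversion the bytes decode cleanly as UTF-8.
converted = gbk_to_utf8(gbk_bytes)
print(converted.decode("utf-8"))
```

Pure ASCII input passes through unchanged, which is why only the Chinese parts of a GBK log line look broken before conversion.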

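Checking the collected data

# Commands used to check the collected output
less /vdata/logtest/regex-fluent # look up the Chinese strings to search for in advance
egrep -n '匹配费率为0,不进行计价|to_number' regex-fluent | head -n 5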

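The collected output is written to /vdata/logtest/regex-fluent. Searching that file for the keywords with grep on Linux shows that lines 135 and 137 contain no garbled GBK text, which indicates that Fluent Bit 1.9.4 converted the GBK content to UTF-8 successfully. Lines 24, 40, and 41 also confirm that the log lines were merged correctly. However, I found that this regex does not handle blank lines in the log file, so the collected data contains blank records.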

[root@cdp-10-191-193-8 logtest]# egrep -n '匹配费率为0,不进行计价|to_number' regex-fluent | head -n 5
24:regex-fluent: [1720149989.173038023, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":"329 2023-09-11 17:03:30 [/data01/heht30/guoc//stl_jsdev/src/load_shm.c:fLoadTBillCycle:401] [INFO]sql is [select \na.bill_cycle_seq, \na.bill_period_id, \nto_number(to_char(nvl(a.start_date, add_months(sysdate, 0)), 'yyyymmdd')), \nto_number(to_char(nvl(a.cutoff_date, add_months(sysdate, 240)), 'yyyymmdd')), \nnvl(a.split_table_postfix,-1), \nnvl(b.latn_id,-1), \nto_number(b.status) ","paas_instance_name":"test10"}]
40:regex-fluent: [1720149989.173041374, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":" to_number(nvl(to_char(active_date,'yyyymmdd'),'19700101')), ","paas_instance_name":"test10"}]
41:regex-fluent: [1720149989.173046930, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":" to_number(to_char(nvl(inactive_date,add_months(sysdate,360)),'yyyymmdd'))","paas_instance_name":"test10"}]
135:regex-fluent: [1720149989.173070565, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":"329 2023-09-11 17:04:53 [/data01/heht30/guoc//stl_jsdev/src/stl_pub.c:CdrRating:761] [INFO]匹配费率为0,不进行计价","paas_instance_name":"test10"}]
137:regex-fluent: [1720149989.173070837, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":"329 2023-09-11 17:04:53 [/data01/heht30/guoc//stl_jsdev/src/stl_pub.c:CdrRating:761] [INFO]匹配费率为0,不进行计价","paas_instance_name":"test10"}]
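The grep check above can also be scripted. The following is a minimal sketch, assuming the output file is read as raw bytes, one record per line; the helper name scan_lines and the three sample lines are hypothetical and only stand in for the real file.

```python
def scan_lines(lines, keyword):
    """Return (hits, bad) as lists of 1-based line numbers.

    hits: lines that are valid UTF-8 and contain the keyword.
    bad:  lines that cannot be decoded as UTF-8 at all.
    """
    hits, bad = [], []
    for number, raw in enumerate(lines, start=1):
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
            continue
        if keyword in text:
            hits.append(number)
    return hits, bad


keyword = "匹配费率为0,不进行计价"
# Hypothetical sample lines: one converted record, one still in GBK, one ASCII only.
sample = [
    ("[INFO]" + keyword).encode("utf-8"),
    ("[INFO]" + keyword).encode("gbk"),
    b"plain ascii line",
]
hits, bad = scan_lines(sample, keyword)
print("keyword found on lines:", hits)
print("not valid UTF-8 on lines:", bad)
```

To run it against the real output, one option is to pass `open("/vdata/logtest/regex-fluent", "rb")` as `lines`. An empty `bad` list together with the expected hits corresponds to the result described above. Note that this only catches byte sequences that are invalid UTF-8; it is a sanity check, not proof that every character was converted correctly.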

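Below are the first three lines of the log file. The file is GBK-encoded, so the Chinese text in it appears garbled here.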

[root@cdp-10-191-193-8 logtest]# head -n 3 /home/dcos/fluentbit/20230911_filerate-3105-9c659b484-kfp5j.0.log 
329 2023-09-11 17:03:29 [/data01/heht30/guoc//stl_jsdev/src/stl_main.c:main:328] [INFO][REPEAT_MESSAGE=2|MESSAGE_SEQUENCE=3105:72:239|THIS_WORKFLOW_ID=72|INSERT_TIME=20230911164216|THIS_NODE_ID=3|GROUP_ID=3105|WORKFLOW_ID=72|FILE_NAME=991000PTSVDA022023090116581100000005.filefmt|PROVINCE_ID=99|FILE_PATH=/jzjs_month/spcp//2799/20230901/99|FMT_NORMAL_REC=10853],    ���ۿ�ʼʱ��=[2023-09-11 17:03:29]

329 2023-09-11 17:03:29 [/data01/heht30/guoc//stl_jsdev/src/stl_main.c:main:341] [ALERT]REPEAT_MESSAGE=2|MESSAGE_SEQUENCE=3105:72:239|THIS_WORKFLOW_ID=72|INSERT_TIME=20230911164216|THIS_NODE_ID=3|GROUP_ID=3105|WORKFLOW_ID=72|FILE_NAME=991000PTSVDA022023090116581100000005.filefmt|PROVINCE_ID=99|FILE_PATH=/jzjs_month/spcp//2799/20230901/99|FMT_NORMAL_REC=10853

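Below are the first three lines of the collected file. The second line, the record with timestamp "1720149989.173032110", shows that the blank line was not handled properly: it was emitted as a record of its own.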

[root@cdp-10-191-193-8 logtest]# head -n 3 /vdata/logtest/regex-fluent 
regex-fluent: [1720149989.173026110, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":"329 2023-09-11 17:03:29 [/data01/heht30/guoc//stl_jsdev/src/stl_main.c:main:328] [INFO][REPEAT_MESSAGE=2|MESSAGE_SEQUENCE=3105:72:239|THIS_WORKFLOW_ID=72|INSERT_TIME=20230911164216|THIS_NODE_ID=3|GROUP_ID=3105|WORKFLOW_ID=72|FILE_NAME=991000PTSVDA022023090116581100000005.filefmt|PROVINCE_ID=99|FILE_PATH=/jzjs_month/spcp//2799/20230901/99|FMT_NORMAL_REC=10853],    批价开始时间=[2023-09-11 17:03:29]","paas_instance_name":"test10"}]
regex-fluent: [1720149989.173032110, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":" ","paas_instance_name":"test10"}]
regex-fluent: [1720149989.173032547, {"pod_log_path":"/var/log/pods/logtest/20230911_filerate-3105-9c659b484-kfp5j.0.log","paas_host_ip":"127.0.0.1","paas_log_belong":"user","pod_namespace":"default","paas_log_type":"middleware","pod_name":"test-0","paas_collection_type":"userfile","pod_container_name":"test","paas_account_id":"123456789","paas_region_id":"lftst","paas_manager_ip":"127.0.0.1","paas_product_id":"ccc","log":"329 2023-09-11 17:03:29 [/data01/heht30/guoc//stl_jsdev/src/stl_main.c:main:341] [ALERT]REPEAT_MESSAGE=2|MESSAGE_SEQUENCE=3105:72:239|THIS_WORKFLOW_ID=72|INSERT_TIME=20230911164216|THIS_NODE_ID=3|GROUP_ID=3105|WORKFLOW_ID=72|FILE_NAME=991000PTSVDA022023090116581100000005.filefmt|PROVINCE_ID=99|FILE_PATH=/jzjs_month/spcp//2799/20230901/99|FMT_NORMAL_REC=10853 ","paas_instance_name":"test10"}]
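Blank records like the second one above can be counted automatically. Here is a minimal sketch, assuming each line of the output file has the shape shown above (tag, then ": ", then a JSON array of timestamp and record). The function name blank_log_records is hypothetical, and the sample lines are shortened stand-ins for the real records.

```python
import json


def blank_log_records(lines, key="log"):
    """Return 1-based numbers of lines whose `key` field is empty or whitespace only."""
    blanks = []
    for number, line in enumerate(lines, start=1):
        # Split off the tag; the remainder is a JSON array: [timestamp, {record}].
        _tag, _sep, payload = line.partition(": ")
        _timestamp, record = json.loads(payload)
        if not str(record.get(key, "")).strip():
            blanks.append(number)
    return blanks


# Hypothetical, shortened sample in the same shape as the collected file.
sample = [
    'regex-fluent: [1720149989.173026110, {"log":"329 2023-09-11 17:03:29 [INFO] first"}]',
    'regex-fluent: [1720149989.173032110, {"log":" "}]',
    'regex-fluent: [1720149989.173032547, {"log":"329 2023-09-11 17:03:29 [ALERT] third"}]',
]
print(blank_log_records(sample))
```

The sketch reports line numbers rather than timestamps, because parsing the nanosecond timestamps as JSON floats would lose precision.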

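Conclusion

Both Fluent Bit 1.9.4 and 3.0.2 pass the character set conversion check with this Lua script, which confirms that the transcoding is accurate.
Because the old-version regex does not handle blank lines, all subsequent character set tests are based on the new-version regex. For the same reason, I will not run a performance comparison between the old and new regexes: the old one is functionally incomplete, so testing it is no longer worthwhile.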
