
Adapting a PostgreSQL test script to run on CedarDB and analyzing the logs

First, download the binary from the official CedarDB website.

root@66d4e20ec1d7:/par/cedar10# tar xf "../cedar-legacy-arm64.tar.xz"
root@66d4e20ec1d7:/par/cedar10# ls
cedar
root@66d4e20ec1d7:/par/cedar10# cd cedar
root@66d4e20ec1d7:/par/cedar10/cedar# ls
README.md  catalogupgrade.py  cedardb
root@66d4e20ec1d7:/par/cedar10/cedar# cat README.md

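Start the interactive shell as described in the README.
CedarDB does not support stored procedures either (a DO statement is rejected, as the log below shows), so I started from the DuckDB version of the script and modified that.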

2025-11-05 04:00:31.017900630 UTC	ERROR:   DO not implemented yet

CedarDB has neither a printf function nor a make_interval function:

2025-11-05 04:01:25.329726210 UTC	ERROR:   unknown function or overload printf(text, double precision)
at:   now() - printf('%.0f days', random() * 365) *error* ::interval
2025-11-05 04:08:58.400727980 UTC	ERROR:   unknown function or overload make_interval(integer)
at:   now() - make_interval(days => floor(random() * 365)::int) *error* 

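It does, however, support multiplying an interval literal by a random number. Judging from the result below (roughly a year before the time of the test), the unit-less literal '365' appears to be read as days here; PostgreSQL reads a unit-less interval literal as seconds, so writing interval '365 days' would be less ambiguous.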

> select now() - interval '365'*random();
?column?
2024-11-15 07:11:12.946666+08

The modified script is as follows.

-- ========================================
--  Step 1. Create the base tables
-- ========================================
DROP TABLE IF EXISTS orders CASCADE;
DROP TABLE IF EXISTS customers CASCADE;
DROP TABLE IF EXISTS products CASCADE;
CREATE SEQUENCE id_sequence;
CREATE SEQUENCE id_sequence2;
CREATE SEQUENCE id_sequence3;
CREATE TABLE customers (
    customer_id int PRIMARY KEY DEFAULT nextval('id_sequence'),
    name TEXT,
    region TEXT,
    join_date TIMESTAMP
);
CREATE TABLE products (
    product_id int PRIMARY KEY DEFAULT nextval('id_sequence2'),
    name TEXT,
    category TEXT,
    price NUMERIC(10,2)
);
CREATE TABLE orders (
    order_id int PRIMARY KEY DEFAULT nextval('id_sequence3'),
    customer_id INT REFERENCES customers(customer_id),
    product_id INT REFERENCES products(product_id),
    quantity INT,
    amount NUMERIC(12,2),
    order_date TIMESTAMP
);
-- ========================================
--  Step 2. Bulk-insert the test data
-- ========================================
-- 2.1 Insert customers (1 million rows)
INSERT INTO customers (name, region, join_date)
SELECT
    'Customer_' || (gs + (b-1)*100000),
    CASE WHEN random() < 0.33 THEN 'APAC' WHEN random() < 0.66 THEN 'EMEA' ELSE 'AMER' END,
    now() - interval '365'*random()
FROM generate_series(1,100000) AS gs(gs), generate_series(1,10) b(b);
-- 2.2 Insert products (1 million rows)
INSERT INTO products (name, category, price)
SELECT
    'Product_' || (gs + (b-1)*100000),
    CASE WHEN random() < 0.5 THEN 'Electronics' WHEN random() < 0.8 THEN 'Clothing' ELSE 'Food' END,
    round((random() * 500 + 1)::numeric, 2)
FROM generate_series(1,100000) AS gs(gs), generate_series(1,10) b(b);
-- 2.3 Insert orders (1 million rows)
INSERT INTO orders (customer_id, product_id, quantity, amount, order_date)
SELECT
    floor(random() * 1000000 + 1)::int,
    floor(random() * 1000000 + 1)::int,
    floor(random() * 5 + 1)::int,
    round(((floor(random()*5)+1) * (random()*500 + 1))::numeric, 2),
    now() - interval '365'*random()
FROM generate_series(1,100000) AS gs(gs), generate_series(1,10) b(b);
-- ========================================
--  Step 3. Create indexes
-- ========================================
CREATE INDEX idx_orders_cust ON orders(customer_id);
CREATE INDEX idx_orders_prod ON orders(product_id);
CREATE INDEX idx_orders_date ON orders(order_date);
CREATE INDEX idx_customers_region ON customers(region);
CREATE INDEX idx_products_cat ON products(category);
VACUUM ANALYZE;
-- ========================================
--  Step 4. A complex query (CTEs + subquery + aggregation + JOIN)
-- ========================================
EXPLAIN ANALYZE
WITH recent_orders AS (
    SELECT o.order_id, o.customer_id, o.product_id, o.amount, o.order_date
    FROM orders o
    WHERE o.order_date > now() - interval '180 days'
),
customer_region_sales AS (
    SELECT c.region, SUM(r.amount) AS total_sales, COUNT(DISTINCT r.customer_id) AS unique_customers
    FROM recent_orders r
    JOIN customers c ON r.customer_id = c.customer_id
    GROUP BY c.region
),
top_products AS (
    SELECT p.category, SUM(r.amount) AS total_sales
    FROM recent_orders r
    JOIN products p ON r.product_id = p.product_id
    GROUP BY p.category
)
SELECT cr.region, tp.category, cr.total_sales AS region_sales, tp.total_sales AS category_sales,
    (cr.total_sales / tp.total_sales)::numeric(10,2) AS ratio
FROM customer_region_sales cr
JOIN top_products tp ON tp.total_sales > 0
WHERE cr.total_sales > (SELECT AVG(total_sales) FROM customer_region_sales)
ORDER BY ratio DESC
LIMIT 20;

The run looks like this.

root@66d4e20ec1d7:/par/cedar10/cedar# ./cedardb -i --createdb mydb
root@66d4e20ec1d7:/par/cedar10/cedar# ./cedardb -i mydb
2025-11-05 04:06:23.002092990 UTC	INFO:    Using 3467 MB buffers, 3467 MB work memory
2025-11-05 04:06:23.859818070 UTC	WARNING: Your storage device uses write back caching. Write durability might be delayed. See: https://cedardb.com/docs/references/writecache
2025-11-05 04:06:23.859866960 UTC	INFO:    You're running CEDARDB COMMUNITY EDITION - using 0 GB out of 64 GB. Our General Terms and Conditions apply to the use of the CedarDB Community Edition. Run "cedardb --license" for more information.
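> \timing on
2025-11-05 04:21:50.116719930 UTC	ERROR:   The query contains a large implicit cross product. This is most likely due to unintended missing join conditions and requires a fix to the query. If you really want that cross product please use an explicit CROSS JOIN. Or disable this check with SET implicit_cross_products = ON.

The script failed: by default CedarDB refuses to run a large implicit cross product, presumably the comma-separated generate_series lists in the INSERT statements. The error message offers two ways out, an explicit CROSS JOIN or a setting that disables the check. I set the parameter and ran the script again.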

> SET implicit_cross_products = ON;
> \i /par/test_pgcd.txt
2025-11-05 04:22:39.993870960 UTC	INFO:     [s] execution: (0.000013 min, 0.000013 max, 0.000013 median, 0.0% relMAD, 0.000013 avg, 0.000000 sdev, 1 scale, nan IPC, nan CPUs, nan GHz) compilation: (0.000031 min, 0.000031 max, 0.000031 median, 0.0% relMAD, 0.000031 avg, nan sdev)
...
🖩 OUTPUT ()
▼ SORT (In-Memory) (Card: 0, Estimate: 3, Time: 0 ms (1 %))
𝚾 MAP (Card: 0, Estimate: 3)
⨝ JOIN (inner, bnljoin) (Card: 0, Estimate: 3)
├───σ SELECT (Card: 0, Estimate: 3)
│   𝚪 GROUP BY (In-Memory) (Card: 0, Estimate: 3, Time: 0 ms (0 %))
│   ⨝ JOIN (inner, hashjoin) (Materialized: 11 MB, Utilization: 163 %, Card: 0, Estimate: 626'633)
│   ├───🗐 TABLESCAN on orders (num IOs: 0, Fetched: 0 B, Card: 0, Estimate: 500'000, Time: 0 ms (1 %))
│   └───🗐 TABLESCAN on products (num IOs: 0, Fetched: 0 B, Card: 0, Estimate: 1'000'000, Time: 0 ms (1 %))
└───⨝ JOIN (inner, singletonjoin) (Card: 0, Estimate: 1)
    ├───σ SELECT (Card: 0, Estimate: 1)
    │   𝚪 GROUP BY (In-Memory) (Card: 0, Estimate: 1, Time: 0 ms (0 %))
    │   σ SELECT (Card: 0, Estimate: 2)
    │   🗏 CTE Scan on CTE Alder (Card: 0, Estimate: 2, Time: 0 ms (0 %))
    └───σ SELECT (Card: 0, Estimate: 2)
        🗏 CTE Scan on CTE Alder (Card: 0, Estimate: 2, Time: 1 ms (93 % ***))

This plan uses:
> CTE Alder:
t TEMP (Card: 0, Estimate: 2)
σ SELECT (Card: 0, Estimate: 2)
𝚪 GROUP BY (In-Memory) (Card: 0, Estimate: 2, Time: 0 ms (0 %))
⨝ JOIN (inner, hashjoin) (Materialized: 11 MB, Utilization: 163 %, Card: 0, Estimate: 608'021)
├───🗐 TABLESCAN on orders (num IOs: 0, Fetched: 0 B, Card: 0, Estimate: 500'000, Time: 0 ms (2 %))
└───🗐 TABLESCAN on customers (num IOs: 0, Fetched: 0 B, Card: 0, Estimate: 1'000'000, Time: 0 ms (2 %))
2025-11-05 04:22:56.983110170 UTC	INFO:     [s] execution: (0.004108 min, 0.004108 max, 0.004108 median, 0.0% relMAD, 0.004108 avg, 0.000000 sdev, 4000001 scale, nan IPC, nan CPUs, nan GHz) compilation: (0.017746 min, 0.017746 max, 0.017746 median, 0.0% relMAD, 0.017746 avg, 0.000000 sdev)
> 
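For comparing actual and estimated cardinalities, the Card and Estimate values can be pulled out of the plan text above with a small script. This is only a sketch: the function name `parse_plan_line` is hypothetical, and it assumes every operator line carries a `Card: N, Estimate: M` pair with `'` as the thousands separator, as in the plan shown here.

```python
import re

# "Card: 0, Estimate: 626'633" -> two digit groups, possibly with ' separators
CARD_RE = re.compile(r"Card: ([\d']+), Estimate: ([\d']+)")


def parse_plan_line(line):
    """Return (operator, card, estimate) for one plan line, or None.

    Assumes the line format of the EXPLAIN ANALYZE output shown above.
    """
    m = CARD_RE.search(line)
    if m is None:
        return None
    # Operator label: text before the first "(", without tree-drawing
    # characters and operator icons (anything that is not an ASCII letter
    # or a space is dropped).
    label = re.sub(r"[^A-Za-z ]", "", line.split("(", 1)[0]).strip()
    card = int(m.group(1).replace("'", ""))
    estimate = int(m.group(2).replace("'", ""))
    return label, card, estimate


sample = ("│   ├───🗐 TABLESCAN on orders (num IOs: 0, Fetched: 0 B, "
          "Card: 0, Estimate: 500'000, Time: 0 ms (1 %))")
print(parse_plan_line(sample))
```

Lines without a Card/Estimate pair, such as the timing lines or `This plan uses:`, return `None` and can simply be skipped.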

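CedarDB's tree rendering of the execution plan is concise, but its timing log lines are verbose and hard to scan, so I saved them to a file. The shell command below prints, for each statement, the total of execution and compilation time followed by the two parts, all in seconds. Assuming the file holds timing lines in the same format as the log above, field $9 is the execution "max" value and field $32 the compilation "median" value; for a single run these equal the min and avg values.

awk '{printf "%f %s %s\n", $9+$32, $9,$32}' /shujv/par/tim.txt
0.000044 0.000013 0.000031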
0.000012 0.000004 0.000008
0.000009 0.000003 0.000006
0.000035 0.000024 0.000011
0.000016 0.000009 0.000007
0.000014 0.000008 0.000006
0.019229 0.019162 0.000067
0.019442 0.019352 0.000090
0.071536 0.071431 0.000105
1.799576 1.619629 0.179947
1.706715 1.650359 0.056356
4.701688 4.673385 0.028303
0.295118 0.295042 0.000076
0.185758 0.185697 0.000061
0.177300 0.177245 0.000055
0.427089 0.427038 0.000051
0.311687 0.311618 0.000069
0.000052 0.000026 0.000026
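Positional fields such as `$9` and `$32` depend on the exact spacing of the log line. As an alternative, here is a minimal Python sketch that finds the values by their labels instead. The names `TIMING_RE` and `parse_timing` are hypothetical, and the sketch assumes the timing line format shown in the session log above; it reads the "min" value of each group, which is the same as the other statistics when a statement runs once.

```python
import re

# First number after "execution: (" and after "compilation: (",
# i.e. the "min" value of each group in a timing line.
TIMING_RE = re.compile(
    r"execution: \(([0-9.]+) min.*?compilation: \(([0-9.]+) min"
)


def parse_timing(line):
    """Return (total, execution, compilation) in seconds, or None."""
    m = TIMING_RE.search(line)
    if m is None:
        return None
    execution = float(m.group(1))
    compilation = float(m.group(2))
    return execution + compilation, execution, compilation


# One timing line copied from the session log above.
sample = (
    "2025-11-05 04:22:39.993870960 UTC\tINFO:     [s] execution: "
    "(0.000013 min, 0.000013 max, 0.000013 median, 0.0% relMAD, "
    "0.000013 avg, 0.000000 sdev, 1 scale, nan IPC, nan CPUs, nan GHz) "
    "compilation: (0.000031 min, 0.000031 max, 0.000031 median, "
    "0.0% relMAD, 0.000031 avg, nan sdev)"
)

total, execution, compilation = parse_timing(sample)
print(f"{total:.6f} {execution:.6f} {compilation:.6f}")
```

Applied to every line of the saved log file, this yields the same three columns as the awk command, and lines that are not timing lines are ignored instead of producing zeros.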

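As the output shows, the most time-consuming part is still the three INSERT statements (rows 10 to 12, about 1.8 s, 1.7 s and 4.7 s), but each of them is faster than in DuckDB.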
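To put a number on that, the per-statement totals can be grouped by statement type. The sketch below copies the first column of the awk output; the mapping of rows to statements is an assumption, namely that the 18 rows follow the script order from the DROP TABLE statements through VACUUM ANALYZE. The helper name `phase_totals` is hypothetical.

```python
# Total seconds per statement, copied from the first column of the awk output.
totals = [
    0.000044, 0.000012, 0.000009, 0.000035, 0.000016, 0.000014,
    0.019229, 0.019442, 0.071536, 1.799576, 1.706715, 4.701688,
    0.295118, 0.185758, 0.177300, 0.427089, 0.311687, 0.000052,
]

# ASSUMPTION: rows appear in script order, with this many statements per type.
phases = [
    ("DROP TABLE", 3), ("CREATE SEQUENCE", 3), ("CREATE TABLE", 3),
    ("INSERT", 3), ("CREATE INDEX", 5), ("VACUUM ANALYZE", 1),
]


def phase_totals(totals, phases):
    """Sum consecutive slices of `totals`, one slice per (name, count) pair."""
    result = {}
    pos = 0
    for name, count in phases:
        result[name] = sum(totals[pos:pos + count])
        pos += count
    return result


by_phase = phase_totals(totals, phases)
grand_total = sum(totals)
for name, secs in by_phase.items():
    print(f"{name:16s} {secs:9.6f} s  {100 * secs / grand_total:5.1f} %")
```

Under that mapping the three INSERT statements take about 8.21 s of roughly 9.72 s in total, around 84 %, and the five CREATE INDEX statements take about 1.40 s.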
