Experimenting with multi-table joins in vanna
- 1. Experiment overview
- 2. Database preparation
- 3. Starting the VANNA program
- 4. Interaction
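1. Experiment overview
This experiment tests multi-table join queries with vanna. The results are decent, but it takes several rounds of interaction to get there.
For environment setup, see my other article.
The database is postgresql, and the LLM is qwen3:8b.
vanna is installed with: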
pip install 'vanna[chromadb,ollama,mysql,postgresql]' -i https://pypi.tuna.tsinghua.edu.cn/simple/
2. Database preparation
Install postgresql yourself; a containerized install is more convenient. After installation, create a database named demodb.
-- Run the following in the demodb database
-- Create the customer table
CREATE TABLE cust_info (
    cust_id INT PRIMARY KEY,          -- customer ID (primary key)
    cust_name VARCHAR(50) NOT NULL    -- customer name (not null)
);
COMMENT ON TABLE cust_info IS '客户基本信息表';
COMMENT ON COLUMN cust_info.cust_id IS '唯一客户标识';
COMMENT ON COLUMN cust_info.cust_name IS '客户全名';

-- Create the account table (with a foreign key constraint)
CREATE TABLE acct_info (
    acct_id SERIAL PRIMARY KEY,       -- account ID (auto-increment primary key)
    cust_id INT NOT NULL,             -- related customer ID
    bal NUMERIC(12,2) DEFAULT 0.00,   -- account balance (default 0, precise to the cent)
    FOREIGN KEY (cust_id) REFERENCES cust_info(cust_id) ON DELETE CASCADE
);
COMMENT ON TABLE acct_info IS '客户账户信息表';
COMMENT ON COLUMN acct_info.acct_id IS '唯一账户标识';
COMMENT ON COLUMN acct_info.cust_id IS '关联客户ID(外键)';
COMMENT ON COLUMN acct_info.bal IS '账户余额(单位:元)';

INSERT INTO cust_info (cust_id, cust_name) VALUES
(1, '张明'), (2, '李华'), (3, '王芳'), (4, '刘洋'),
(5, '陈思'), (6, '赵雷'), (7, '周琪'), (8, '吴越'),
(9, '郑宇'), (10, '孙琳');

INSERT INTO acct_info (cust_id, bal) VALUES
-- 6 accounts for customer 1
(1, 15200.50), (1, 8730.00), (1, 42150.75),
(1, 9300.25), (1, 15600.00), (1, 3200.40),
-- 3 accounts for customer 2
(2, 78000.00), (2, 14500.60), (2, 9200.30),
-- 8 accounts for customer 3
(3, 12500.00), (3, 36700.50), (3, 8900.25),
(3, 15400.75), (3, 23000.00), (3, 4200.90),
(3, 17600.30), (3, 29500.45),
-- ... accounts for the other customers (50 rows in total)
(10, 45000.00), (10, 12800.20), (10, 7600.80), (10, 21500.35);
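3. Starting the VANNA program
Most of the code is left as is; I only changed the model used when connecting to ollama and the database connection. The vn.train part is optional.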
from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore
from vanna.flask import VannaFlaskApp


class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)


vn = MyVanna(config={'model': 'qwen3:8b', 'ollama_host': 'http://192.168.184.1:11434'})

# vn.connect_to_mysql(host='192.168.184.190', dbname='demodb', user='root', password='PG_Dev2022a', port=3306)
vn.connect_to_postgres(host='192.168.184.190', dbname='demodb', user='postgres', password='PG_Dev2022a', port=5432)

# The information schema query may need some tweaking depending on your database. This is a good starting point.
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM.
# Note: the plan is only built here; it is not applied unless passed to vn.train(plan=plan).
plan = vn.get_training_plan_generic(df_information_schema)

# Train on the DDL of both tables so the model sees the foreign key between them.
vn.train(ddl="""-- Create the customer table
CREATE TABLE cust_info (
    cust_id INT PRIMARY KEY,          -- customer ID (primary key)
    cust_name VARCHAR(50) NOT NULL    -- customer name (not null)
);
COMMENT ON TABLE cust_info IS '客户基本信息表';
COMMENT ON COLUMN cust_info.cust_id IS '唯一客户标识';
COMMENT ON COLUMN cust_info.cust_name IS '客户全名';

-- Create the account table (with a foreign key constraint)
CREATE TABLE acct_info (
    acct_id SERIAL PRIMARY KEY,       -- account ID (auto-increment primary key)
    cust_id INT NOT NULL,             -- related customer ID
    bal NUMERIC(12,2) DEFAULT 0.00,   -- account balance (default 0, precise to the cent)
    FOREIGN KEY (cust_id) REFERENCES cust_info(cust_id) ON DELETE CASCADE
);
COMMENT ON TABLE acct_info IS '客户账户信息表';
COMMENT ON COLUMN acct_info.acct_id IS '唯一账户标识';
COMMENT ON COLUMN acct_info.cust_id IS '关联客户ID(外键)';
COMMENT ON COLUMN acct_info.bal IS '账户余额(单位:元)';
""")

app = VannaFlaskApp(vn)
app.run()
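If you want to look at the generated SQL before opening the web UI, a small helper along these lines may be enough. This is only a sketch: ask_once is a hypothetical name, not part of vanna, and it assumes the object passed in exposes vanna's generate_sql and run_sql methods, as the vn instance above does. Since app.run() blocks, it would have to be called before that line.

```python
def ask_once(vn, question):
    """Generate SQL for one question, run it, and return (sql, result).

    `vn` is assumed to be a connected Vanna instance such as the MyVanna
    object built in the script above. `ask_once` itself is a hypothetical
    helper introduced only for illustration.
    """
    sql = vn.generate_sql(question)   # ask the LLM for a SQL statement
    result = vn.run_sql(sql)          # execute it on the connected database
    return sql, result


# Example use (needs the live vn object, so it is left commented out):
# sql, df = ask_once(vn, "Total account balance per customer")
# print(sql)
# print(df)
```

Printing the SQL first makes it easier to see whether the model picked the right join before judging the result table.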
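4. Interaction
Question: Aggregate account balances per customer and sort them in descending order. The fields to display are customer ID, customer name, balance, and rank. Note that the balance is numeric; if a customer has no account, the balance is 0.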
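For reference, one query shape that satisfies the question above is a LEFT JOIN from cust_info to acct_info, with COALESCE turning a missing sum into 0 and RANK() numbering the rows. The sketch below is not the SQL vanna produced in the experiment; it runs the idea against an in-memory SQLite database (assumed recent enough to support window functions) instead of postgresql. Customers 1 and 2 reuse a few rows from the article, while customer 4 is left without accounts purely as an assumption, to show the zero case.

```python
import sqlite3

# Tiny in-memory stand-in for demodb (SQLite, not postgresql).
# Customer 4 has no accounts here; that is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cust_info (
    cust_id INT PRIMARY KEY,
    cust_name VARCHAR(50) NOT NULL
);
CREATE TABLE acct_info (
    acct_id INTEGER PRIMARY KEY AUTOINCREMENT,
    cust_id INT NOT NULL REFERENCES cust_info(cust_id),
    bal NUMERIC(12,2) DEFAULT 0.00
);
INSERT INTO cust_info (cust_id, cust_name) VALUES
    (1, '张明'), (2, '李华'), (4, '刘洋');
INSERT INTO acct_info (cust_id, bal) VALUES
    (1, 15200.50), (1, 8730.00), (2, 78000.00);
""")

# LEFT JOIN keeps customers without accounts; COALESCE maps their NULL sum to 0;
# RANK() assigns the ranking number by total balance, highest first.
RANKING_SQL = """
SELECT c.cust_id,
       c.cust_name,
       COALESCE(SUM(a.bal), 0) AS total_bal,
       RANK() OVER (ORDER BY COALESCE(SUM(a.bal), 0) DESC) AS rank_no
FROM cust_info c
LEFT JOIN acct_info a ON a.cust_id = c.cust_id
GROUP BY c.cust_id, c.cust_name
ORDER BY total_bal DESC
"""

rows = conn.execute(RANKING_SQL).fetchall()
for row in rows:
    print(row)
```

With an INNER JOIN the customer without accounts would disappear from the result, and without COALESCE the balance would come back as NULL, which is the behaviour the hint in the question is meant to avoid.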
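Notes:
- In practice, it took several rounds of interaction to get this right. Without the hint that the balance is numeric, bal is shown as null in the ranking for customers with no matching account, which is clearly not what one would normally expect.
- For multi-table joins, the most important thing is to provide the relationships between the tables as training data and spell them out for the model; otherwise the model has to guess, and its guesses will be off.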
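One way to hand the table relationship to the model is to train it on a short piece of documentation plus one question and SQL pair, using vanna's train(documentation=...) and train(question=..., sql=...) forms. The following is a minimal sketch under assumptions: train_relationships is a hypothetical helper, and the wording of the documentation, the sample question, and the sample SQL are illustrative, not what was used in the experiment.

```python
# Illustrative training material; the wording is an assumption, not the
# text used in the original experiment.
RELATIONSHIP_DOC = (
    "acct_info.cust_id is a foreign key to cust_info.cust_id. "
    "One customer can have zero or more accounts. "
    "When totalling balances per customer, LEFT JOIN acct_info to cust_info "
    "and treat a missing balance as 0."
)

EXAMPLE_QUESTION = "Total account balance per customer, highest first"

EXAMPLE_SQL = """
SELECT c.cust_id,
       c.cust_name,
       COALESCE(SUM(a.bal), 0) AS total_bal
FROM cust_info c
LEFT JOIN acct_info a ON a.cust_id = c.cust_id
GROUP BY c.cust_id, c.cust_name
ORDER BY total_bal DESC
"""


def train_relationships(vn):
    """Feed the relationship description and one worked example to Vanna.

    `vn` is assumed to expose Vanna's train() method, like the MyVanna
    instance from section 3. This helper is hypothetical.
    """
    vn.train(documentation=RELATIONSHIP_DOC)
    vn.train(question=EXAMPLE_QUESTION, sql=EXAMPLE_SQL)


# Example use (needs the live vn object, so it is left commented out):
# train_relationships(vn)
```

The documentation entry states the join rule in words, and the question and SQL pair gives the model a concrete pattern to imitate; whether both are needed for qwen3:8b was not measured here.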