Spring AI 系列之三十六 - Spring AI Alibaba-nl2sql
之前做个几个大模型的应用,都是使用Python语言,后来有一个项目使用了Java,并使用了Spring AI框架。随着Spring AI不断地完善,最近它发布了1.0正式版,意味着它已经能很好的作为企业级生产环境的使用。对于Java开发者来说真是一个福音,其功能已经能满足基于大模型开发企业级应用。借着这次机会,给大家分享一下Spring AI框架。
注意:由于框架不同版本改造会有些使用的不同,因此本次系列中使用基本框架是 Spring AI-1.0.0,JDK版本使用的是19,Spring-AI-Alibaba-1.0.0.3-SNAPSHOT。
代码参考: https://github.com/forever1986/springai-study
目录
- 1 spring-ai-alibaba-starter-nl2sql
- 1.1 包的结构说明
- 1.2 底层原理
- 2 示例演示
- 2.1 前提准备
- 2.2 代码实现
- 2.3 演示效果
前几章讲了Spring AI Alibaba的Graph框架,它是构建智能体、工作流的基本框架。但是前几章都是在讲Graph一些常见的使用节点和流程,对于一个真正智能体来说,可能还需要更复杂的实现。这一章再来将一个Spring AI Alibaba基于Graph的扩展实现的框架:nl2sql 。在实际业务中,往往有些数据存储在关系型数据库中,而需要通过大模型自己将用户的问题生成SQL语句去查询数据库,这就是 nl2sql 的功能,在Spring AI Alibaba中就是通过Graph实现了该功能。
1 spring-ai-alibaba-starter-nl2sql
1.1 包的结构说明
spring-ai-alibaba-starter-nl2sql :自然语言到 SQL 转换组件。
<dependency><groupId>com.alibaba.cloud.ai</groupId><artifactId>spring-ai-alibaba-starter-nl2sql</artifactId>
</dependency>
当引入该组件时,可以看到其3个主要的jar,其中nl2sql-chat是最主要的实现逻辑,其中Nl2sqlConfiguration是最主要的配置:
1.2 底层原理
从上图中知道Nl2sqlConfiguration是配置的主要点,现在来看看Nl2sqlConfiguration的源码:
@Configuration
public class Nl2sqlConfiguration {// 主要的nl2sql的服务类@Autowired@Qualifier("nl2SqlServiceImpl")private BaseNl2SqlService nl2SqlService;// 将数据库的schema转换存入到向量数据库@Autowired@Qualifier("schemaServiceImpl")private BaseSchemaService schemaService;// 关系型数据库的连接@Autowiredprivate DbAccessor dbAccessor;// 关系型数据库的配置(目前支持mysql和postgrSql)@Autowiredprivate DbConfig dbConfig;// 构建图的关键@Beanpublic StateGraph nl2sqlGraph(ChatClient.Builder chatClientBuilder) throws GraphStateException {// 这里就不贴图的构建过程}
}
在Nl2sqlConfiguration的nl2sqlGraph方法中构建了一个StateGraph图,从前面知道StateGraph就是构建智能体的框架,因此nl2sqlGraph就是一个将自然语言转换为可执行SQL的智能体,其流程结构如下图:
下面说一下上面StateGraph图的各个节点作用:
1)START : 开始节点
2)QUERY_REWRITE_NODE :查询重写。问题重写与意图澄清,提升意图理解准确性。完成后需要通过condition校验:
- 校验失败:直接跳转到 END 节点,终止流程
- 校验成功:则进入KEVNORD_EXTRACT_NODE 节点
3)KEVNORD_EXTRACT_NODE :关键词提取。关键词、实体、时间等信息抽取,为后续 Schema 召回做准备。
4)SCHEMA_RECALL_NODE :模式召回。根据关键词和意图,召回相关表、字段、关系等数据库 Schema 信息。
5)TABLE_RELATION_NODE:表关系解析。推理表与字段间的关系,自动补全 Join、外键等复杂结构。
6)SQL_GENERATE_NODE :生成 SQL 语句。通过condition的校验:
- 生成成功且无需验证:直接输出结果(END)
- 生成成功: 进入 SQL_VALIDATE_NODE 节点
- 生成失败:回退至KEVNORD_EXTRACT_NODE 节点,重新进行关键词提取
7)SQL_VALIDATE_NODE:校验 SQL 语句。通过condition校验:
- 校验失败: 返回SQL_GENERATE_NODE节点,重新生成SQL
- 校验通过: 进入SEMANTIC_CONSISTENC_NODE 节点,进行语义一致性校验
8)SEMANTIC_CONSISTENC_NODE:语义一致性校验。比对SQL逻辑与原始查询意图的吻合度。经condition校验:·
- 校验不一致:返回SQL_GENERATE_NODESQL 节点,重新生成
- 校验一致:直接跳转到 END 节点,终止流程
9)END:结束节点
从上面可以知道Spring AI Alibaba-nl2sql的底层原理就是使用StateGraph构建一个智能体,下面就通过一个示例演示如何使用
2 示例演示
代码参考lesson27子模块
示例说明:通过创建用户表、商品表、订单表、订单明细表、商品分类表,然后通过问题生成对应的SQL
2.1 前提准备
1)准备mysql数据库,创建一个springai数据库,并执行一下初始化语句
-- 用户表
CREATE TABLE users (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '用户ID,主键自增',username VARCHAR(50) NOT NULL COMMENT '用户名',email VARCHAR(100) NOT NULL COMMENT '用户邮箱',created_at DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT '用户注册时间'
) COMMENT='用户表';-- 商品表
CREATE TABLE products (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '商品ID,主键自增',name VARCHAR(100) NOT NULL COMMENT '商品名称',price DECIMAL(10,2) NOT NULL COMMENT '商品单价',stock INT NOT NULL COMMENT '商品库存数量',created_at DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT '商品上架时间'
) COMMENT='商品表';-- 订单表
CREATE TABLE orders (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '订单ID,主键自增',user_id INT NOT NULL COMMENT '下单用户ID',order_date DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT '下单时间',total_amount DECIMAL(10,2) NOT NULL COMMENT '订单总金额',status VARCHAR(20) DEFAULT 'pending' COMMENT '订单状态(pending/completed/cancelled等)',FOREIGN KEY (user_id) REFERENCES users(id)
) COMMENT='订单表';-- 订单明细表
CREATE TABLE order_items (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '订单明细ID,主键自增',order_id INT NOT NULL COMMENT '订单ID',product_id INT NOT NULL COMMENT '商品ID',quantity INT NOT NULL COMMENT '购买数量',unit_price DECIMAL(10,2) NOT NULL COMMENT '下单时商品单价',FOREIGN KEY (order_id) REFERENCES orders(id),FOREIGN KEY (product_id) REFERENCES products(id)
) COMMENT='订单明细表';-- 商品分类表
CREATE TABLE categories (id INT PRIMARY KEY AUTO_INCREMENT COMMENT '分类ID,主键自增',name VARCHAR(50) NOT NULL COMMENT '分类名称'
) COMMENT='商品分类表';-- 商品-分类关联表(多对多)
CREATE TABLE product_categories (product_id INT NOT NULL COMMENT '商品ID',category_id INT NOT NULL COMMENT '分类ID',PRIMARY KEY (product_id, category_id),FOREIGN KEY (product_id) REFERENCES products(id),FOREIGN KEY (category_id) REFERENCES categories(id)
) COMMENT='商品与分类关联表';-- 用户表插入数据
INSERT INTO users (username, email) VALUES
('alice', 'alice@example.com'),
('bob', 'bob@example.com'),
('cathy', 'cathy@example.com'),
('daniel', 'daniel@example.com'),
('emily', 'emily@example.com');-- 商品分类插入数据
INSERT INTO categories (name) VALUES
('电子产品'),
('服装'),
('图书'),
('家居用品'),
('食品');-- 商品表插入数据
INSERT INTO products (name, price, stock) VALUES
('智能手机', 2999.00, 100),
('T恤衫', 89.00, 500),
('小说', 39.00, 200),
('咖啡机', 599.00, 50),
('牛奶', 15.00, 300),
('笔记本电脑', 4999.00, 30),
('沙发', 2599.00, 10),
('巧克力', 25.00, 100),
('羽绒服', 399.00, 80),
('历史书', 69.00, 150);-- 商品-分类关联数据
INSERT INTO product_categories (product_id, category_id) VALUES
(1, 1), -- 智能手机-电子产品
(2, 2), -- T恤衫-服装
(3, 3), -- 小说-图书
(4, 1), (4, 4), -- 咖啡机-电子产品、家居用品
(5, 5), -- 牛奶-食品
(6, 1), -- 笔记本电脑-电子产品
(7, 4), -- 沙发-家居用品
(8, 5), -- 巧克力-食品
(9, 2), -- 羽绒服-服装
(10, 3); -- 历史书-图书-- 订单表插入数据
INSERT INTO orders (user_id, total_amount, status, order_date) VALUES
(1, 3088.00, 'completed', '2025-06-01 10:10:00'),
(2, 39.00, 'pending', '2025-06-02 09:23:00'),
(3, 1204.00, 'completed', '2025-06-03 13:45:00'),
(4, 65.00, 'cancelled', '2025-06-04 16:05:00'),
(5, 5113.00, 'completed', '2025-06-05 20:12:00'),
(1, 814.00, 'completed', '2025-06-05 21:03:00'),
(2, 424.00, 'pending', '2025-06-06 08:10:00'),
(3, 524.00, 'completed', '2025-06-06 14:48:00'),
(4, 399.00, 'completed', '2025-06-07 10:15:00'),
(5, 129.00, 'pending', '2025-06-07 18:00:00');-- 订单明细表插入数据
INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES
(1, 1, 1, 2999.00),
(1, 2, 1, 89.00),
(2, 3, 1, 39.00),
(3, 4, 2, 599.00),
(3, 5, 2, 3.00),
(4, 8, 2, 25.00),
(4, 5, 1, 15.00),
(5, 6, 1, 4999.00),
(5, 2, 1, 89.00),
(5, 5, 5, 5.00),
(5, 8, 1, 25.00),
(6, 9, 2, 399.00),
(6, 3, 1, 16.00),
(7, 2, 2, 89.00),
(7, 3, 3, 39.00),
(8, 10, 4, 69.00),
(9, 9, 1, 399.00),
(10, 8, 4, 25.00),
(10, 5, 1, 29.00);-- 补充部分商品价格波动或促销模拟
-- 注:部分订单明细使用了与商品表不同的 unit_price,模拟促销活动
2)新建lesson27子模块,其pom引入如下:
<properties><!-- Spring-boot版本需要改为3.4.5及以上,因为nl2sql本身引用的版本 --><spring-boot.version>3.4.5</spring-boot.version>
</properties><dependencies><dependency><groupId>com.alibaba.cloud.ai</groupId><artifactId>spring-ai-alibaba-starter-nl2sql</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-web</artifactId></dependency>
</dependencies>
注意:由于之前的示例使用spring-boot是3.3.0版本,而Spring AI Alibaba-nl2sql使用的Spring-boot版本是3.4.5以上,因此这里重新定义一下版本号
2.2 代码实现
1)新建application.properties配置文件
# 聊天模型
spring.ai.openai.api-key=你的阿里百炼API KEY
spring.ai.openai.model=qwen-max# embedding的api-key
spring.ai.dashscope.api-key=你的阿里百炼API KEY# 数据库配置
chatbi.dbconfig.url=jdbc:mysql://127.0.0.1:3306/springai?serverTimezone=UTC
chatbi.dbconfig.username=root
chatbi.dbconfig.password=root
#chatbi.dbconfig.schema=true
chatbi.dbconfig.connection-type=jdbc
chatbi.dbconfig.dialect-type=mysql
注意:这里使用的是阿里的千问模型,由于需要将schema进行向量存储,因此还配置dashscope的api-key,是为了加载embededing模型(也是使用阿里线上模型)
2)新建SimpleChatController 演示类:
import com.alibaba.cloud.ai.constant.Constant;
import com.alibaba.cloud.ai.dbconnector.DbConfig;
import com.alibaba.cloud.ai.graph.CompiledGraph;
import com.alibaba.cloud.ai.graph.OverAllState;
import com.alibaba.cloud.ai.graph.StateGraph;
import com.alibaba.cloud.ai.request.SchemaInitRequest;
import com.alibaba.cloud.ai.service.simple.SimpleVectorStoreService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;import java.util.Arrays;
import java.util.Map;
import java.util.Optional;@RestController
public class SimpleChatController {@Autowiredprivate StateGraph nl2sqlGraph;@Autowiredprivate SimpleVectorStoreService simpleVectorStoreService;@Autowiredprivate DbConfig dbConfig;@GetMapping("/simpleChat")public String simpleNl2Sql(@RequestParam(value = "message", defaultValue = "查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。", required = true) String message) throws Exception {SchemaInitRequest schemaInitRequest = new SchemaInitRequest();// 配置数据库schemaInitRequest.setDbConfig(dbConfig);// 将表名和字段等信息放入向量数据库schemaInitRequest.setTables(Arrays.asList("categories","order_items","orders","products","users","product_categories"));simpleVectorStoreService.schema(schemaInitRequest);// 执行图CompiledGraph compiledGraph = nl2sqlGraph.compile();Optional<OverAllState> invoke = compiledGraph.invoke(Map.of(Constant.INPUT_KEY, message));// 结果返回OverAllState overAllState = invoke.get();return overAllState.value(Constant.RESULT).get().toString();}
}
3)新建Lesson27Application 启动类:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;// 注意:这里加载com.alibaba.cloud.ai包下,是因为Spring AI Alibaba-nl2sql并不是自动配置,因此需要扫描其路径,同时也要扫描自己路径下的包
@SpringBootApplication(scanBasePackages = {"com.alibaba.cloud.ai","com.demo.lesson27"})
public class Lesson27Application {public static void main(String[] args) {SpringApplication.run(Lesson27Application.class, args);}
}
2.3 演示效果
1)访问:http://localhost:8080/simpleChat
2)在控制台可以看到执行的日志:(执行日志中有StateGraph 的执行过程)
2025-07-09T17:07:36.106+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : 进入 QueryRewriteNode 节点
2025-07-09T17:07:36.107+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : [QueryRewriteNode] 处理用户输入: 查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。
2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : [QueryRewriteNode] 问题重写结果: 查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。
2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.QueryRewriteDispatcher : [QueryRewriteDispatcher]进入KEYWORD_EXTRACT_NODE节点
2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : 进入 KeywordExtractNode 节点
2025-07-09T17:07:42.991+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : evidences:[] , keywords: [每个分类, 已经成交, 销量最高的商品, 销售总量]
2025-07-09T17:07:42.991+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : KeywordExtractNode 节点输出 evidences:[] , keywords: [每个分类, 已经成交, 销量最高的商品, 销售总量]
2025-07-09T17:07:42.992+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.SchemaRecallNode : 进入 SchemaRecallNode 节点
2025-07-09T17:07:43.401+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.SchemaRecallNode : [SchemaRecallNode] Schema召回结果 - 表文档数量: 6, 关键词相关列文档组数: 4
2025-07-09T17:07:43.403+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : 进入 TableRelationNode 节点
2025-07-09T17:07:43.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : [TableRelationNode] 执行常规Schema选择
2025-07-09T17:07:44.737+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : [TableRelationNode] Schema处理结果: SchemaDTO(name=springai, description=null, tableCount=null, table=[TableDTO(name=categories, description=商品分类表, column=[ColumnDTO(name=name, description=分类名称, enumeration=0, range=null, type=text, samples=null, data=null, mapping=null), ColumnDTO(name=id, description=分类ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=product_categories, description=商品与分类关联表, column=[ColumnDTO(name=product_id, description=商品ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=category_id, description=分类ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[product_id]), TableDTO(name=products, description=商品表, column=[ColumnDTO(name=id, description=商品ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=created_at, description=商品上架时间, enumeration=0, range=null, type=datetime, samples=null, data=null, mapping=null), ColumnDTO(name=price, description=商品单价, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=stock, description=商品库存数量, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=name, description=商品名称, enumeration=0, range=null, type=text, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=order_items, description=订单明细表, column=[ColumnDTO(name=id, description=订单明细ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=quantity, description=购买数量, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=unit_price, description=下单时商品单价, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=product_id, description=商品ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=order_id, description=订单ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=orders, description=订单表, column=[ColumnDTO(name=user_id, description=下单用户ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=id, description=订单ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=order_date, description=下单时间, enumeration=0, range=null, type=datetime, samples=null, data=null, mapping=null), ColumnDTO(name=total_amount, description=订单总金额, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=status, description=订单状态(pending/completed/cancelled等), enumeration=0, range=null, type=text, samples=null, data=null, mapping=null)], primaryKeys=[id])], foreignKeys=[[order_items.order_id=orders.id, product_categories.category_id=categories.id, orders.user_id=users.id, product_categories.product_id=products.id, order_items.product_id=products.id]])
2025-07-09T17:07:44.739+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点
2025-07-09T17:07:45.548+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 召回信息是否满足需求:是
2025-07-09T17:07:45.549+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 开始生成SQL
2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 生成的SQL为:SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales
FROM categories c
JOIN product_categories pc ON c.id = pc.category_id
JOIN products p ON pc.product_id = p.id
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
WHERE o.status = 'completed'
GROUP BY c.id, p.id
WITH ROLLUP
HAVING category_id IS NOT NULL AND product_id IS NOT NULL
ORDER BY category_id, total_sales DESC
LIMIT 18446744073709551615 OFFSET 0
2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : SqlGenerateNode 节点执行完成
2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales
FROM categories c
JOIN product_categories pc ON c.id = pc.category_id
JOIN products p ON pc.product_id = p.id
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
WHERE o.status = 'completed'
GROUP BY c.id, p.id
WITH ROLLUP
HAVING category_id IS NOT NULL AND product_id IS NOT NULL
ORDER BY category_id, total_sales DESC
LIMIT 18446744073709551615 OFFSET 0
2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE
2025-07-09T17:08:07.826+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点
2025-07-09T17:08:07.826+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales
FROM categories c
JOIN product_categories pc ON c.id = pc.category_id
JOIN products p ON pc.product_id = p.id
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
WHERE o.status = 'completed'
GROUP BY c.id, p.id
WITH ROLLUP
HAVING category_id IS NOT NULL AND product_id IS NOT NULL
ORDER BY category_id, total_sales DESC
LIMIT 18446744073709551615 OFFSET 0
2025-07-09T17:08:07.844+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-84} inited
2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证通过
2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: true
2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验通过,跳转到节点: SEMANTIC_CONSISTENC_NODE
2025-07-09T17:08:07.852+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 进入 SemanticConsistencNode 节点
2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果详情: 不通过,并附具体原因:1. **核心逻辑验证**:当前SQL在尝试找出每个分类下销量最高的商品,但使用了`GROUP BY c.id, p.id WITH ROLLUP HAVING category_id IS NOT NULL AND product_id IS NOT NULL`的方式来进行分组和过滤,这会导致每个分类下的所有商品都被列出,而不是只列出销量最高的那一个。正确的做法应该是先计算出每个商品的总销量,然后根据分类进行分区,最后筛选出每一分区中销量最高的商品。2. **致命问题**:结果集不准确。由于上述提到的问题,最终输出不会是每个类别中销量最高的单个商品信息,而是该类别中所有商品的信息列表(尽管经过了排序),这意味着未能满足“每个分类只返回销量最高的商品”的需求。为了修正这个问题,可以考虑使用窗口函数(如`ROW_NUMBER()`)来为每个类别的商品按销量排名,再从中选择排名第一的商品。例如:WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rankFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rank = 1;此修改建议属于解决核心业务逻辑错误所必需的调整,因此原SQL基于审计原则被判定为不通过。
2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果: false
2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验结果: false,跳转节点配置
2025-07-09T17:08:28.616+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验未通过,跳转到SQL生成节点:SQL_GENERATE_NODE
2025-07-09T17:08:28.618+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点
2025-07-09T17:08:28.618+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 语义一致性校验未通过,开始重新生成SQL
2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 重新生成的SQL为:WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rankFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rank = 1;
2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rankFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rank = 1;
2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE
2025-07-09T17:08:37.382+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点
2025-07-09T17:08:37.382+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rankFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rank = 1;
2025-07-09T17:08:37.398+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-85} inited
2025-07-09T17:08:37.401+08:00 ERROR 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证失败 - 原因: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rankFROM categories cJOIN product_categories pc ON c.id = pc.category_id' at line 8
2025-07-09T17:08:37.402+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: false
2025-07-09T17:08:37.402+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验未通过,跳转到节点: SQL_GENERATE_NODE
2025-07-09T17:08:37.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点
2025-07-09T17:08:37.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : SQL 语法校验未通过,开始重新生成SQL
2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 重新生成的SQL为:WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnkFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rnk = 1;
2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnkFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rnk = 1;
2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE
2025-07-09T17:08:49.462+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点
2025-07-09T17:08:49.462+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: WITH Sales AS (SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name,SUM(oi.quantity) AS total_sales,ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnkFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = 'completed'GROUP BY c.id, p.id
)
SELECT category_id, category_name, product_id, product_name, total_sales
FROM Sales
WHERE rnk = 1;
2025-07-09T17:08:49.476+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-86} inited
2025-07-09T17:08:49.480+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证通过
2025-07-09T17:08:49.481+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: true
2025-07-09T17:08:49.481+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验通过,跳转到节点: SEMANTIC_CONSISTENC_NODE
2025-07-09T17:08:49.482+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 进入 SemanticConsistencNode 节点
2025-07-09T17:08:55.608+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果详情: 通过该SQL语句满足了核心需求,正确地计算了每个分类下已经成交的商品销量,并且筛选出了每个分类中销量最高的商品及其销售总量。聚合逻辑、过滤条件(仅完成订单)以及输出字段都符合业务要求。未发现任何影响结果准确性的致命问题或无法简单修复的结构缺陷。
2025-07-09T17:08:55.608+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果: true
2025-07-09T17:08:55.609+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验结果: true,跳转节点配置
2025-07-09T17:08:55.609+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验通过,跳转到结束节点。
3)可以看到其生成的SQL:
with Sales as (
selectc.id as category_id,c.name as category_name,p.id as product_id,p.name as product_name,SUM(oi.quantity) as total_sales,row_number() over (partition by c.id
order bySUM(oi.quantity) desc) as rnk
fromcategories c
join product_categories pc onc.id = pc.category_id
join products p onpc.product_id = p.id
join order_items oi onp.id = oi.product_id
join orders o onoi.order_id = o.id
whereo.status = 'completed'
group byc.id,p.id )
selectcategory_id,category_name,product_id,product_name,total_sales
fromSales
wherernk = 1;
4)将SQL拿取Mysql执行,其结果如下,可以看到效果还是不错(注意:最终生成的效果与模型能力有关系,有些模型是专门用于SQL生成的,这些模型效果更好):
结语:本章通过讲解nl2sql使用方式,可以让用户便捷的搭建一个nl2sql智能体。但是nl2sql只能算是一个实现代码框架,还不能算是一个真正的智能体,Spring AI Alibaba在企业级应用中总结出很多实践经验,ChatBI(NL2SQL)就是一款轻量、高效、可扩展的 NL2SQL 智能体框架,让 Java 程序员可以快速构建基于自然语言的数据查询系统。同时在Spring AI Alibaba基础上,还实现了JManus(一款基于 Java 实现的,包含良好的前端 UI 交互界面的通用智能体产品)和DeepResearch(一款基于 Spring AI Alibaba Graph 实现的 DeepResearch 产品)两款真正智能体产品,有兴趣的朋友可以去了解一下。
Spring AI系列上一章:《Spring AI 系列之三十五 - Spring AI Alibaba-Graph框架之MCP》
Spring AI系列下一章:《Spring AI 系列之三十七 - Spring AI Alibaba-DocumentReader》