From National Standard to Automation: Parsing Resident ID Numbers with VSTO (Untested)
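Preface:
Accounting, construction cost estimation, and CAD drafting all follow national standards (GB). The long-term goal is a pipeline of "read the standard from a knowledge base → interact and process data through VSTO → automate accounting, cost estimation, and CAD drafting". As a first step, this post writes a small proof of concept.
The idea is to store the national standard GB11643-1999 《公民身份号码》 (Citizen identification number) in an AI knowledge base, for example as a PDF or Word document in Coze (扣子) or RAGflow, and to combine it with VSTO to build a tool that explains what each part of an ID number means. The core is closing the loop "read the standard from the knowledge base → interact and process data through VSTO → parse the ID number".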
1. Knowledge Base Interaction Layer (Python: reading the standard via RAGflow)
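# id_card_standard_rag.py
# NOTE: the `ragflow` import and the RAGFlow methods used below (add_document,
# build_index, query) are kept as originally written and are untested;
# check them against the RAGFlow SDK you actually have installed.
from ragflow import RAGFlow
import re


class IdCardStandardRAG:
    def __init__(self, doc_path):
        # Initialise RAGFlow and load the standard document
        self.rag = RAGFlow()
        self.doc_id = self.rag.add_document(doc_path)  # load the GB11643-1999 document
        self.rag.build_index()  # build the index

    def get_standard_info(self, question):
        """Query the standard for a rule about ID numbers."""
        return self.rag.query(question, doc_ids=[self.doc_id])

    def extract_rules(self):
        """Extract the key parsing rules from the standard."""
        rules = {}

        # The questions stay in Chinese because the indexed document is Chinese.
        # 1. Length rule; fall back to 18 when the answer contains no number
        len_resp = self.get_standard_info("身份证号码总长度是多少位?")
        match = re.search(r'\d+', str(len_resp)) if len_resp else None
        rules['length'] = int(match.group()) if match else 18

        # 2. Address code rule
        addr_resp = self.get_standard_info("前6位地址码的含义?")
        rules['address_code'] = {'start': 0, 'end': 6, 'desc': addr_resp}

        # 3. Date-of-birth code rule
        birth_resp = self.get_standard_info("出生日期码的位置和格式?")
        rules['birth_code'] = {'start': 6, 'end': 14, 'desc': birth_resp}

        # 4. Sequence code rule
        seq_resp = self.get_standard_info("顺序码的位置和含义?")
        rules['seq_code'] = {'start': 14, 'end': 17, 'desc': seq_resp}

        # 5. Check code rule
        check_resp = self.get_standard_info("校验码的计算规则?")
        rules['check_code'] = {
            'start': 17,
            'end': 18,
            'desc': check_resp,
            # Weighting factors from the standard
            'factors': [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2],
        }
        return rules


# Usage example
if __name__ == "__main__":
    rag = IdCardStandardRAG("GB11643-1999.pdf")  # replace with the real document path
    print("提取的国标规则:", rag.extract_rules())  # "Rules extracted from the standard:"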
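The check-character rule that `extract_rules` stores under `check_code` can be sanity-checked without the knowledge base. The following is a minimal sketch, assuming the weighting factors listed above and the remainder-to-character table `10X98765432` that the add-in code also uses. The names `compute_check_char` and `is_valid_id` are illustrative and not part of the original code.

```python
# Weighting factors for positions 1-17 (same list as rules['check_code']['factors'])
FACTORS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
# Check character for each remainder 0..10 of the weighted sum modulo 11
CHECK_CHARS = "10X98765432"


def compute_check_char(first17: str) -> str:
    """Return the check character for the first 17 digits of an ID number."""
    if len(first17) != 17 or not (first17.isascii() and first17.isdigit()):
        raise ValueError("expected exactly 17 ASCII digits")
    total = sum(int(d) * w for d, w in zip(first17, FACTORS))
    return CHECK_CHARS[total % 11]


def is_valid_id(id_number: str) -> bool:
    """Check length, digits, and the check character (a lowercase 'x' is accepted)."""
    if len(id_number) != 18:
        return False
    try:
        return compute_check_char(id_number[:17]) == id_number[17].upper()
    except ValueError:
        return False
```

For the commonly quoted sample number 11010519491231002X, the weighted sum of the first 17 digits is 167, 167 mod 11 is 2, and position 2 of the table is X, so `is_valid_id` accepts it.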
2. VSTO Add-in (C#: an Excel add-in that parses ID numbers)
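// ThisAddIn.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json;
using Excel = Microsoft.Office.Interop.Excel;
using Microsoft.Office.Tools.Excel;

namespace IdCardParserAddIn
{
    public partial class ThisAddIn
    {
        private const string RAG_API_URL = "http://localhost:5000/get_rules"; // local RAG service address
        private dynamic _standardRules; // rules fetched from the knowledge base

        private void ThisAddIn_Startup(object sender, EventArgs e)
        {
            // Load the rules at startup. Task.Run keeps the awaits off the UI
            // synchronization context, so the blocking Wait() cannot deadlock.
            try
            {
                Task.Run(() => LoadStandardRulesAsync()).Wait();
            }
            catch (AggregateException)
            {
                _standardRules = null; // service unreachable: every ID is then rejected
            }
            // Add the parse button whenever a workbook is opened
            this.Application.WorkbookOpen += Application_WorkbookOpen;
        }

        private async Task LoadStandardRulesAsync()
        {
            using (var client = new HttpClient())
            {
                var response = await client.GetAsync(RAG_API_URL);
                if (response.IsSuccessStatusCode)
                {
                    string json = await response.Content.ReadAsStringAsync();
                    _standardRules = JsonConvert.DeserializeObject(json);
                }
            }
        }

        private void Application_WorkbookOpen(Excel.Workbook Wb)
        {
            // Controls lives on the VSTO host item, not on the interop worksheet
            var sheet = Wb.ActiveSheet as Excel.Worksheet;
            if (sheet == null) return;
            var vstoSheet = Globals.Factory.GetVstoObject(sheet);
            // The fifth argument is the control name; the caption is set through Text
            var button = vstoSheet.Controls.AddButton(10, 10, 100, 30, "parseIdCardButton");
            button.Text = "解析身份证"; // "Parse ID card"
            button.Click += Button_Click;
        }

        private void Button_Click(object sender, EventArgs e)
        {
            var selectedRange = this.Application.Selection as Excel.Range;
            if (selectedRange == null) return;
            foreach (Excel.Range cell in selectedRange.Cells)
            {
                // IDs must be stored as text: Excel keeps only 15 significant digits for numbers
                object value = cell.Value2;
                if (value == null) continue;
                string idCard = value.ToString().Trim();
                if (!IsValidIdCard(idCard)) continue;

                var result = ParseIdCard(idCard);
                // Write the parsed fields to the cells on the right
                cell.Offset[0, 1].Value2 = result["address"];
                cell.Offset[0, 2].Value2 = result["birthdate"];
                cell.Offset[0, 3].Value2 = result["gender"];
                cell.Offset[0, 4].Value2 = result["validity"];
            }
        }

        private bool IsValidIdCard(string idCard)
        {
            if ((object)_standardRules == null || string.IsNullOrEmpty(idCard)) return false;
            // Length check, driven by the rule from the standard
            if (idCard.Length != (int)_standardRules.length || idCard.Length < 18) return false;
            // The first 17 characters must all be digits
            if (!idCard.Take(17).All(c => c >= '0' && c <= '9')) return false;

            // Check-code verification using the weighting factors from the standard
            int sum = 0;
            for (int i = 0; i < 17; i++)
            {
                sum += (idCard[i] - '0') * (int)_standardRules.check_code.factors[i];
            }
            string checkCode = "10X98765432".Substring(sum % 11, 1);
            return idCard.Substring(17, 1).ToUpper() == checkCode;
        }

        // Returns a dictionary so that Button_Click can index the fields by name
        private Dictionary<string, string> ParseIdCard(string idCard)
        {
            string addressCode = idCard.Substring((int)_standardRules.address_code.start, 6);
            string addressDesc = Convert.ToString((object)_standardRules.address_code.desc);
            string birth = idCard.Substring((int)_standardRules.birth_code.start, 8)
                                 .Insert(6, "-").Insert(4, "-");
            // Sequence code: odd means male (男), even means female (女)
            bool isMale = int.Parse(idCard.Substring(14, 3)) % 2 == 1;
            return new Dictionary<string, string>
            {
                ["address"] = $"地址码:{addressCode}(规则:{addressDesc})",
                ["birthdate"] = $"出生日期:{birth}",
                ["gender"] = $"性别:{(isMale ? "男" : "女")}",
                ["validity"] = IsValidIdCard(idCard) ? "有效" : "无效"
            };
        }

        private void ThisAddIn_Shutdown(object sender, EventArgs e) { }

        #region VSTO generated code
        private void InternalStartup()
        {
            this.Startup += new EventHandler(ThisAddIn_Startup);
            this.Shutdown += new EventHandler(ThisAddIn_Shutdown);
        }
        #endregion
    }
}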
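3. Deployment Notes
Knowledge base deployment:
- Place the GB11643-1999 document (PDF/Word) in a path that RAGflow can access
- Start a Python service as the middle layer (Flask or FastAPI can be used to expose the /get_rules endpoint)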
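The middle layer can be very small. Below is a minimal sketch that answers `/get_rules` on port 5000, the address the add-in expects. It uses Python's standard-library `http.server` in place of Flask/FastAPI purely to stay dependency-free, and it returns a hard-coded `DEFAULT_RULES` dictionary shaped like the output of `extract_rules`; in a real deployment that dictionary would presumably come from `IdCardStandardRAG.extract_rules()`. The names `DEFAULT_RULES`, `rules_response`, `RulesHandler`, and `make_server` are assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for IdCardStandardRAG.extract_rules(): same keys, placeholder descriptions.
DEFAULT_RULES = {
    "length": 18,
    "address_code": {"start": 0, "end": 6, "desc": "address code"},
    "birth_code": {"start": 6, "end": 14, "desc": "date of birth, YYYYMMDD"},
    "seq_code": {"start": 14, "end": 17, "desc": "sequence code"},
    "check_code": {
        "start": 17,
        "end": 18,
        "desc": "check character",
        "factors": [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2],
    },
}


def rules_response(path):
    """Return (status, body bytes) for a request path; only /get_rules is served."""
    if path.split("?", 1)[0] != "/get_rules":
        return 404, b'{"error": "unknown endpoint"}'
    return 200, json.dumps(DEFAULT_RULES, ensure_ascii=False).encode("utf-8")


class RulesHandler(BaseHTTPRequestHandler):
    """Serves the rules as JSON on GET requests."""

    def do_GET(self):
        status, body = rules_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet; delete this method to see request logs


def make_server(host="127.0.0.1", port=5000):
    """Create (but do not start) the HTTP server."""
    return HTTPServer((host, port), RulesHandler)
```

Calling `make_server().serve_forever()` starts the service. Keeping the routing in the plain function `rules_response` means the payload can be checked without opening a socket.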
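VSTO add-in configuration:
- Create an Excel VSTO project in Visual Studio and add the code above
- Reference Newtonsoft.Json to handle JSON serialization
- Make sure the RAG service address matches RAG_API_URL in the add-in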
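Before opening Excel, it can help to confirm that the service really returns what the add-in will read. The sketch below is one hypothetical way to do that: `validate_rules` checks a decoded payload for the fields the C# code accesses (`length`, `address_code.start`, `birth_code.start`, `check_code.factors`), and `fetch_rules` downloads the payload from a URL that is assumed to equal `RAG_API_URL`. Both function names are illustrative.

```python
import json
import urllib.request


def validate_rules(rules):
    """Return a list of problems found in a rules payload (empty list means OK)."""
    if not isinstance(rules, dict):
        return ["payload is not a JSON object"]
    problems = []
    if not isinstance(rules.get("length"), int):
        problems.append("'length' is missing or not an integer")
    for key in ("address_code", "birth_code"):
        section = rules.get(key)
        if not isinstance(section, dict) or not isinstance(section.get("start"), int):
            problems.append(f"'{key}.start' is missing or not an integer")
    check = rules.get("check_code")
    factors = check.get("factors") if isinstance(check, dict) else None
    if not (isinstance(factors, list) and len(factors) == 17
            and all(isinstance(f, int) for f in factors)):
        problems.append("'check_code.factors' must be a list of 17 integers")
    return problems


def fetch_rules(url="http://localhost:5000/get_rules", timeout=5):
    """Download and decode the rules payload from the service."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `validate_rules(fetch_rules())` returns an empty list when every field the add-in needs is present.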
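Workflow:
- On startup, the add-in fetches the rules of the standard from the RAG service
- The user selects the cells that contain ID numbers in Excel and clicks the "解析身份证" ("Parse ID card") button
- The add-in parses each number according to the rules (address code, date of birth, gender, and so on) and writes the results to the adjacent cells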
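The parsing step in this workflow can be prototyped outside Excel. The following sketch mirrors what `ParseIdCard` does, using the segment positions from the rules (address code in positions 1-6, date of birth in 7-14, sequence code in 15-17). It additionally rejects impossible calendar dates, which the add-in code does not do, and it deliberately leaves check-character verification out. `parse_id_number` and the keys of the returned dictionary are names chosen for this example.

```python
from datetime import datetime


def parse_id_number(id_number: str) -> dict:
    """Split an 18-character ID number into its segments and derive the gender."""
    head = id_number[:17]
    if len(id_number) != 18 or not (head.isascii() and head.isdigit()):
        raise ValueError("expected 17 digits followed by a check character")
    # strptime raises ValueError for impossible dates such as 19990231
    birth = datetime.strptime(id_number[6:14], "%Y%m%d").date()
    sequence = id_number[14:17]
    return {
        "address_code": id_number[0:6],
        "birthdate": birth.isoformat(),
        "sequence_code": sequence,
        # Odd sequence codes are assigned to males, even ones to females
        "gender": "male" if int(sequence) % 2 == 1 else "female",
        "check_char": id_number[17].upper(),
    }
```

For the sample number 11010519491231002X this yields address code 110105, date of birth 1949-12-31, sequence code 002, and gender female.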
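In this way, the rules of a national standard from any field can in principle become the "knowledge base" behind an automation tool, which helps standardize and automate business processes and keep them compliant with the standard.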