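Convert a DXF file to an image and feed it to a VLM to check the drawing, for example whether any dimensions are missing.
Below is a complete Python approach for converting a DXF file to an image and feeding it to a vision-language model (VLM) for drawing checks (for example, spotting missing dimensions).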
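We will use:
- ezdxf: parse the DXF file
- matplotlib or opencv + Pillow: render the drawing to an image
- transformers + PIL: call the VLM (LLaVA as the example)
- Regular expressions: post-process the output and extract dimension values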
✅ Step 1: Install dependencies
bash
pip install ezdxf matplotlib pillow transformers torch gradio
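Note: LLaVA needs a lot of GPU memory (a GPU with 12 GB or more is recommended). Without a GPU, switch to a smaller model or call a cloud VLM through an API (for example Qwen-VL or MiniCPM-V).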
✅ Step 2: DXF to image (Python script)
python
# dxf_to_image.py
import ezdxf
import numpy as np
from matplotlib import pyplot as plt
from matplotlib.patches import Arc


def dxf_to_image(dxf_path, output_image_path, dpi=300):
    """Render a DXF file to a high-resolution PNG image."""
    doc = ezdxf.readfile(dxf_path)
    msp = doc.modelspace()

    # Collect points that bound the supported entities
    points = []
    for entity in msp:
        kind = entity.dxftype()
        if kind == 'LINE':
            points.append((entity.dxf.start.x, entity.dxf.start.y))
            points.append((entity.dxf.end.x, entity.dxf.end.y))
        elif kind in ('CIRCLE', 'ARC'):
            c, r = entity.dxf.center, entity.dxf.radius
            points.append((c.x - r, c.y - r))
            points.append((c.x + r, c.y + r))
    if not points:
        raise ValueError("No supported geometry found.")

    points = np.array(points)
    minx, miny = points.min(axis=0)
    maxx, maxy = points.max(axis=0)

    # Add a 5% margin (fall back to 1 unit for degenerate drawings)
    margin = max(maxx - minx, maxy - miny) * 0.05 or 1.0
    minx -= margin; maxx += margin
    miny -= margin; maxy += margin

    # figsize is in inches, so fix the width and scale the height
    # instead of using drawing units directly
    width_in = 10
    fig, ax = plt.subplots(figsize=(width_in, width_in * (maxy - miny) / (maxx - minx)))
    ax.set_xlim(minx, maxx)
    ax.set_ylim(miny, maxy)
    ax.set_aspect('equal')
    ax.axis('off')  # hide the axes

    # Draw the entities
    for entity in msp:
        kind = entity.dxftype()
        if kind == 'LINE':
            start, end = entity.dxf.start, entity.dxf.end
            ax.plot([start.x, end.x], [start.y, end.y], color='black', linewidth=0.5)
        elif kind == 'CIRCLE':
            c = entity.dxf.center
            ax.add_patch(plt.Circle((c.x, c.y), entity.dxf.radius,
                                    color='black', fill=False, linewidth=0.5))
        elif kind == 'ARC':
            # DXF arcs run counter-clockwise from start_angle to end_angle (degrees)
            c, r = entity.dxf.center, entity.dxf.radius
            ax.add_patch(Arc((c.x, c.y), 2 * r, 2 * r,
                             theta1=entity.dxf.start_angle, theta2=entity.dxf.end_angle,
                             color='black', linewidth=0.5))
        # NOTE: DIMENSION and all other entity types are not drawn by this
        # simple renderer, so dimension text will be absent from the image.

    # Save as a high-resolution image
    fig.savefig(output_image_path, dpi=dpi, bbox_inches='tight', pad_inches=0, facecolor='white')
    plt.close(fig)
    print(f"Saved image to {output_image_path}")
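The hand-written renderer above draws only a few basic entity types and skips DIMENSION entities, so the VLM would never see the dimension text it is supposed to check. One possible alternative, sketched here on the assumption that a recent ezdxf (with its `drawing` add-on) and matplotlib are installed, is to let ezdxf render the whole modelspace. `render_dxf` is a hypothetical helper name, not part of the original scripts.

```python
def render_dxf(dxf_path, png_path, dpi=300):
    """Render the modelspace of a DXF file to a PNG via ezdxf's drawing add-on."""
    # Imported inside the function so this module still loads on machines
    # where the rendering stack is not installed.
    import ezdxf
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt
    from ezdxf.addons.drawing import Frontend, RenderContext
    from ezdxf.addons.drawing.matplotlib import MatplotlibBackend

    doc = ezdxf.readfile(dxf_path)
    fig = plt.figure()
    ax = fig.add_axes([0, 0, 1, 1])  # axes fill the whole figure
    # The frontend walks the layout and draws every supported entity,
    # including the geometry that belongs to DIMENSION entities.
    Frontend(RenderContext(doc), MatplotlibBackend(ax)).draw_layout(
        doc.modelspace(), finalize=True
    )
    # Colors follow ezdxf's defaults, which may be a dark CAD-style background.
    fig.savefig(png_path, dpi=dpi)
    plt.close(fig)
    return png_path
```

It takes the same kind of arguments as `dxf_to_image`, for example `render_dxf("test.dxf", "output_drawing.png")`.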
✅ Step 3: Load the VLM and run inference (using LLaVA)
python
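# vlm_inference.py
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# LLaVA-1.5-7b (lightweight, suitable for consumer GPUs)
model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).cuda()


def check_dimensions(image_path):
    """Take an image path and return the VLM's text answer."""
    question = (
        "You are an expert reviewer of mechanical drawings. Carefully check the dimension annotations in this engineering drawing.\n"
        "Question: are any dimensions missing? List every dimension annotation you can see (for example X100, R25, φ10) and point out key dimensions that may be missing.\n"
        "Note: pay attention to hole positions, overall length and width, fillet radii, symmetric features and other common annotation locations.\n"
        "Answer format:\n"
        "Found dimensions: X100, R25, φ8\n"
        "Possibly missing: overall length, hole spacing, chamfer size"
    )
    # LLaVA-1.5 expects this chat template, including the <image> placeholder
    prompt = f"USER: <image>\n{question} ASSISTANT:"
    raw_image = Image.open(image_path).convert("RGB")
    # Keyword arguments avoid depending on the processor's positional order
    inputs = processor(images=raw_image, text=prompt, return_tensors="pt").to("cuda", torch.float16)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=200)
    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return processor.decode(new_tokens, skip_special_tokens=True).strip()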
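LLaVA-1.5 feeds its vision encoder a fairly small input (336×336 pixels), so small dimension text on a large sheet can become unreadable once the whole drawing is downscaled. One way to "zoom in" is to cut the rendered image into overlapping tiles and query each tile separately. The sketch below only computes the crop boxes; `tile_boxes` and its default sizes are illustrative assumptions, not tuned values.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) crop boxes that cover the image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than tile")
    step = tile - overlap

    def starts(size):
        # Tile origins along one axis; the last tile is shifted back so it
        # ends exactly at the image border.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]
```

With Pillow, each box can be passed to `Image.crop(box)`, and the crop can then be saved and sent through `check_dimensions`. The overlap keeps a dimension that straddles a tile border fully visible in at least one tile, as long as it is smaller than the overlap.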
✅ Step 4: Post-processing: extract dimensions & compare against rules
python
# postprocess.py
import re


def extract_dimensions(text):
    """Extract dimension values from text such as X100, R25, φ10 or 50mm."""
    patterns = [
        r'X(\d+\.?\d*)',      # X100
        r'R(\d+\.?\d*)',      # R25
        r'φ(\d+\.?\d*)',      # φ10
        r'⌀(\d+\.?\d*)',      # also accept the ⌀ symbol
        r'(\d+\.?\d*)mm',     # 50mm
        r'(\d+\.?\d*)\s*×',   # 100×50 (captures the first value)
    ]
    dims = []
    for pattern in patterns:
        dims.extend(re.findall(pattern, text, re.IGNORECASE))
    # Only the numeric part is kept (R25 -> "25"); deduplicate and sort
    return sorted(set(dims))


def compare_with_golden(expected_dims, detected_dims):
    """Compare the expected dimensions with the detected ones.

    expected_dims: set of numeric strings, e.g. {"100", "25", "8"}
    """
    missing = expected_dims - set(detected_dims)
    extra = set(detected_dims) - expected_dims
    return sorted(missing), sorted(extra)
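Running the regular expressions over the whole answer also picks up numbers from free-form sentences or from the "possibly missing" line. Since the prompt asks for a fixed two-line answer format, one option is to split the answer into its two sections first. This is a minimal sketch; `parse_answer` is a hypothetical helper, and it assumes the model actually follows the requested format (it accepts both the English labels and the original Chinese ones).

```python
import re

# Section labels requested in the prompt (English and Chinese variants)
_FOUND = r"(?:Found dimensions|发现尺寸)"
_MISSING = r"(?:Possibly missing|可能遗漏)"


def parse_answer(text):
    """Split a VLM answer into 'found' and 'missing' item lists."""

    def items(label):
        # Capture the rest of the line after "label:" (ASCII or full-width colon)
        match = re.search(label + r"\s*[::]\s*(.*)", text, re.IGNORECASE)
        if not match:
            return []
        parts = re.split(r"[,,、;;]", match.group(1))
        return [p.strip() for p in parts if p.strip()]

    return {"found": items(_FOUND), "missing": items(_MISSING)}
```

The `found` list can then be joined and passed to `extract_dimensions`, while `missing` is reported as is. If the model ignores the format, both lists come back empty, which is itself a useful signal to retry or fall back to the plain regex pass.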
✅ Step 5: Put the main pipeline together
python
# main.py
from dxf_to_image import dxf_to_image
from vlm_inference import check_dimensions
from postprocess import extract_dimensions, compare_with_golden


def main(dxf_path, expected_dims=None):
    img_path = "output_drawing.png"

    # 1. DXF → image
    dxf_to_image(dxf_path, img_path, dpi=300)

    # 2. VLM inference
    response = check_dimensions(img_path)
    print("VLM answer:")
    print(response)

    # 3. Post-processing
    detected = extract_dimensions(response)
    print("\nDetected dimension values:", detected)

    # 4. Comparison (optional)
    if expected_dims:
        missing, extra = compare_with_golden(set(expected_dims), detected)
        print("\nPossibly missing:", missing)
        print("Extra annotations:", extra)
        return missing, response
    return None, response


# Usage example
if __name__ == "__main__":
    dxf_file = "test.dxf"  # replace with your file
    # Optional: dimensions known to be required. extract_dimensions() returns
    # bare numbers, so list "25" and "8" rather than "R25" and "φ8".
    expected = {"100", "50", "25", "8"}
    main(dxf_file, expected)
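A cheap sanity check on the VLM's answer is to read the dimensions that are actually stored in the DXF and compare their count and values with what the model reports. In the DXF format, a DIMENSION entity may carry its measured value under group code 42 (the value is optional). The sketch below reads an ASCII DXF with the standard library only; `dimension_measurements` is a hypothetical helper, and binary DXF or DIMENSION entities nested in blocks are not handled. With ezdxf, the same entities can be reached through `msp.query("DIMENSION")`.

```python
from pathlib import Path


def dimension_measurements(dxf_path):
    """List the stored measurement of every DIMENSION entity in the ENTITIES
    section of an ASCII DXF file (None where group code 42 is absent)."""
    lines = Path(dxf_path).read_text(errors="ignore").splitlines()
    # ASCII DXF is a flat sequence of (group code, value) line pairs.
    tags = [(c.strip(), v.strip()) for c, v in zip(lines[0::2], lines[1::2])]

    results = []
    in_entities = in_dimension = False
    previous = None
    for code, value in tags:
        if code == "0":
            # Group code 0 starts a new entity or a structural marker.
            if value == "ENDSEC":
                in_entities = False
            in_dimension = in_entities and value == "DIMENSION"
            if in_dimension:
                results.append(None)
        elif code == "2" and previous == ("0", "SECTION"):
            # The tag right after "0 / SECTION" names the section.
            in_entities = value == "ENTITIES"
        elif code == "42" and in_dimension:
            results[-1] = float(value)
        previous = (code, value)
    return results
```

If the list is empty, the drawing has no DIMENSION entities at all and can be flagged without calling the VLM. If the VLM reports far fewer dimensions than the file contains, the rendering or the image resolution is the more likely culprit.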
✅ Visual report (optional)
You can use gradio to quickly put together a web UI:
python
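import gradio as gr
from main import main


def analyze_dxf(file):
    # Newer gradio versions pass a file path string; older ones pass an object with .name
    path = file if isinstance(file, str) else file.name
    _, response = main(path)
    return response


interface = gr.Interface(
    fn=analyze_dxf,
    inputs=gr.File(label="Upload DXF file"),
    outputs=gr.Textbox(label="Review result"),
    title="DXF drawing dimension checker",
)
interface.launch()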
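🛠️ Optimization suggestions
Problem | Solution |
---|---|
Incomplete DXF rendering | Use CADView or OdaFileConverter (ODA library) to pre-convert to PDF/SVG, then rasterize |
Inaccurate VLM recognition | Enhance line contrast with OpenCV first, or add a "zoom into local regions" instruction to the prompt |
Not enough GPU memory | Switch to MiniCPM-V-2B (supports 4-bit quantization, usable with 6 GB of GPU memory) |
Batch processing | Walk the folder with os.listdir() and generate a CSV report for the whole batch |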
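The batch-processing row above can be sketched as follows. The checker is passed in as a function so the loop itself has no GPU or model dependency; `batch_check` and the CSV column names are illustrative assumptions, and `pathlib` is used here in place of `os.listdir()`.

```python
import csv
from pathlib import Path


def batch_check(folder, report_csv, check):
    """Run `check(dxf_path) -> str` on every .dxf file in `folder` and write
    one CSV row per file. Returns the number of files processed."""
    files = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".dxf")
    with open(report_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "status", "result"])
        for path in files:
            try:
                writer.writerow([path.name, "ok", check(str(path))])
            except Exception as exc:  # keep going if one drawing fails
                writer.writerow([path.name, "error", str(exc)])
    return len(files)
```

With the pipeline from step 5, a call could look like `batch_check("drawings", "report.csv", lambda p: main(p)[1])`, which stores the VLM answer for each drawing.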
🔗 Recommended models (HuggingFace)
- LLaVA-1.5-7b: https://huggingface.co/llava-hf/llava-1.5-7b-hf
- MiniCPM-V-2: https://huggingface.co/openbmb/MiniCPM-V-2
- Qwen-VL-Max (API): https://help.aliyun.com/zh/dashscope/developer-reference
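With this Python pipeline you can:
✅ Automate the DXF → image → VLM review flow
✅ Output suggestions for dimensions that may be missing
✅ Extend it into an automated quality-inspection system (integrated into PLM/PDM)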
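For more advanced features (such as locating the coordinates where a dimension is missing), you can combine OCR with an object-detection model (such as YOLO-NAS) and train them jointly. Feel free to ask follow-up questions.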