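Accessing a VL multimodal model deployed on MindIE from Python
Overview
Today I deployed the qwen2_7b_vl model with MindIE 1.0 and ran into a few problems while testing it. This post summarizes them.
Problem 1: transformers version too old
Error message: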
[ERROR] [model_deploy_config.cpp:159] Failed to get vocab size from tokenizer wrapper with exception: ModuleNotFoundError: No module named 'transformers.models.qwen2_vl'
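Analysis: qwen2_7b_vl requires transformers 4.46.0 or later, but I had 4.44 installed. The transformers version declared in config.json in the model directory appears to be 4.41, which is clearly wrong.
Fix (the first command only switches pip to the Tsinghua mirror and is optional):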
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install transformers==4.46.0
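Before restarting the service, it can help to confirm that the environment MindIE runs in actually picked up the new version. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `transformers_status` are made up for this example, and the default of 4.46.0 is simply the version mentioned above.

```python
import importlib.util
from importlib import metadata


def parse_version(version):
    """Turn '4.46.0' into (4, 46, 0); stop at non-numeric parts such as 'dev0'."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad so that '4.46' compares equal to '4.46.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def transformers_status(required="4.46.0"):
    """Return (installed_version_or_None, meets_requirement, has_qwen2_vl_module)."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None, False, False
    meets = parse_version(installed) >= parse_version(required)
    try:
        # This is the module named in the ModuleNotFoundError above.
        spec = importlib.util.find_spec("transformers.models.qwen2_vl")
        has_module = spec is not None
    except ImportError:
        has_module = False
    return installed, meets, has_module


if __name__ == "__main__":
    print(transformers_status())
```

If the third element of the result is False, the tokenizer wrapper will most likely fail in the same way as in the log above.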
Problem 2: MindIE's VL request format is not compatible with the OpenAI API
Error message:
Invalid base64 url
The standard OpenAI API passes an image like this:
"content": [
    {"type": "text", "text": text},
    {
        "type": "image_url",
        "image_url": {
            "url": f"data:image/jpeg;base64,{base64_str}"
        }
    }
]
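MindIE, by contrast, expects image_url to be a plain string holding the bare base64 data, with no nested url object and no data: prefix:
"content": [
    {"type": "text", "text": text},
    {
        "type": "image_url",
        "image_url": f"{base64_str}"
    }
]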
Problem 3: input exceeds the maximum token length
Error message:
This model's maximum input ids length cannot be greater than 2048,the input ids length is 2831
Analysis: the image and the text together produce too many tokens (2831 against a limit of 2048). Increase maxSeqLen, maxInputTokenLen, and maxIterTimes in MindIE's config.json.
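The three settings can be edited by hand, or with a short script like the sketch below. It makes no assumption about where the keys sit inside config.json and simply updates them wherever they appear. The helper names and the numbers in `NEW_LIMITS` are placeholders chosen for illustration, not recommended values; pick limits that fit your device memory.

```python
import json


def raise_limits(node, limits):
    """Recursively set every key named in `limits`; return how many were set."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in limits:
                node[key] = limits[key]
                changed += 1
            else:
                changed += raise_limits(value, limits)
    elif isinstance(node, list):
        for item in node:
            changed += raise_limits(item, limits)
    return changed


def update_config_file(path, limits):
    """Rewrite the JSON file at `path` with the new limits."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    changed = raise_limits(config, limits)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4, ensure_ascii=False)
    return changed


# Placeholder values for illustration only.
NEW_LIMITS = {"maxSeqLen": 8192, "maxInputTokenLen": 4096, "maxIterTimes": 4096}
```

Calling `update_config_file(path, NEW_LIMITS)` with the path of your MindIE config.json returns the number of keys it changed; a result of 0 means none of the three names were found. Back up the file first.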
Full code
import base64
import time

import requests


def test_multimodal_model(image_path, text, model_url, model_name):
    # Encode the image as a base64 string
    with open(image_path, "rb") as image_file:
        encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
    # Build the request payload
    payload = {
        "model": model_name,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    {
                        "type": "image_url",
                        # Bare base64 string, without the "data:image/jpeg;base64," prefix
                        "image_url": encoded_image  # f"/path/to/pic.jpg"
                    }
                ]
            }
        ],
        "max_tokens": 300
    }
    # Send the request
    headers = {
        "Content-Type": "application/json"
    }
    start_time = time.time()  # record the start time
    response = requests.post(
        model_url,
        headers=headers,
        json=payload
    )
    end_time = time.time()  # record the end time
    print(f"Request took {end_time - start_time:.2f} s")
    # Return the raw response body on success
    if response.status_code == 200:
        return response.text
    else:
        raise RuntimeError(f"API request failed: {response.status_code}, {response.text}")


# Run the test
if __name__ == "__main__":
    name = "llm_model"
    model_url = "http://xx.xx.xx.xx:xx/v1/chat/completions"
    pic_path = "./huochepiao.jpg"
    text = "请描述图片内容"  # "Please describe the image"
    try:
        result = test_multimodal_model(pic_path, text, model_url, name)
        print(f"result:>>>{result}")
    except Exception as e:
        print(f"Error: {e}")
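The function above returns the raw response body. Because the endpoint is /v1/chat/completions, the body is presumably shaped like an OpenAI chat completion, in which case the model's answer can be pulled out as sketched below. The `choices[0].message.content` layout is an assumption based on the OpenAI format, and `extract_reply` is a hypothetical helper.

```python
import json


def extract_reply(response_text):
    """Return the assistant text from an OpenAI-style chat completion body."""
    body = json.loads(response_text)
    choices = body.get("choices") or []
    if not choices:
        # Show the start of the body so an unexpected format is easy to inspect.
        raise ValueError(f"no choices in response: {response_text[:200]}")
    return choices[0]["message"]["content"]
```

In the script above this would be used as `print(extract_reply(result))` in place of printing the whole body.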