
Shanghai AI Laboratory Open-Sources Intern-S1-mini: A Lightweight Multimodal Reasoning Model Built with the Same Technology as Intern-S1


Introduction

We introduce Intern-S1-mini, a lightweight open-source multimodal reasoning model built with the same technology as Intern-S1. The model combines an 8B-parameter dense language model (Qwen3) with a 0.3B-parameter vision encoder (InternViT), continually pre-trained on 5 trillion tokens of multimodal data, of which more than 2.5 trillion come from scientific domains. As a result, it retains strong general capabilities while excelling at specialized scientific tasks such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making it a capable assistant for real-world research applications.

Features

  • Strong performance on language and vision reasoning benchmarks, especially scientific tasks.

  • Continually pre-trained on a 5-trillion-token dataset, over 50% of which is specialized scientific data, giving the model deep domain knowledge.

  • A dynamic tokenizer enables native parsing of molecular formulae and protein sequences (see the sketch after this list).
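
As a hedged illustration of the dynamic tokenizer, the sketch below inspects how a SMILES string and a protein sequence are segmented. The input strings are arbitrary examples, and it is an assumption that loading via AutoTokenizer with trust_remote_code picks up the model's custom tokenizer.

from transformers import AutoTokenizer

# Assumption: the model repo ships the custom dynamic tokenizer,
# loaded here through trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained("internlm/Intern-S1-mini", trust_remote_code=True)

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"            # aspirin as a SMILES string (example input)
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary amino-acid sequence (example input)

print(tokenizer.tokenize(smiles))   # how the molecular string is segmented
print(tokenizer.tokenize(protein))  # how the protein sequence is segmented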

Performance

We evaluated Intern-S1-mini on a variety of benchmarks, including both general and scientific datasets. Below is a performance comparison with recent vision-language models and large language models.

| Category   | Benchmark      | Intern-S1-mini | Qwen3-8B | GLM-4.1V | MiMo-VL-7B-RL-2508 |
|------------|----------------|----------------|----------|----------|--------------------|
| General    | MMLU-Pro       | 74.78          | 73.7     | 57.1     | 73.93              |
|            | MMMU           | 72.33          | N/A      | 69.9     | 70.4               |
|            | MMStar         | 65.2           | N/A      | 71.5     | 72.9               |
|            | GPQA           | 65.15          | 62       | 50.32    | 60.35              |
|            | AIME2024       | 84.58          | 76       | 36.2     | 72.6               |
|            | AIME2025       | 80             | 67.3     | 32       | 64.4               |
|            | MathVision     | 51.41          | N/A      | 53.9     | 54.5               |
|            | MathVista      | 70.3           | N/A      | 80.7     | 79.4               |
|            | IFEval         | 81.15          | 85       | 71.53    | 71.4               |
| Scientific | SFE            | 35.84          | N/A      | 43.2     | 43.9               |
|            | Physics        | 28.76          | N/A      | 28.3     | 28.2               |
|            | SmolInstruct   | 32.2           | 17.6     | 18.1     | 16.11              |
|            | ChemBench      | 76.47          | 61.1     | 56.2     | 66.78              |
|            | MatBench       | 61.55          | 45.24    | 54.3     | 46.9               |
|            | MicroVQA       | 56.62          | N/A      | 50.2     | 50.96              |
|            | ProteinLMBench | 58.47          | 59.1     | 58.3     | 59.8               |
|            | MSEarthMCQ     | 58.12          | N/A      | 50.3     | 47.3               |
|            | XLRS-Bench     | 51.63          | N/A      | 49.8     | 12.29              |

We use OpenCompass and VLMEvalKit to evaluate all models.

Quick Start

Sampling Parameters

We recommend the following hyperparameters for better results:

top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.8
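
As a minimal sketch of where these values plug in when using Transformers (assuming model and inputs are set up as in the demo code below), note that sampling must be enabled for them to take effect:

# Sketch: applying the recommended sampling hyperparameters.
# Assumes `model` and `inputs` are defined as in the demo code below.
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,     # enable sampling so the parameters below apply
    top_p=1.0,
    top_k=50,
    min_p=0.0,
    temperature=0.8,
)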

Transformers

The demo code below shows how to generate content from text and multimodal inputs.

Please use transformers>=4.55.2 to make sure the model works correctly.

Text input

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Image input

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
Video input

Please make sure the decord video decoding library is installed via pip install decord. To avoid running out of memory, install flash_attention and use at least 2 GPUs.

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)

Serving

The minimum hardware requirements for deploying Intern-S1 series models are:

| Model                       | A100 (GPUs) | H800 (GPUs) | H100 (GPUs) | H200 (GPUs) |
|-----------------------------|-------------|-------------|-------------|-------------|
| internlm/Intern-S1-mini     | 1           | 1           | 1           | 1           |
| internlm/Intern-S1-mini-FP8 | -           | 1           | 1           | 1           |

You can create an OpenAI-compatible server with one of the following LLM inference frameworks:

lmdeploy (>=0.9.2)

lmdeploy serve api_server internlm/Intern-S1-mini --reasoning-parser intern-s1 --tool-call-parser intern-s1

vllm

vllm serve internlm/Intern-S1-mini --trust-remote-code

sglang

python3 -m sglang.launch_server \
    --model-path internlm/Intern-S1-mini \
    --trust-remote-code \
    --grammar-backend none
ollama for local deployment:
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch model
ollama pull internlm/interns1-mini
# run model
ollama run internlm/interns1-mini
# then use openai client to call on http://localhost:11434/v1
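
Since the local ollama server exposes an OpenAI-compatible endpoint, a minimal sketch of calling it with the openai Python client could look like this (the prompt is an arbitrary example; the model name matches the ollama pull above):

from openai import OpenAI

# The local ollama server speaks the OpenAI protocol at this address.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:11434/v1")

response = client.chat.completions.create(
    model="internlm/interns1-mini",  # the name used in `ollama pull` above
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
)
print(response.choices[0].message.content)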

Advanced Usage

Tool Calling

Today, many large language models (LLMs) offer tool calling, a powerful feature that lets them extend their capabilities by invoking external tools or APIs. With it, a model can fetch real-time information, run code, or call functions in other applications.

A notable advantage for developers is that a growing number of open-source LLMs are compatible with the OpenAI API standard, so you can implement tool calling on these models with the same syntax used with the OpenAI library. The code in this tutorial is therefore general: it works not only with OpenAI models but also with any model that follows the same interface standard.

Below, we walk through a concrete code example (based on the lmdeploy api server) showing how to use tool calling to fetch the latest weather forecast.

      
from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date


tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': ['location']
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'date': {
                    'type': 'string',
                    'description': 'The date to get the temperature for, in the format \'Year-Month-Day\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': ['location', 'date']
        }
    }
}]

messages = [
    {'role': 'user', 'content': 'Today is 2024-11-14, What\'s the temperature in San Francisco now? How about tomorrow?'}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message)
messages.append(response.choices[0].message)

for tool_call in response.choices[0].message.tool_calls:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id
    })

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message.content)

Switching Between Thinking and Non-Thinking Modes

Intern-S1-mini enables thinking mode by default to strengthen its reasoning and produce higher-quality responses. To disable it, set enable_thinking=False in tokenizer.apply_chat_template.

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # think mode indicator
)

When serving Intern-S1-mini with LMDeploy, you can control thinking mode by adjusting the enable_thinking parameter dynamically in each request.

from openai import OpenAI
import json

messages = [
    {
        'role': 'user',
        'content': 'who are you'
    }, {
        'role': 'assistant',
        'content': 'I am an AI'
    }, {
        'role': 'user',
        'content': 'AGI is?'
    }
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    }
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))

For vllm and sglang users, configure it as follows:

extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}
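
For completeness, here is a hedged sketch of a full request with that setting against a vllm or sglang server; the base URL and port (http://0.0.0.0:8000/v1) are assumptions that depend on how the server was launched.

from openai import OpenAI

# Assumption: the vllm/sglang server listens on its default port;
# adjust base_url to match your deployment.
client = OpenAI(api_key="EMPTY", base_url="http://0.0.0.0:8000/v1")
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "AGI is?"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)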