Shanghai AI Laboratory open-sources Intern-S1-mini, a lightweight multimodal reasoning model built on the same technology as Intern-S1
Introduction
We introduce Intern-S1-mini, a lightweight open-source multimodal reasoning model built on the same technology as Intern-S1. It couples an 8B-parameter dense language model (Qwen3) with a 0.3B-parameter vision encoder (InternViT), and has been continually pretrained on 5 trillion tokens of multimodal data, of which more than 2.5 trillion tokens come from scientific domains. As a result, the model retains strong general capabilities while excelling at specialized scientific tasks such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making it a capable assistant for real-world research applications.
Features
- Strong performance on language and vision reasoning benchmarks, especially scientific tasks.
- Continually pretrained on a 5-trillion-token dataset, over 50% of which is specialized scientific data, giving the model deep domain knowledge.
- A dynamic tokenizer enables native parsing of molecular formulas and protein sequences.
Performance
We evaluated Intern-S1-mini on a range of benchmarks covering both general and scientific datasets. The table below compares its performance with recent vision-language models and large language models.
| | Benchmark | Intern-S1-mini | Qwen3-8B | GLM-4.1V | MiMo-VL-7B-RL-2508 |
|---|---|---|---|---|---|
| General | MMLU-Pro | 74.78 | 73.7 | 57.1 | 73.93 |
| | MMMU | 72.33 | N/A | 69.9 | 70.4 |
| | MMStar | 65.2 | N/A | 71.5 | 72.9 |
| | GPQA | 65.15 | 62 | 50.32 | 60.35 |
| | AIME2024 | 84.58 | 76 | 36.2 | 72.6 |
| | AIME2025 | 80 | 67.3 | 32 | 64.4 |
| | MathVision | 51.41 | N/A | 53.9 | 54.5 |
| | MathVista | 70.3 | N/A | 80.7 | 79.4 |
| | IFEval | 81.15 | 85 | 71.53 | 71.4 |
| Scientific | SFE | 35.84 | N/A | 43.2 | 43.9 |
| | Physics | 28.76 | N/A | 28.3 | 28.2 |
| | SmolInstruct | 32.2 | 17.6 | 18.1 | 16.11 |
| | ChemBench | 76.47 | 61.1 | 56.2 | 66.78 |
| | MatBench | 61.55 | 45.24 | 54.3 | 46.9 |
| | MicroVQA | 56.62 | N/A | 50.2 | 50.96 |
| | ProteinLMBench | 58.47 | 59.1 | 58.3 | 59.8 |
| | MSEarthMCQ | 58.12 | N/A | 50.3 | 47.3 |
| | XLRS-Bench | 51.63 | N/A | 49.8 | 12.29 |
We use OpenCompass and VLMEvalKit to evaluate all models.
Quick Start
Sampling Parameters
We recommend the following hyperparameters for better results:
```
top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.8
```
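As a minimal, non-authoritative sketch, these values can be passed directly to transformers generation; the `model` and `inputs` variables below are assumed to be prepared as in the Transformers examples that follow.

```python
# Sketch only: apply the recommended sampling parameters when generating.
# Assumes `model` and `inputs` are prepared as in the Transformers examples below.
generate_ids = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,      # sampling must be enabled for top_p/top_k/min_p/temperature to take effect
    top_p=1.0,
    top_k=50,
    min_p=0.0,
    temperature=0.8,
)
```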
Transformers
Below are demo snippets showing how to generate content from text and multimodal inputs.
Please use `transformers>=4.55.2` to ensure the model works properly.
Text Input
```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
```
Image Input
```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
```
Video Input
Please make sure the decord video decoding library is installed via `pip install decord`. To avoid running out of memory, please install flash_attention and use at least 2 GPUs.
```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1-mini"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(decoded_output)
```
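The note above recommends installing flash_attention to avoid out-of-memory errors. As an unverified sketch (whether this model's remote code honors the flag has not been confirmed here), FlashAttention-2 can usually be requested at load time through the standard attn_implementation argument, assuming the flash-attn package is installed:

```python
# Sketch only: request FlashAttention-2 when loading the model (requires the flash-attn package).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
)
```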
Serving
The minimum hardware requirements for deploying the Intern-S1 series models are:

| Model | A100 (GPUs) | H800 (GPUs) | H100 (GPUs) | H200 (GPUs) |
|---|---|---|---|---|
| internlm/Intern-S1-mini | 1 | 1 | 1 | 1 |
| internlm/Intern-S1-mini-FP8 | - | 1 | 1 | 1 |
You can create an OpenAI-compatible server with one of the following LLM inference frameworks:
lmdeploy (>=0.9.2)

```bash
lmdeploy serve api_server internlm/Intern-S1-mini --reasoning-parser intern-s1 --tool-call-parser intern-s1
```
vllm

```bash
vllm serve internlm/Intern-S1-mini --trust-remote-code
```
sglang

```bash
python3 -m sglang.launch_server \
    --model-path internlm/Intern-S1-mini \
    --trust-remote-code \
    --grammar-backend none
```
ollama for local deployment:
```bash
# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch model
ollama pull internlm/interns1-mini
# run model
ollama run internlm/interns1-mini
# then use openai client to call on http://localhost:11434/v1
```
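The last comment above points at ollama's OpenAI-compatible endpoint; the following minimal sketch calls it with the openai Python client (the prompt is illustrative, and depending on your ollama version the model may need to be referenced with a :latest tag):

```python
from openai import OpenAI

# Sketch only: query the local ollama server through its OpenAI-compatible API.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:11434/v1")
response = client.chat.completions.create(
    model="internlm/interns1-mini",  # the model pulled above; ":latest" may be required
    messages=[{"role": "user", "content": "tell me about an interesting physical phenomenon."}],
)
print(response.choices[0].message.content)
```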
Advanced Usage
Tool Calling
Many large language models (LLMs) today support tool calling, a powerful feature that lets them extend their capabilities by invoking external tools or APIs. With it, a model can perform tasks such as fetching real-time information, running code, or calling functions in other applications.

A notable advantage for developers is that a growing number of open-source LLMs are compatible with the OpenAI API standard, so tool calling can be implemented on these models with the same syntax used for the OpenAI library. The code in this tutorial is therefore general-purpose: it works not only with OpenAI models but with any model that follows the same interface standard.

Below, we use a concrete code example (based on an lmdeploy api server) to demonstrate how to use tool calling to obtain the latest weather forecast.
```python
from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date


# JSON schema descriptions of the two tools exposed to the model
tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': ['location']
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'date': {
                    'type': 'string',
                    'description': 'The date to get the temperature for, in the format \'Year-Month-Day\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': ['celsius', 'fahrenheit'],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': ['location', 'date']
        }
    }
}]

messages = [
    {'role': 'user', 'content': 'Today is 2024-11-14, What\'s the temperature in San Francisco now? How about tomorrow?'}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message)
messages.append(response.choices[0].message)

# Execute each tool call requested by the model and append the results to the conversation
for tool_call in response.choices[0].message.tool_calls:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id
    })

# Ask the model again, now with the tool results in context
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message.content)
```
Switching Between Thinking and Non-Thinking Modes
Intern-S1-mini enables thinking mode by default to strengthen its reasoning and produce higher-quality responses. To disable it, set `enable_thinking=False` in `tokenizer.apply_chat_template`.
```python
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # think mode indicator
)
```
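For completeness, here is a minimal sketch of how the rendered text could then be fed to the model; the tokenizer is assumed to come from AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) and is not defined in the earlier snippets:

```python
# Sketch only: tokenize the rendered chat text and generate a reply.
# Assumes `tokenizer` and `model` are loaded as described in the lead-in above.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generate_ids = model.generate(**model_inputs, max_new_tokens=2048)
print(tokenizer.decode(generate_ids[0, model_inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```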
When serving the Intern-S1-mini model with LMDeploy, you can control thinking mode dynamically by setting the `enable_thinking` parameter in the request.
```python
from openai import OpenAI
import json

messages = [
    {
        'role': 'user',
        'content': 'who are you'
    }, {
        'role': 'assistant',
        'content': 'I am an AI'
    }, {
        'role': 'user',
        'content': 'AGI is?'
    }]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,  # disable thinking mode for this request
    }
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
```
For vllm and sglang users, configure it as follows:

```python
extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}
```
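As an illustration, this fragment slots into the same chat.completions.create call shown above; only extra_body changes:

```python
# Sketch: the vllm/sglang variant of the earlier request, differing only in extra_body.
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    max_tokens=2048,
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))
```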