当前位置: 首页 > news >正文

Qwen3-30B-A3B 模型解析

模型介绍

Qwen3-30B-A3B 是阿里巴巴通义千问团队推出的 Qwen3 系列中的一款旗舰级专家混合(Mixture-of-Experts, MoE)大语言模型。它代表了当前开源模型在架构创新与性能表现上的最新高度,旨在为复杂推理、高效对话及智能代理任务提供一个强大而高效的基石。

  1. 双模式无缝切换(核心突破)
    Qwen3-30B-A3B 独特地内嵌了“思考模式”和“非思考模式”的智能切换能力。

    • 思考模式(Think Mode):在此模式下,模型会进行更深思熟虑的链式推理,尤其擅长解决复杂的数学问题、代码生成与调试、以及需要多步逻辑推理的任务,性能超越前代专门化的推理模型。
    • 非思考模式(Non-Think Mode):在此模式下,模型响应迅速流畅,专注于提供高效、自然的通用对话体验,在创意写作、多轮闲聊和指令遵循方面表现卓越。
  2. 卓越的人类偏好对齐
    经过精心设计的后训练(Post-training),模型在人类偏好对齐方面表现突出。它能够生成更自然、富有创造力且贴合用户意图的文本,在角色扮演、创意写作和沉浸式对话中提供出色的用户体验。

  3. 强大的智能代理(Agent)能力
    无论是思考模式还是非思考模式,模型都展现出顶尖的工具调用与任务规划能力。它能精准地理解用户指令,自主选择并调用外部工具(如API、计算器、搜索引擎),完成复杂的、多步骤的自动化任务,在开源智能体领域达到了领先水平。

  4. 全面的多语言支持
    模型在100多种语言和方言上进行了深度优化,不仅具备优秀的理解和生成能力,更在跨语言翻译和多语言指令遵循方面表现强劲,真正服务于全球化的应用场景。

模型性能

在这里插入图片描述

模型加载

from modelscope import AutoModelForCausalLM, AutoTokenizermodel_name = "Qwen/Qwen3-30B-A3B"# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name,torch_dtype="auto",device_map="auto"
)
/home/six/Zhou/test_source/gpt-oss-unsloth/.venv/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.htmlfrom .autonotebook import tqdm as notebook_tqdmDownloading Model from https://www.modelscope.cn to directory: /home/six/.cache/modelscope/hub/models/Qwen/Qwen3-30B-A3B
Downloading Model from https://www.modelscope.cn to directory: /home/six/.cache/modelscope/hub/models/Qwen/Qwen3-30B-A3BLoading checkpoint shards: 100%|██████████| 16/16 [00:51<00:00,  3.24s/it]

模型结构

model
Qwen3MoeForCausalLM((model): Qwen3MoeModel((embed_tokens): Embedding(151936, 2048)(layers): ModuleList((0-47): 48 x Qwen3MoeDecoderLayer((self_attn): Qwen3MoeAttention((q_proj): Linear(in_features=2048, out_features=4096, bias=False)(k_proj): Linear(in_features=2048, out_features=512, bias=False)(v_proj): Linear(in_features=2048, out_features=512, bias=False)(o_proj): Linear(in_features=4096, out_features=2048, bias=False)(q_norm): Qwen3MoeRMSNorm((128,), eps=1e-06)(k_norm): Qwen3MoeRMSNorm((128,), eps=1e-06))(mlp): Qwen3MoeSparseMoeBlock((gate): Linear(in_features=2048, out_features=128, bias=False)(experts): ModuleList((0-127): 128 x Qwen3MoeMLP((gate_proj): Linear(in_features=2048, out_features=768, bias=False)(up_proj): Linear(in_features=2048, out_features=768, bias=False)(down_proj): Linear(in_features=768, out_features=2048, bias=False)(act_fn): SiLU())))(input_layernorm): Qwen3MoeRMSNorm((2048,), eps=1e-06)(post_attention_layernorm): Qwen3MoeRMSNorm((2048,), eps=1e-06)))(norm): Qwen3MoeRMSNorm((2048,), eps=1e-06)(rotary_emb): Qwen3MoeRotaryEmbedding())(lm_head): Linear(in_features=2048, out_features=151936, bias=False)
)

请添加图片描述

模型配置

model.config
Qwen3MoeConfig {"architectures": ["Qwen3MoeForCausalLM"],"attention_bias": false,"attention_dropout": 0.0,"bos_token_id": 151643,"decoder_sparse_step": 1,"eos_token_id": 151645,"head_dim": 128,"hidden_act": "silu","hidden_size": 2048,"initializer_range": 0.02,"intermediate_size": 6144,"max_position_embeddings": 40960,"max_window_layers": 48,"mlp_only_layers": [],"model_type": "qwen3_moe","moe_intermediate_size": 768,"norm_topk_prob": true,"num_attention_heads": 32,"num_experts": 128,"num_experts_per_tok": 8,"num_hidden_layers": 48,"num_key_value_heads": 4,"output_router_logits": false,"rms_norm_eps": 1e-06,"rope_scaling": null,"rope_theta": 1000000.0,"router_aux_loss_coef": 0.001,"sliding_window": null,"tie_word_embeddings": false,"torch_dtype": "bfloat16","transformers_version": "4.55.2","use_cache": true,"use_sliding_window": false,"vocab_size": 151936
}

模型使用

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages,tokenize=False,add_generation_prompt=True,enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)# conduct text completion
generated_ids = model.generate(**model_inputs,max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() # parsing thinking content
try:# rindex finding 151668 (</think>)index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:index = 0thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")print("thinking content:", thinking_content)
print("content:", content)
thinking content: <think>
Okay, the user is asking for a short introduction to large language models. Let me start by defining what they are. I should mention that they're AI systems trained on vast amounts of text data. Then, I need to explain their purpose, like generating human-like text and understanding language.I should highlight key features such as their size, with billions of parameters, and how they use deep learning, maybe mention transformers. It's important to note their applications—like chatbots, content creation, and data analysis. Also, touch on their capabilities like multilingual support and reasoning. But I shouldn't forget to mention the challenges, like computational costs and potential biases. Keep it concise but informative. Let me structure that into a few paragraphs without getting too technical. Make sure it's easy to understand for someone who might not be familiar with AI terms.
</think>
content: A **large language model (LLM)** is an advanced artificial intelligence system designed to understand, generate, and interact with human language. Trained on vast amounts of text data from the internet, books, and other sources, these models learn patterns, grammar, and context to produce coherent responses, answer questions, write essays, code, or even engage in conversations. Powered by deep learning architectures like transformers, LLMs can handle complex tasks such as translation, summarization, and reasoning. Their scale—often involving billions of parameters—enables them to capture nuanced linguistic structures and adapt to diverse topics. While they excel at mimicking human-like text, they lack true understanding and rely on statistical patterns. LLMs are widely used in applications like virtual assistants, content creation, and customer service, but they also raise ethical and technical challenges, such as bias, misinformation, and computational costs.
http://www.dtcms.com/a/362826.html

相关文章:

  • 【LeetCode牛客数据结构】单链表的应用
  • C语言(长期更新)第12讲:指针二详解
  • 【嵌入式电机控制#进阶6】三段启动法
  • 怎么为服务器设置或重置服务器密码?
  • 【Vue2 ✨】Vue2 入门之旅(九):Vue Router 入门
  • JetBrains 2025 全家桶 11合1 Windows直装(含 IDEA PyCharm、WebStorm、DataSpell、DataGrip等
  • [密码学实战]智能密码钥匙SKF库软实现(四十六)
  • LabVIEW应急柴油发电机组诊断装置
  • LabVIEW振动信号积分处理
  • 【设计模式】通俗讲解设计模式的七大原则
  • 【设计模式】从游戏角度开始了解设计模式 --- 创建型模式(一)
  • Python OpenCV图像处理与深度学习:Python OpenCV性能优化与高效图像处理
  • VGG改进(7):基于Spatial Attention的性能优化
  • 从“叠加”到“重叠”:Overlay 与 Overlap 双引擎驱动技术性能优化
  • Spring MVC + JSP 项目的配置流程,适合传统 Java Web 项目开发
  • 【MySQL】初识数据库基础
  • RAG-检索进阶
  • 【一张图看懂Kafka消息队列架构】
  • 【C++】编写通用模板代码的重要技巧:T()
  • 与后端对话:在React中优雅地请求API数据 (Fetch/Axios)
  • 基于STM32的智能语音浴缸设计
  • 工业视觉光源选色指南:白光通用、蓝光显瑕疵、红光能穿透,看完直接用
  • 推荐一个论文阅读工具ivySCI
  • C++内存管理,模板初阶(泛型编程)
  • 项目组文档标准不一致的原因有哪些
  • 设计模式:命令模式(Command Pattern)
  • 测试覆盖率不够高?这些技巧让你的FastAPI测试无懈可击!
  • java-设计模式-3-创建型模式-原型
  • GPT-5 技术应用指南:从底层能力到工业级落地
  • 零基础Linux操作基础小白快速掌握Shell脚本bash的配置文件