FastDeploy 2.0: KeyError 'qwen2.embed_tokens.weight' when loading the model
1. Symptom
DeepSeek-R1-Distill-Qwen-1.5B was downloaded via: modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
Then the following command was run:
python -m fastdeploy.entrypoints.openai.api_server --model /workspace/models/DeepSeek-R1-Distill-Qwen-1.5B --port 8180 --metrics-port 8181 --engine-worker-queue-port 8182 --max-model-len 8192 --max-num-seqs 1 --reasoning-parser qwen3
It fails with the following error:
[ INFO] - Starting to load model Qwen2ForCausalLM
[2025-08-04 10:59:57,546] [ INFO] - Attention is running in cache kv bfloat16 mode
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 730, in <module>
run_worker_proc()
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 711, in run_worker_proc
worker_proc.load_model()
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 409, in load_model
self.worker.load_model()
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_worker.py", line 160, in load_model
self.model_runner.load_model()
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_model_runner.py", line 725, in load_model
self.model = get_model_from_loader(fd_config=self.fd_config)
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 54, in get_model_from_loader
model = model_loader.load_model(fd_config)
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 121, in load_model
model.set_state_dict(state_dict)
File "/usr/local/lib/python3.10/dist-packages/decorator.py", line 235, in fun
return caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 396, in _decorate_function
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 333, in set_state_dict
self.qwen2.load_state_dict(state_dict)
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 264, in load_state_dict
self.embed_tokens.load_state_dict(state_dict)
File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/layers/embeddings.py", line 111, in load_state_dict
get_tensor(state_dict.pop(self.prefix + ".weight")).astype(
KeyError: 'qwen2.embed_tokens.weight'
Why does it complain about qwen2.embed_tokens.weight? This model uses the Qwen2 architecture, and it serves fine with vLLM and SGLang.
To see what is actually inside model.safetensors, run the following script:
from safetensors import safe_open

def write_tensor_info_to_file(safetensors_file, output_file):
    with open(output_file, 'w') as f_out:
        f_out.write(f"Processing file: {safetensors_file}\n\n")
        with safe_open(safetensors_file, framework="pt") as f:
            tensor_names = f.keys()
            f_out.write(f"Found {len(tensor_names)} tensors:\n")
            for i, key in enumerate(tensor_names, 1):
                tensor = f.get_tensor(key)  # returns a torch.Tensor
                shape = tensor.shape
                dtype = tensor.dtype
                f_out.write(f"{i:2d}. {key}\n")
                f_out.write(f"    Shape: {shape}, dtype: {dtype}, size: {tensor.numel():,}\n")
            f_out.write("\n")

# Call it
safetensors_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors"
output_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model_summary.txt"
write_tensor_info_to_file(safetensors_file_path, output_file_path)
print("Done! See the output file:", output_file_path)
The dump lists 339 tensors, and there is indeed no qwen2.embed_tokens.weight; what the file contains is model.embed_tokens.weight. So what is going on? This tensor is essential.
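The traceback shows why the lookup can never succeed: embeddings.py builds the key as self.prefix + ".weight" and pops it from the state dict, so a checkpoint whose tensors are keyed with the model. prefix has nothing to match. A stripped-down illustration of that lookup (not FastDeploy's actual code, just the shape of the failure):

# Simplified stand-in for the lookup in fastdeploy/model_executor/layers/embeddings.py
state_dict = {"model.embed_tokens.weight": "..."}  # keys as found in the ModelScope checkpoint
prefix = "qwen2.embed_tokens"                      # prefix used by FastDeploy's Qwen2 embedding layer

state_dict.pop(prefix + ".weight")                 # raises KeyError: 'qwen2.embed_tokens.weight'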
model.embed_tokens.weight is one of the most important parameters in an NLP model, especially in Transformer-based models such as Qwen, BERT, and GPT. This weight matrix maps the input tokens to the vectors the model actually operates on, i.e., it performs the word embedding.
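As a toy illustration of that lookup (a sketch; the token ids are made up, while the sizes 151936 and 1536 match the embed_tokens shape in the listings here):

import torch

vocab_size, hidden_size = 151936, 1536        # matches the embed_tokens shape of this 1.5B model
embed_tokens = torch.nn.Embedding(vocab_size, hidden_size)

token_ids = torch.tensor([[1, 42, 7]])        # arbitrary token ids for a 3-token prompt
hidden_states = embed_tokens(token_ids)       # row lookup: each id becomes a 1536-dim vector
print(hidden_states.shape)                    # torch.Size([1, 3, 1536])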
2. Solution
Download the model from Baidu's AI Studio instead (the PaddleNLP copy), and it works:
aistudio download --model PaddleNLP/DeepSeek-R1-Distill-Qwen-1.5B --local_dir d:\DeepSeek-R1-Distill-Qwen-1.5B
Reading model.safetensors from this download with the script above shows that it does contain qwen2.embed_tokens.weight, and the model serves normally. The difference is simply the weight naming, not the architecture: the ModelScope / Hugging Face checkpoint stores its tensors under the model. prefix (the Transformers convention), while FastDeploy's Paddle Qwen2 implementation looks them up under the qwen2. prefix, which is how the PaddleNLP checkpoint on AI Studio is keyed.
Processing file: /work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors

Found 339 tensors:
 1. lm_head.weight
    Shape: torch.Size([1536, 151936]), dtype: torch.uint16, size: 233,373,696
 2. qwen2.embed_tokens.weight
    Shape: torch.Size([151936, 1536]), dtype: torch.uint16, size: 233,373,696
 3. qwen2.layers.0.input_layernorm.weight
    Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
 4. qwen2.layers.0.mlp.down_proj.weight
    Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
 5. qwen2.layers.0.mlp.gate_proj.weight
    Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
 6. qwen2.layers.0.mlp.up_proj.weight
    Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
 7. qwen2.layers.0.post_attention_layernorm.weight
    Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
 8. qwen2.layers.0.self_attn.k_proj.bias
    Shape: torch.Size([256]), dtype: torch.uint16, size: 256
 9. qwen2.layers.0.self_attn.k_proj.weight
    Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
10. qwen2.layers.0.self_attn.o_proj.weight
    Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
11. qwen2.layers.0.self_attn.q_proj.bias
    Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
12. qwen2.layers.0.self_attn.q_proj.weight
    Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
13. qwen2.layers.0.self_attn.v_proj.bias
    Shape: torch.Size([256]), dtype: torch.uint16, size: 256
14. qwen2.layers.0.self_attn.v_proj.weight
    Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
15.-338. qwen2.layers.1 through qwen2.layers.27 (keys listed in lexicographic order): each of the remaining 27 layers repeats the same 12 tensors as layer 0 above, with identical shapes, dtypes, and sizes.
339. qwen2.norm.weight
    Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
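For future downloads, a quick sanity check along these lines (a sketch that reuses the path above) can confirm, before launching api_server, that a checkpoint uses the qwen2.* naming FastDeploy's Qwen2 loader expects:

from safetensors import safe_open

ckpt = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors"  # path of the AI Studio download

with safe_open(ckpt, framework="pt") as f:
    keys = set(f.keys())

if "qwen2.embed_tokens.weight" in keys:
    print("OK: checkpoint uses the qwen2.* naming that FastDeploy expects")
else:
    print("Prefix mismatch, embedding key is:", [k for k in keys if "embed_tokens" in k])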