
FastDeploy 2.0: KeyError 'qwen2.embed_tokens.weight'

news 2025/11/7 21:42:42

1. Symptom

The model is DeepSeek-R1-Distill-Qwen-1.5B, downloaded with: modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Start the server with the following command:

python -m fastdeploy.entrypoints.openai.api_server --model /workspace/models/DeepSeek-R1-Distill-Qwen-1.5B --port 8180 --metrics-port 8181 --engine-worker-queue-port 8182 --max-model-len 8192 --max-num-seqs 1 --reasoning-parser qwen3

It fails with the following error:

[ INFO] - Starting to load model Qwen2ForCausalLM
[2025-08-04 10:59:57,546] [ INFO] - Attention is running in cache kv bfloat16 mode
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 730, in <module>
    run_worker_proc()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 711, in run_worker_proc
    worker_proc.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 409, in load_model
    self.worker.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_worker.py", line 160, in load_model
    self.model_runner.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_model_runner.py", line 725, in load_model
    self.model = get_model_from_loader(fd_config=self.fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 54, in get_model_from_loader
    model = model_loader.load_model(fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 121, in load_model
    model.set_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/decorator.py", line 235, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 396, in _decorate_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 333, in set_state_dict
    self.qwen2.load_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 264, in load_state_dict
    self.embed_tokens.load_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/layers/embeddings.py", line 111, in load_state_dict
    get_tensor(state_dict.pop(self.prefix + ".weight")).astype(
KeyError: 'qwen2.embed_tokens.weight'

Why does loading fail on qwen2.embed_tokens.weight? The model uses the Qwen2 architecture, and serving the same files with vLLM or SGLang works fine.
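The last frame of the traceback pops self.prefix + ".weight" from the loaded state dict, so a KeyError at that point means the checkpoint has no tensor stored under that exact name. Here is a toy sketch of that lookup, assuming a plain dict stands in for the state dict; the helper name and the placeholder entry are illustrative and are not FastDeploy code.

```python
def pop_weight(state_dict, prefix):
    """Pop '<prefix>.weight', mirroring the lookup in the failing frame."""
    return state_dict.pop(prefix + ".weight")


# Placeholder state dict whose only key uses a different prefix than expected.
state_dict = {"other_prefix.embed_tokens.weight": "placeholder tensor"}

try:
    pop_weight(state_dict, "qwen2.embed_tokens")
    missing_key = None
except KeyError as err:
    # KeyError carries the missing key as its first argument.
    missing_key = err.args[0]

print("missing key:", missing_key)
```

The error message therefore names the key the loader asked for, not a key that exists in the file.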

Running the following code dumps the tensor information stored in model.safetensors.

from safetensors import safe_open


def write_tensor_info_to_file(safetensors_file, output_file):
    """Write the name, shape, dtype and element count of every tensor to a text file."""
    # The output labels stay in Chinese so the result matches the listing in this post:
    # 形状 = shape, 数据类型 = dtype, 大小 = number of elements.
    with open(output_file, "w", encoding="utf-8") as f_out:
        f_out.write(f"正在处理文件: {safetensors_file}\n\n")
        with safe_open(safetensors_file, framework="pt") as f:
            tensor_names = list(f.keys())
            f_out.write(f"找到 {len(tensor_names)} 个张量:\n")
            for i, key in enumerate(tensor_names, 1):
                tensor = f.get_tensor(key)  # returns a torch.Tensor
                shape = tensor.shape
                dtype = tensor.dtype
                f_out.write(f"{i:2d}. {key}\n")
                f_out.write(f"    形状: {shape}, 数据类型: {dtype}, 大小: {tensor.numel():,}\n")
                f_out.write("\n")


# Usage
safetensors_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors"
output_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model_summary.txt"
write_tensor_info_to_file(safetensors_file_path, output_file_path)
print("Done. See the output file:", output_file_path)
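If only names, dtypes and shapes are needed, the tensors do not have to be loaded at all. A safetensors file starts with an 8-byte little-endian length, followed by a JSON header of that length describing every tensor. The sketch below reads just that header with the standard library; the function names and the tiny demo file are made up for illustration, and this is an alternative to the script above rather than what the original investigation used.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without reading any tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


def write_demo_file(path):
    """Write a tiny hand-built safetensors file, for demonstration only."""
    header = {
        "demo.embed_tokens.weight": {
            "dtype": "F32",
            "shape": [2, 2],
            "data_offsets": [0, 16],
        }
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(b"\x00" * 16)  # four float32 zeros


demo_path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_demo_file(demo_path)
summary = read_safetensors_header(demo_path)
for name, (dtype, shape) in sorted(summary.items()):
    print(name, dtype, shape)
```

Pointing read_safetensors_header at a real model.safetensors should give the same names as the script above, but without needing PyTorch or enough memory to hold the weights.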

The file contains 339 tensors. There is indeed no qwen2.embed_tokens.weight among them, but there is a model.embed_tokens.weight. So what is going on? This tensor is an important one.
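A quick way to see the naming scheme of a checkpoint is to count tensor names by their top-level prefix. The sketch below does this for a short, illustrative list of names; only model.embed_tokens.weight is confirmed by the dump above, and the other two names are assumed to follow the usual Hugging Face Qwen2 layout.

```python
from collections import Counter


def count_top_level_prefixes(tensor_names):
    """Count tensor names by the part before the first dot."""
    return Counter(name.split(".", 1)[0] for name in tensor_names)


# Abbreviated, illustrative list; the real file holds 339 names.
modelscope_names = [
    "lm_head.weight",                         # assumed name
    "model.embed_tokens.weight",              # confirmed by the dump
    "model.layers.0.input_layernorm.weight",  # assumed name
]
expected_by_loader = "qwen2.embed_tokens.weight"

prefixes = count_top_level_prefixes(modelscope_names)
print(dict(prefixes))
print("loader key present:", expected_by_loader in modelscope_names)
```

A checkpoint whose names start with model. cannot satisfy a lookup for a name that starts with qwen2., which matches the KeyError above.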

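model.embed_tokens.weight is a key parameter in natural language processing (NLP) models, especially Transformer-based ones such as Qwen, BERT and GPT. This weight matrix maps each input token to a vector the model can process, that is, it performs the token embedding.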
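In other words, the embedding matrix is a lookup table with one row per token id. A minimal sketch with a toy table (vocabulary of 4 tokens, hidden size 3); the real matrix in this model has shape [151936, 1536], and the function name here is illustrative.

```python
def embed(token_ids, embedding_table):
    """Return one row of the embedding table per token id."""
    return [embedding_table[token_id] for token_id in token_ids]


# Toy table: 4 tokens, 3 values per token.
embedding_table = [
    [0.0, 0.1, 0.2],
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
    [3.0, 3.1, 3.2],
]

vectors = embed([2, 0, 2], embedding_table)
print(vectors)
```

Without this table the model has no way to turn token ids into inputs for the first layer, which is why the loader cannot simply skip it.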

2. Solution

Downloading the model from Baidu's AI Studio instead solves the problem.

aistudio download --model PaddleNLP/DeepSeek-R1-Distill-Qwen-1.5B --local_dir d:\DeepSeek-R1-Distill-Qwen-1.5B

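Reading this copy's model.safetensors with the code above shows that it does contain qwen2.embed_tokens.weight, and the model can now be served normally. In the listing, 形状 is the shape, 数据类型 is the dtype, and 大小 is the number of elements.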
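The two checkpoints seem to differ in more than the prefix. Assuming the ModelScope file follows the usual PyTorch convention of storing linear weights as (out_features, in_features), the AI Studio listing that follows stores them the other way round, for example down_proj as [8960, 1536]. The sketch below illustrates both differences; the rename helper is hypothetical and is not a conversion tool, and the PyTorch-style shape is an assumption rather than something read from the file.

```python
def to_paddle_style_name(hf_name):
    """Hypothetical rename: swap a leading 'model.' for 'qwen2.'."""
    if hf_name.startswith("model."):
        return "qwen2." + hf_name[len("model."):]
    return hf_name


def transposed(shape):
    """Shape of a 2-D weight after transposition."""
    rows, cols = shape
    return [cols, rows]


# Shape as listed for the AI Studio checkpoint.
aistudio_down_proj = [8960, 1536]
# Assumed PyTorch-style shape (out_features, in_features) for the same layer.
assumed_hf_down_proj = [1536, 8960]

renamed = to_paddle_style_name("model.embed_tokens.weight")
print(renamed)
print("shapes match after transpose:",
      transposed(assumed_hf_down_proj) == aistudio_down_proj)
```

If that assumption holds, renaming keys alone would likely not make the ModelScope file loadable here, which is one more reason to download the AI Studio copy.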

正在处理文件: /work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors
找到 339 个张量:
1. lm_head.weight
形状: torch.Size([1536, 151936]), 数据类型: torch.uint16, 大小: 233,373,696
2. qwen2.embed_tokens.weight
形状: torch.Size([151936, 1536]), 数据类型: torch.uint16, 大小: 233,373,696
3. qwen2.layers.0.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
4. qwen2.layers.0.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
5. qwen2.layers.0.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
6. qwen2.layers.0.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
7. qwen2.layers.0.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
8. qwen2.layers.0.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
9. qwen2.layers.0.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
10. qwen2.layers.0.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
11. qwen2.layers.0.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
12. qwen2.layers.0.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
13. qwen2.layers.0.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
14. qwen2.layers.0.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
15. qwen2.layers.1.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
16. qwen2.layers.1.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
17. qwen2.layers.1.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
18. qwen2.layers.1.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
19. qwen2.layers.1.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
20. qwen2.layers.1.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
21. qwen2.layers.1.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
22. qwen2.layers.1.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
23. qwen2.layers.1.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
24. qwen2.layers.1.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
25. qwen2.layers.1.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
26. qwen2.layers.1.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
27. qwen2.layers.10.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
28. qwen2.layers.10.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
29. qwen2.layers.10.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
30. qwen2.layers.10.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
31. qwen2.layers.10.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
32. qwen2.layers.10.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
33. qwen2.layers.10.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
34. qwen2.layers.10.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
35. qwen2.layers.10.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
36. qwen2.layers.10.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
37. qwen2.layers.10.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
38. qwen2.layers.10.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
39. qwen2.layers.11.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
40. qwen2.layers.11.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
41. qwen2.layers.11.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
42. qwen2.layers.11.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
43. qwen2.layers.11.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
44. qwen2.layers.11.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
45. qwen2.layers.11.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
46. qwen2.layers.11.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
47. qwen2.layers.11.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
48. qwen2.layers.11.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
49. qwen2.layers.11.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
50. qwen2.layers.11.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
51. qwen2.layers.12.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
52. qwen2.layers.12.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
53. qwen2.layers.12.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
54. qwen2.layers.12.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
55. qwen2.layers.12.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
56. qwen2.layers.12.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
57. qwen2.layers.12.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
58. qwen2.layers.12.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
59. qwen2.layers.12.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
60. qwen2.layers.12.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
61. qwen2.layers.12.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
62. qwen2.layers.12.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
63. qwen2.layers.13.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
64. qwen2.layers.13.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
65. qwen2.layers.13.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
66. qwen2.layers.13.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
67. qwen2.layers.13.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
68. qwen2.layers.13.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
69. qwen2.layers.13.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
70. qwen2.layers.13.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
71. qwen2.layers.13.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
72. qwen2.layers.13.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
73. qwen2.layers.13.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
74. qwen2.layers.13.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
75. qwen2.layers.14.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
76. qwen2.layers.14.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
77. qwen2.layers.14.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
78. qwen2.layers.14.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
79. qwen2.layers.14.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
80. qwen2.layers.14.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
81. qwen2.layers.14.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
82. qwen2.layers.14.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
83. qwen2.layers.14.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
84. qwen2.layers.14.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
85. qwen2.layers.14.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
86. qwen2.layers.14.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
87. qwen2.layers.15.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
88. qwen2.layers.15.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
89. qwen2.layers.15.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
90. qwen2.layers.15.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
91. qwen2.layers.15.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
92. qwen2.layers.15.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
93. qwen2.layers.15.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
94. qwen2.layers.15.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
95. qwen2.layers.15.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
96. qwen2.layers.15.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
97. qwen2.layers.15.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
98. qwen2.layers.15.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
99. qwen2.layers.16.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
100. qwen2.layers.16.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
101. qwen2.layers.16.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
102. qwen2.layers.16.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
103. qwen2.layers.16.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
104. qwen2.layers.16.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
105. qwen2.layers.16.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
106. qwen2.layers.16.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
107. qwen2.layers.16.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
108. qwen2.layers.16.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
109. qwen2.layers.16.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
110. qwen2.layers.16.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
111. qwen2.layers.17.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
112. qwen2.layers.17.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
113. qwen2.layers.17.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
114. qwen2.layers.17.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
115. qwen2.layers.17.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
116. qwen2.layers.17.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
117. qwen2.layers.17.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
118. qwen2.layers.17.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
119. qwen2.layers.17.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
120. qwen2.layers.17.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
121. qwen2.layers.17.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
122. qwen2.layers.17.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
123. qwen2.layers.18.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
124. qwen2.layers.18.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
125. qwen2.layers.18.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
126. qwen2.layers.18.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
127. qwen2.layers.18.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
128. qwen2.layers.18.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
129. qwen2.layers.18.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
130. qwen2.layers.18.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
131. qwen2.layers.18.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
132. qwen2.layers.18.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
133. qwen2.layers.18.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
134. qwen2.layers.18.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
135. qwen2.layers.19.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
136. qwen2.layers.19.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
137. qwen2.layers.19.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
138. qwen2.layers.19.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
139. qwen2.layers.19.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
140. qwen2.layers.19.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
141. qwen2.layers.19.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
142. qwen2.layers.19.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
143. qwen2.layers.19.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
144. qwen2.layers.19.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
145. qwen2.layers.19.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
146. qwen2.layers.19.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
147. qwen2.layers.2.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
148. qwen2.layers.2.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
149. qwen2.layers.2.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
150. qwen2.layers.2.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
151. qwen2.layers.2.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
152. qwen2.layers.2.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
153. qwen2.layers.2.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
154. qwen2.layers.2.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
155. qwen2.layers.2.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
156. qwen2.layers.2.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
157. qwen2.layers.2.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
158. qwen2.layers.2.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
159. qwen2.layers.20.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
160. qwen2.layers.20.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
161. qwen2.layers.20.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
162. qwen2.layers.20.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
163. qwen2.layers.20.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
164. qwen2.layers.20.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
165. qwen2.layers.20.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
166. qwen2.layers.20.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
167. qwen2.layers.20.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
168. qwen2.layers.20.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
169. qwen2.layers.20.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
170. qwen2.layers.20.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
171. qwen2.layers.21.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
172. qwen2.layers.21.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
173. qwen2.layers.21.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
174. qwen2.layers.21.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
175. qwen2.layers.21.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
176. qwen2.layers.21.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
177. qwen2.layers.21.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
178. qwen2.layers.21.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
179. qwen2.layers.21.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
180. qwen2.layers.21.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
181. qwen2.layers.21.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
182. qwen2.layers.21.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
183. qwen2.layers.22.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
184. qwen2.layers.22.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
185. qwen2.layers.22.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
186. qwen2.layers.22.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
187. qwen2.layers.22.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
188. qwen2.layers.22.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
189. qwen2.layers.22.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
190. qwen2.layers.22.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
191. qwen2.layers.22.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
192. qwen2.layers.22.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
193. qwen2.layers.22.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
194. qwen2.layers.22.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
195. qwen2.layers.23.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
196. qwen2.layers.23.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
197. qwen2.layers.23.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
198. qwen2.layers.23.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
199. qwen2.layers.23.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
200. qwen2.layers.23.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
201. qwen2.layers.23.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
202. qwen2.layers.23.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
203. qwen2.layers.23.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
204. qwen2.layers.23.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
205. qwen2.layers.23.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
206. qwen2.layers.23.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
207. qwen2.layers.24.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
208. qwen2.layers.24.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
209. qwen2.layers.24.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
210. qwen2.layers.24.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
211. qwen2.layers.24.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
212. qwen2.layers.24.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
213. qwen2.layers.24.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
214. qwen2.layers.24.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
215. qwen2.layers.24.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
216. qwen2.layers.24.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
217. qwen2.layers.24.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
218. qwen2.layers.24.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
219. qwen2.layers.25.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
220. qwen2.layers.25.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
221. qwen2.layers.25.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
222. qwen2.layers.25.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
223. qwen2.layers.25.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
224. qwen2.layers.25.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
225. qwen2.layers.25.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
226. qwen2.layers.25.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
227. qwen2.layers.25.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
228. qwen2.layers.25.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
229. qwen2.layers.25.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
230. qwen2.layers.25.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
231. qwen2.layers.26.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
232. qwen2.layers.26.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
233. qwen2.layers.26.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
234. qwen2.layers.26.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
235. qwen2.layers.26.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
236. qwen2.layers.26.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
237. qwen2.layers.26.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
238. qwen2.layers.26.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
239. qwen2.layers.26.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
240. qwen2.layers.26.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
241. qwen2.layers.26.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
242. qwen2.layers.26.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
243. qwen2.layers.27.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
244. qwen2.layers.27.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
245. qwen2.layers.27.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
246. qwen2.layers.27.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
247. qwen2.layers.27.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
248. qwen2.layers.27.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
249. qwen2.layers.27.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
250. qwen2.layers.27.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
251. qwen2.layers.27.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
252. qwen2.layers.27.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
253. qwen2.layers.27.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
254. qwen2.layers.27.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
255. qwen2.layers.3.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
256. qwen2.layers.3.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560
257. qwen2.layers.3.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
258. qwen2.layers.3.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560
259. qwen2.layers.3.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
260. qwen2.layers.3.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
261. qwen2.layers.3.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
262. qwen2.layers.3.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
263. qwen2.layers.3.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536
264. qwen2.layers.3.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296
265. qwen2.layers.3.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256
266. qwen2.layers.3.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216
267. qwen2.layers.4.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
268. qwen2.layers.4.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
269. qwen2.layers.4.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
270. qwen2.layers.4.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
271. qwen2.layers.4.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
272. qwen2.layers.4.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
273. qwen2.layers.4.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
274. qwen2.layers.4.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
275. qwen2.layers.4.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
276. qwen2.layers.4.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
277. qwen2.layers.4.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
278. qwen2.layers.4.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
279. qwen2.layers.5.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
280. qwen2.layers.5.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
281. qwen2.layers.5.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
282. qwen2.layers.5.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
283. qwen2.layers.5.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
284. qwen2.layers.5.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
285. qwen2.layers.5.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
286. qwen2.layers.5.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
287. qwen2.layers.5.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
288. qwen2.layers.5.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
289. qwen2.layers.5.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
290. qwen2.layers.5.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
291. qwen2.layers.6.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
292. qwen2.layers.6.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
293. qwen2.layers.6.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
294. qwen2.layers.6.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
295. qwen2.layers.6.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
296. qwen2.layers.6.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
297. qwen2.layers.6.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
298. qwen2.layers.6.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
299. qwen2.layers.6.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
300. qwen2.layers.6.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
301. qwen2.layers.6.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
302. qwen2.layers.6.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
303. qwen2.layers.7.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
304. qwen2.layers.7.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
305. qwen2.layers.7.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
306. qwen2.layers.7.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
307. qwen2.layers.7.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
308. qwen2.layers.7.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
309. qwen2.layers.7.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
310. qwen2.layers.7.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
311. qwen2.layers.7.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
312. qwen2.layers.7.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
313. qwen2.layers.7.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
314. qwen2.layers.7.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
315. qwen2.layers.8.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
316. qwen2.layers.8.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
317. qwen2.layers.8.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
318. qwen2.layers.8.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
319. qwen2.layers.8.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
320. qwen2.layers.8.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
321. qwen2.layers.8.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
322. qwen2.layers.8.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
323. qwen2.layers.8.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
324. qwen2.layers.8.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
325. qwen2.layers.8.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
326. qwen2.layers.8.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
327. qwen2.layers.9.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
328. qwen2.layers.9.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560
329. qwen2.layers.9.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
330. qwen2.layers.9.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560
331. qwen2.layers.9.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
332. qwen2.layers.9.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
333. qwen2.layers.9.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
334. qwen2.layers.9.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
335. qwen2.layers.9.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
336. qwen2.layers.9.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296
337. qwen2.layers.9.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256
338. qwen2.layers.9.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216
339. qwen2.norm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
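
The last value on each line of the listing above is the element count, i.e. the product of the shape dimensions, and a 16-bit dtype such as uint16 takes 2 bytes per element. A minimal sketch for sanity-checking a few entries copied from the listing; the helper name `numel` and the hard-coded `entries` list are illustrative and not part of the original script:

```python
from math import prod

# Entries copied from the listing above: (name, shape, reported element count).
entries = [
    ("qwen2.layers.9.mlp.down_proj.weight", (8960, 1536), 13_762_560),
    ("qwen2.layers.9.self_attn.k_proj.weight", (1536, 256), 393_216),
    ("qwen2.layers.9.self_attn.q_proj.weight", (1536, 1536), 2_359_296),
    ("qwen2.norm.weight", (1536,), 1_536),
]

BYTES_PER_ELEMENT = 2  # uint16 is a 16-bit type


def numel(shape):
    """Number of elements in a tensor of the given shape (hypothetical helper)."""
    return prod(shape)


for name, shape, reported in entries:
    count = numel(shape)
    # The reported count should equal the product of the dimensions.
    assert count == reported, f"{name}: {count} != {reported}"
    print(f"{name}: {count:,} elements, {count * BYTES_PER_ELEMENT:,} bytes")
```

Any entry whose reported count differs from the product of its shape would trip the assertion, which is a quick way to spot a truncated or mis-copied line in a dump like this.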