
FastDeploy 2.0: KeyError 'qwen2.embed_tokens.weight'

1. Symptom

The model is DeepSeek-R1-Distill-Qwen-1.5B, downloaded with modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.

Start the server with the following command:

python -m fastdeploy.entrypoints.openai.api_server \
    --model /workspace/models/DeepSeek-R1-Distill-Qwen-1.5B \
    --port 8180 \
    --metrics-port 8181 \
    --engine-worker-queue-port 8182 \
    --max-model-len 8192 \
    --max-num-seqs 1 \
    --reasoning-parser qwen3

It fails with the following error:

[    INFO] - Starting to load model Qwen2ForCausalLM
[2025-08-04 10:59:57,546] [    INFO] - Attention is running in cache kv bfloat16 mode
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 730, in <module>
    run_worker_proc()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 711, in run_worker_proc
    worker_proc.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/engine/../worker/worker_process.py", line 409, in load_model
    self.worker.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_worker.py", line 160, in load_model
    self.model_runner.load_model()
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/worker/gpu_model_runner.py", line 725, in load_model
    self.model = get_model_from_loader(fd_config=self.fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 54, in get_model_from_loader
    model = model_loader.load_model(fd_config)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/model_loader.py", line 121, in load_model
    model.set_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/decorator.py", line 235, in fun
    return caller(func, *(extras + args), **kw)
  File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/base.py", line 396, in _decorate_function
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 333, in set_state_dict
    self.qwen2.load_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/models/qwen2.py", line 264, in load_state_dict
    self.embed_tokens.load_state_dict(state_dict)
  File "/usr/local/lib/python3.10/dist-packages/fastdeploy/model_executor/layers/embeddings.py", line 111, in load_state_dict
    get_tensor(state_dict.pop(self.prefix + ".weight")).astype(
KeyError: 'qwen2.embed_tokens.weight'

Why does loading fail with KeyError: 'qwen2.embed_tokens.weight'? The model uses the Qwen2 architecture, and serving the same files with vLLM or SGLang works fine.
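The last traceback frame calls state_dict.pop(self.prefix + ".weight"), so the error means that the exact key name is absent from the loaded state dict. The mechanism can be reproduced with a plain dictionary. This is only a minimal sketch: load_embedding is an illustrative stand-in rather than FastDeploy code, and the "model." prefix in the sample checkpoint is an assumption made for the demonstration.

```python
def load_embedding(state_dict, prefix):
    """Illustrative stand-in for the failing frame: pop '<prefix>.weight'."""
    return state_dict.pop(prefix + ".weight")


# Hypothetical checkpoint contents: the embedding is stored under a "model." prefix.
checkpoint = {"model.embed_tokens.weight": [[0.1, 0.2], [0.3, 0.4]]}

try:
    load_embedding(checkpoint, "qwen2.embed_tokens")
except KeyError as err:
    # str() of a KeyError is the repr of the missing key, quotes included.
    message = f"KeyError: {err}"
    print(message)
```

The tensor data itself is untouched in this sketch; only the name used for the lookup does not match.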

Run the following code to dump the tensor information stored in model.safetensors.

from safetensors import safe_open


def write_tensor_info_to_file(safetensors_file, output_file):
    """Write the name, shape, dtype and element count of every tensor to a text file."""
    # The report labels stay in Chinese so the output matches the listing in this post:
    # 正在处理文件 = processing file, 找到 N 个张量 = found N tensors,
    # 形状 = shape, 数据类型 = dtype, 大小 = number of elements.
    with open(output_file, "w", encoding="utf-8") as f_out:
        f_out.write(f"正在处理文件: {safetensors_file}\n\n")
        with safe_open(safetensors_file, framework="pt") as f:
            tensor_names = list(f.keys())
            f_out.write(f"找到 {len(tensor_names)} 个张量:\n")
            for i, key in enumerate(tensor_names, 1):
                tensor = f.get_tensor(key)  # returns a torch.Tensor
                f_out.write(f"{i:2d}. {key}\n")
                f_out.write(
                    f"    形状: {tensor.shape}, 数据类型: {tensor.dtype}, 大小: {tensor.numel():,}\n"
                )
                f_out.write("\n")


# Run the dump
safetensors_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors"
output_file_path = "/work/DeepSeek-R1-Distill-Qwen-1.5B/model_summary.txt"
write_tensor_info_to_file(safetensors_file_path, output_file_path)
print("Done. See the output file:", output_file_path)

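The dump lists 339 tensors, and qwen2.embed_tokens.weight is indeed not among them; in the Hugging Face naming convention the embedding is stored as model.embed_tokens.weight instead. So why does FastDeploy ask for the qwen2. name? This tensor is essential to the model.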
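Scanning several hundred names by eye is error-prone, so it can help to summarize them by top-level prefix and to search by suffix. The following is a minimal sketch: both helper names and the sample list are hypothetical, and with a real file the names would come from list(f.keys()) inside a safe_open block, which does not require loading any tensor data.

```python
from collections import Counter


def count_top_level_prefixes(tensor_names):
    """Count tensor names by the part before the first dot."""
    return Counter(name.split(".", 1)[0] for name in tensor_names)


def find_by_suffix(tensor_names, suffix):
    """Return every tensor name that ends with the given suffix."""
    return [name for name in tensor_names if name.endswith(suffix)]


# Hypothetical sample in a Hugging Face style layout, not read from a real file.
sample_names = [
    "lm_head.weight",
    "model.embed_tokens.weight",
    "model.layers.0.input_layernorm.weight",
    "model.norm.weight",
]

prefix_counts = count_top_level_prefixes(sample_names)
matches = find_by_suffix(sample_names, "embed_tokens.weight")
print(dict(prefix_counts))
print(matches)
```

If the suffix search finds the embedding under a prefix other than the one in the KeyError, the problem is the naming layout of the checkpoint rather than a missing tensor.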

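model.embed_tokens.weight is a key parameter in Transformer-based natural language processing (NLP) models such as Qwen, BERT, and GPT. This weight matrix converts each input token into the vector that the model actually processes; in other words, it performs the token embedding lookup.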
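An embedding lookup is simply row selection: token id i picks row i of the matrix. The toy sketch below uses made-up values and a 4 x 3 matrix; the real tensor reported later in this post has shape [151936, 1536], one row of 1,536 values per vocabulary entry. The function name embed is illustrative only.

```python
def embed(token_ids, embedding_matrix):
    """Look up one row of the embedding matrix per token id."""
    return [embedding_matrix[token_id] for token_id in token_ids]


# Toy matrix: vocabulary of 4 tokens, hidden size 3 (values are made up).
toy_embedding = [
    [0.0, 0.1, 0.2],
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
    [3.0, 3.1, 3.2],
]

vectors = embed([2, 0, 2], toy_embedding)
print(vectors)
```

Without this matrix the model has no way to turn token ids into inputs for its first layer, which is why the loader cannot continue when the key is missing.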

2. Solution

Download the model from Baidu's AI Studio instead, and it loads correctly.

aistudio download --model PaddleNLP/DeepSeek-R1-Distill-Qwen-1.5B --local_dir d:\DeepSeek-R1-Distill-Qwen-1.5B

Running the code above on the model.safetensors from this download shows that qwen2.embed_tokens.weight is indeed present, and the model can now be served normally.
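To check whether two downloads differ only in their top-level prefix, the two name lists can be compared after stripping that prefix. This is a rough sketch under stated assumptions: the helper names are hypothetical, the prefixes "model." and "qwen2." are assumed to be the only difference worth normalizing, and both sample lists are invented for illustration rather than read from the real files.

```python
def strip_top_level_prefix(name, prefixes=("model.", "qwen2.")):
    """Drop a known top-level prefix so names from both layouts can be compared."""
    for prefix in prefixes:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def compare_names(names_a, names_b):
    """Return (only_in_a, only_in_b) after prefix normalization."""
    a = {strip_top_level_prefix(n) for n in names_a}
    b = {strip_top_level_prefix(n) for n in names_b}
    return sorted(a - b), sorted(b - a)


# Hypothetical name lists; real ones would come from list(f.keys()) on each file.
first_download = ["lm_head.weight", "model.embed_tokens.weight", "model.norm.weight"]
second_download = ["lm_head.weight", "qwen2.embed_tokens.weight", "qwen2.norm.weight"]

only_first, only_second = compare_names(first_download, second_download)
print(only_first, only_second)
```

Two empty lists would mean the names line up once the prefix is ignored. That alone would not prove the files are interchangeable, since shapes and dtypes could still differ between the two checkpoints.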

正在处理文件: /work/DeepSeek-R1-Distill-Qwen-1.5B/model.safetensors

找到 339 个张量:
1. lm_head.weight
形状: torch.Size([1536, 151936]), 数据类型: torch.uint16, 大小: 233,373,696

 2. qwen2.embed_tokens.weight
形状: torch.Size([151936, 1536]), 数据类型: torch.uint16, 大小: 233,373,696

 3. qwen2.layers.0.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

 4. qwen2.layers.0.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

 5. qwen2.layers.0.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

 6. qwen2.layers.0.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

 7. qwen2.layers.0.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

 8. qwen2.layers.0.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

 9. qwen2.layers.0.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

10. qwen2.layers.0.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

11. qwen2.layers.0.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

12. qwen2.layers.0.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

13. qwen2.layers.0.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

14. qwen2.layers.0.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

15. qwen2.layers.1.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

16. qwen2.layers.1.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

17. qwen2.layers.1.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

18. qwen2.layers.1.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

19. qwen2.layers.1.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

20. qwen2.layers.1.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

21. qwen2.layers.1.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

22. qwen2.layers.1.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

23. qwen2.layers.1.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

24. qwen2.layers.1.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

25. qwen2.layers.1.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

26. qwen2.layers.1.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

27. qwen2.layers.10.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

28. qwen2.layers.10.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

29. qwen2.layers.10.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

30. qwen2.layers.10.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

31. qwen2.layers.10.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

32. qwen2.layers.10.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

33. qwen2.layers.10.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

34. qwen2.layers.10.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

35. qwen2.layers.10.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

36. qwen2.layers.10.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

37. qwen2.layers.10.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

38. qwen2.layers.10.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

39. qwen2.layers.11.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

40. qwen2.layers.11.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

41. qwen2.layers.11.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

42. qwen2.layers.11.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

43. qwen2.layers.11.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

44. qwen2.layers.11.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

45. qwen2.layers.11.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

46. qwen2.layers.11.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

47. qwen2.layers.11.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

48. qwen2.layers.11.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

49. qwen2.layers.11.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

50. qwen2.layers.11.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

51. qwen2.layers.12.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

52. qwen2.layers.12.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

53. qwen2.layers.12.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

54. qwen2.layers.12.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

55. qwen2.layers.12.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

56. qwen2.layers.12.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

57. qwen2.layers.12.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

58. qwen2.layers.12.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

59. qwen2.layers.12.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

60. qwen2.layers.12.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

61. qwen2.layers.12.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

62. qwen2.layers.12.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

63. qwen2.layers.13.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

64. qwen2.layers.13.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

65. qwen2.layers.13.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

66. qwen2.layers.13.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

67. qwen2.layers.13.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

68. qwen2.layers.13.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

69. qwen2.layers.13.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

70. qwen2.layers.13.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

71. qwen2.layers.13.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

72. qwen2.layers.13.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

73. qwen2.layers.13.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

74. qwen2.layers.13.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

75. qwen2.layers.14.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

76. qwen2.layers.14.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

77. qwen2.layers.14.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

78. qwen2.layers.14.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

79. qwen2.layers.14.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

80. qwen2.layers.14.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

81. qwen2.layers.14.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

82. qwen2.layers.14.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

83. qwen2.layers.14.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

84. qwen2.layers.14.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

85. qwen2.layers.14.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

86. qwen2.layers.14.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

87. qwen2.layers.15.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

88. qwen2.layers.15.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

89. qwen2.layers.15.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

90. qwen2.layers.15.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

91. qwen2.layers.15.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

92. qwen2.layers.15.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

93. qwen2.layers.15.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

94. qwen2.layers.15.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

95. qwen2.layers.15.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

96. qwen2.layers.15.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

97. qwen2.layers.15.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

98. qwen2.layers.15.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

99. qwen2.layers.16.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

100. qwen2.layers.16.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

101. qwen2.layers.16.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

102. qwen2.layers.16.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

103. qwen2.layers.16.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

104. qwen2.layers.16.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

105. qwen2.layers.16.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

106. qwen2.layers.16.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

107. qwen2.layers.16.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

108. qwen2.layers.16.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

109. qwen2.layers.16.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

110. qwen2.layers.16.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

111. qwen2.layers.17.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

112. qwen2.layers.17.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

113. qwen2.layers.17.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

114. qwen2.layers.17.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

115. qwen2.layers.17.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

116. qwen2.layers.17.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

117. qwen2.layers.17.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

118. qwen2.layers.17.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

119. qwen2.layers.17.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

120. qwen2.layers.17.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

121. qwen2.layers.17.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

122. qwen2.layers.17.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

123. qwen2.layers.18.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

124. qwen2.layers.18.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

125. qwen2.layers.18.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

126. qwen2.layers.18.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

127. qwen2.layers.18.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

128. qwen2.layers.18.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

129. qwen2.layers.18.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

130. qwen2.layers.18.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

131. qwen2.layers.18.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

132. qwen2.layers.18.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

133. qwen2.layers.18.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

134. qwen2.layers.18.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

135. qwen2.layers.19.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

136. qwen2.layers.19.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

137. qwen2.layers.19.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

138. qwen2.layers.19.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

139. qwen2.layers.19.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

140. qwen2.layers.19.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

141. qwen2.layers.19.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

142. qwen2.layers.19.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

143. qwen2.layers.19.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

144. qwen2.layers.19.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

145. qwen2.layers.19.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

146. qwen2.layers.19.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

147. qwen2.layers.2.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

148. qwen2.layers.2.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

149. qwen2.layers.2.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

150. qwen2.layers.2.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

151. qwen2.layers.2.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

152. qwen2.layers.2.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

153. qwen2.layers.2.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

154. qwen2.layers.2.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

155. qwen2.layers.2.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

156. qwen2.layers.2.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

157. qwen2.layers.2.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

158. qwen2.layers.2.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

159. qwen2.layers.20.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

160. qwen2.layers.20.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

161. qwen2.layers.20.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

162. qwen2.layers.20.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

163. qwen2.layers.20.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

164. qwen2.layers.20.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

165. qwen2.layers.20.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

166. qwen2.layers.20.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

167. qwen2.layers.20.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

168. qwen2.layers.20.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

169. qwen2.layers.20.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

170. qwen2.layers.20.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

171. qwen2.layers.21.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

172. qwen2.layers.21.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

173. qwen2.layers.21.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

174. qwen2.layers.21.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

175. qwen2.layers.21.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

176. qwen2.layers.21.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

177. qwen2.layers.21.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

178. qwen2.layers.21.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

179. qwen2.layers.21.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

180. qwen2.layers.21.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

181. qwen2.layers.21.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

182. qwen2.layers.21.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

183. qwen2.layers.22.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

184. qwen2.layers.22.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

185. qwen2.layers.22.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

186. qwen2.layers.22.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

187. qwen2.layers.22.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

188. qwen2.layers.22.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

189. qwen2.layers.22.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

190. qwen2.layers.22.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

191. qwen2.layers.22.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

192. qwen2.layers.22.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

193. qwen2.layers.22.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

194. qwen2.layers.22.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

195. qwen2.layers.23.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

196. qwen2.layers.23.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

197. qwen2.layers.23.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

198. qwen2.layers.23.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

199. qwen2.layers.23.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

200. qwen2.layers.23.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

201. qwen2.layers.23.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

202. qwen2.layers.23.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

203. qwen2.layers.23.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

204. qwen2.layers.23.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

205. qwen2.layers.23.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

206. qwen2.layers.23.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

207. qwen2.layers.24.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

208. qwen2.layers.24.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

209. qwen2.layers.24.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

210. qwen2.layers.24.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

211. qwen2.layers.24.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

212. qwen2.layers.24.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

213. qwen2.layers.24.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

214. qwen2.layers.24.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

215. qwen2.layers.24.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

216. qwen2.layers.24.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

217. qwen2.layers.24.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

218. qwen2.layers.24.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

219. qwen2.layers.25.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

220. qwen2.layers.25.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

221. qwen2.layers.25.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

222. qwen2.layers.25.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

223. qwen2.layers.25.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

224. qwen2.layers.25.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

225. qwen2.layers.25.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

226. qwen2.layers.25.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

227. qwen2.layers.25.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

228. qwen2.layers.25.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

229. qwen2.layers.25.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

230. qwen2.layers.25.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

231. qwen2.layers.26.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

232. qwen2.layers.26.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

233. qwen2.layers.26.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

234. qwen2.layers.26.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

235. qwen2.layers.26.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

236. qwen2.layers.26.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

237. qwen2.layers.26.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

238. qwen2.layers.26.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

239. qwen2.layers.26.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

240. qwen2.layers.26.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

241. qwen2.layers.26.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

242. qwen2.layers.26.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

243. qwen2.layers.27.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

244. qwen2.layers.27.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

245. qwen2.layers.27.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

246. qwen2.layers.27.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

247. qwen2.layers.27.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

248. qwen2.layers.27.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

249. qwen2.layers.27.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

250. qwen2.layers.27.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

251. qwen2.layers.27.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

252. qwen2.layers.27.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

253. qwen2.layers.27.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

254. qwen2.layers.27.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

255. qwen2.layers.3.input_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

256. qwen2.layers.3.mlp.down_proj.weight
形状: torch.Size([8960, 1536]), 数据类型: torch.uint16, 大小: 13,762,560

257. qwen2.layers.3.mlp.gate_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

258. qwen2.layers.3.mlp.up_proj.weight
形状: torch.Size([1536, 8960]), 数据类型: torch.uint16, 大小: 13,762,560

259. qwen2.layers.3.post_attention_layernorm.weight
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

260. qwen2.layers.3.self_attn.k_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

261. qwen2.layers.3.self_attn.k_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

262. qwen2.layers.3.self_attn.o_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

263. qwen2.layers.3.self_attn.q_proj.bias
形状: torch.Size([1536]), 数据类型: torch.uint16, 大小: 1,536

264. qwen2.layers.3.self_attn.q_proj.weight
形状: torch.Size([1536, 1536]), 数据类型: torch.uint16, 大小: 2,359,296

265. qwen2.layers.3.self_attn.v_proj.bias
形状: torch.Size([256]), 数据类型: torch.uint16, 大小: 256

266. qwen2.layers.3.self_attn.v_proj.weight
形状: torch.Size([1536, 256]), 数据类型: torch.uint16, 大小: 393,216

267. qwen2.layers.4.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

268. qwen2.layers.4.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

269. qwen2.layers.4.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

270. qwen2.layers.4.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

271. qwen2.layers.4.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

272. qwen2.layers.4.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

273. qwen2.layers.4.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

274. qwen2.layers.4.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

275. qwen2.layers.4.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

276. qwen2.layers.4.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

277. qwen2.layers.4.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

278. qwen2.layers.4.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

279. qwen2.layers.5.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

280. qwen2.layers.5.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

281. qwen2.layers.5.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

282. qwen2.layers.5.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

283. qwen2.layers.5.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

284. qwen2.layers.5.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

285. qwen2.layers.5.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

286. qwen2.layers.5.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

287. qwen2.layers.5.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

288. qwen2.layers.5.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

289. qwen2.layers.5.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

290. qwen2.layers.5.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

291. qwen2.layers.6.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

292. qwen2.layers.6.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

293. qwen2.layers.6.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

294. qwen2.layers.6.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

295. qwen2.layers.6.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

296. qwen2.layers.6.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

297. qwen2.layers.6.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

298. qwen2.layers.6.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

299. qwen2.layers.6.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

300. qwen2.layers.6.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

301. qwen2.layers.6.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

302. qwen2.layers.6.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

303. qwen2.layers.7.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

304. qwen2.layers.7.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

305. qwen2.layers.7.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

306. qwen2.layers.7.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

307. qwen2.layers.7.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

308. qwen2.layers.7.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

309. qwen2.layers.7.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

310. qwen2.layers.7.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

311. qwen2.layers.7.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

312. qwen2.layers.7.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

313. qwen2.layers.7.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

314. qwen2.layers.7.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

315. qwen2.layers.8.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

316. qwen2.layers.8.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

317. qwen2.layers.8.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

318. qwen2.layers.8.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

319. qwen2.layers.8.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

320. qwen2.layers.8.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

321. qwen2.layers.8.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

322. qwen2.layers.8.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

323. qwen2.layers.8.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

324. qwen2.layers.8.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

325. qwen2.layers.8.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

326. qwen2.layers.8.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

327. qwen2.layers.9.input_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

328. qwen2.layers.9.mlp.down_proj.weight
Shape: torch.Size([8960, 1536]), dtype: torch.uint16, size: 13,762,560

329. qwen2.layers.9.mlp.gate_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

330. qwen2.layers.9.mlp.up_proj.weight
Shape: torch.Size([1536, 8960]), dtype: torch.uint16, size: 13,762,560

331. qwen2.layers.9.post_attention_layernorm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

332. qwen2.layers.9.self_attn.k_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

333. qwen2.layers.9.self_attn.k_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

334. qwen2.layers.9.self_attn.o_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

335. qwen2.layers.9.self_attn.q_proj.bias
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536

336. qwen2.layers.9.self_attn.q_proj.weight
Shape: torch.Size([1536, 1536]), dtype: torch.uint16, size: 2,359,296

337. qwen2.layers.9.self_attn.v_proj.bias
Shape: torch.Size([256]), dtype: torch.uint16, size: 256

338. qwen2.layers.9.self_attn.v_proj.weight
Shape: torch.Size([1536, 256]), dtype: torch.uint16, size: 393,216

339. qwen2.norm.weight
Shape: torch.Size([1536]), dtype: torch.uint16, size: 1,536
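
The last number on each entry in the listing above is the element count, i.e. the product of the shape dimensions. As a quick sanity check, here is a minimal sketch that recomputes it for a few shapes copied from the listing. The byte figure rests on an assumption that is not stated in the listing itself: that `torch.uint16` here is just the raw 16-bit storage of bfloat16 values, so each element takes 2 bytes. The helper name `describe` is made up for this illustration.

```python
from math import prod

# Shapes copied from the listing above (one representative per tensor kind).
shapes = {
    "qwen2.layers.9.mlp.down_proj.weight": (8960, 1536),
    "qwen2.layers.9.self_attn.k_proj.weight": (1536, 256),
    "qwen2.layers.9.self_attn.q_proj.bias": (1536,),
    "qwen2.norm.weight": (1536,),
}

# Assumption: uint16 is the raw 16-bit storage of bfloat16, so 2 bytes per element.
BYTES_PER_ELEMENT = 2


def describe(shape):
    """Return (element count, byte size) for a tensor of the given shape."""
    count = prod(shape)
    return count, count * BYTES_PER_ELEMENT


for name, shape in shapes.items():
    count, nbytes = describe(shape)
    print(f"{name}: {count:,} elements, {nbytes:,} bytes")
```

The element counts printed by this loop should match the last number of the corresponding entries in the listing; a mismatch would point to a truncated or mis-shaped tensor in the checkpoint.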
