RL Agent notes
- OpenManus (github)
- LlamaGym (github)
- GRPO in practice (Zhihu: a successful reproduction of DeepSeek R1 Zero)
- BabyAGI
0, Environment
CUDA 12.x: nvcc -V
Python 3.10+: python -V
gcc 11: gcc --version
1, Install llama-cpp-python
[git | docs]
Install command:
CMAKE_ARGS="-DGGML_CUDA=on -DLLAVA_BUILD=off" pip install llama-cpp-python -i https://pypi.tuna.tsinghua.edu.cn/simple
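To confirm the wheel was actually built with CUDA support, a minimal check in Python (llama_cpp.__version__ is standard; llama_supports_gpu_offload is assumed to be exposed by recent llama-cpp-python releases):

import llama_cpp

print(llama_cpp.__version__)
# assumed API: returns True when the library was compiled with GPU offload (e.g. CUDA)
print(llama_cpp.llama_supports_gpu_offload())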
Problems encountered:
No CMAKE_CUDA_COMPILER could be found.
- Solution: add the nvcc directory to PATH (append the line to ~/.bashrc to make it persistent):
export PATH=$PATH:/usr/local/cuda/bin
FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
A batch of LLaVA-related build errors.
- Solution: the LLaVA multimodal build is not needed for now, so skip it by adding this flag to CMAKE_ARGS (already included in the install command above):
-DLLAVA_BUILD=off
2, HF to GGUF model conversion
Loading a .bin model file directly with llama_cpp.Llama fails with "gguf_init_from_file: invalid magic characters"; the .bin file must first be converted to a GGUF-format model file, here using llama.cpp.
Install llama.cpp:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt  # Python deps for the conversion script (torch, transformers, gguf, ...)
make
Convert (the script takes the HF model directory as a positional argument):
python convert_hf_to_gguf.py Qwen2-7B-Instruct \
    --outfile Qwen2-7B-Instruct.gguf
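A quick sanity check on the output file (a minimal sketch; the filename follows the command above): a valid GGUF file starts with the 4-byte magic b'GGUF', which is exactly what the "invalid magic characters" error above is checking for:

# a valid GGUF file begins with the magic bytes b'GGUF'; a raw HF .bin file does not
with open('Qwen2-7B-Instruct.gguf', 'rb') as f:
    print(f.read(4))  # expect b'GGUF'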
3, Using llama-cpp-python
from llama_cpp import Llama

model_path = 'Qwen2-7B-Instruct.gguf'

# Inference
llm = Llama(model_path=model_path, n_gpu_layers=-1)  # n_gpu_layers=-1 offloads all layers to the GPU
messages = [{"role": "system", "content": "You are an essay-writing teacher."},
            {"role": "user", "content": "Write an essay titled: Spring"}]
output = llm.create_chat_completion(messages=messages)

# Embedding
llm_embed = Llama(model_path=model_path, n_gpu_layers=-1, embedding=True)
e = llm_embed.embed('I traveled to Beijing during the summer vacation and had a great time')
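create_chat_completion returns an OpenAI-style response dict and embed returns a plain list of floats, so reading the results looks like this (continuing from the code above):

# the assistant reply sits in the OpenAI-compatible response structure
print(output["choices"][0]["message"]["content"])
# the embedding is a flat list of floats; its length is the model's hidden size
print(len(e))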