Training and Running Inference on LLMs with Optimum-habana
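optimum-habana is the interface between the Transformers and Diffusers libraries and Intel Gaudi AI accelerators (HPUs). It provides a set of tools for loading, training, and running inference on models for a range of downstream tasks, in both single-HPU and multi-HPU settings. With only minor code changes, users can try thousands of Hugging Face models and their associated tasks on Intel Gaudi accelerators.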
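To make the "minor code changes" concrete, here is a minimal sketch of what moving a standard Transformers fine-tuning script to Gaudi might look like: the stock `Trainer` and `TrainingArguments` are replaced by `GaudiTrainer` and `GaudiTrainingArguments`, and a few HPU-specific arguments are added. The model name `bert-base-uncased`, the Gaudi configuration `Habana/bert-base-uncased`, the output directory, and the helper names `gaudi_training_kwargs` and `build_trainer` are illustrative assumptions, not taken from the original text. The optimum-habana imports are deferred so the file can be loaded on a machine without the Gaudi software stack.

```python
# Sketch only: model name, Gaudi config name, and output directory are
# illustrative assumptions; adjust them to your own setup.

def gaudi_training_kwargs(output_dir="./gaudi_output"):
    """Return the arguments that differ from a stock TrainingArguments call."""
    return {
        "output_dir": output_dir,
        "use_habana": True,        # run on HPU devices
        "use_lazy_mode": True,     # use lazy-mode execution on Gaudi
        "gaudi_config_name": "Habana/bert-base-uncased",  # assumed Hub config
    }


def build_trainer(train_dataset, eval_dataset=None, model_name="bert-base-uncased"):
    """Build a GaudiTrainer. Needs optimum-habana and an HPU when called."""
    # Imported lazily so that importing this module does not require
    # the Gaudi software stack to be installed.
    from transformers import AutoModelForSequenceClassification
    from optimum.habana import GaudiTrainer, GaudiTrainingArguments

    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    args = GaudiTrainingArguments(**gaudi_training_kwargs())

    # GaudiTrainer is used in place of transformers.Trainer.
    return GaudiTrainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```

On a Gaudi machine, calling `build_trainer(train_ds).train()` with a tokenized dataset would then start fine-tuning; the rest of the script can stay as it was for CPU or GPU.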
1. Officially validated models and tasks:
Transformers:
Architecture | Training | Inference | Tasks |
BERT | ✔️ | ✔️ | text classification, question answering, language modeling, text feature extraction |
RoBERTa | ✔️ | ✔️ | question answering, language modeling |
ALBERT | ✔️ | ✔️ | question answering, language modeling |
DistilBERT | ✔️ | ✔️ | question answering, language modeling |
GPT2 | ✔️ | ✔️ | language modeling, text generation |
BLOOM(Z) | | DeepSpeed | text generation |
StarCoder / StarCoder2 | ✔️ | Single card | language modeling, text generation |
GPT-J | DeepSpeed | Single card, DeepSpeed | language modeling, text generation |
GPT-Neo | | Single card | text generation |
GPT-NeoX | DeepSpeed | DeepSpeed | language modeling, text generation |
OPT | | DeepSpeed | text generation |
Llama 2 / CodeLlama / Llama 3 / Llama Guard / Granite | ✔️ | ✔️ | language modeling, text generation, question answering, text classification (Llama Guard) |
StableLM | | Single card | text generation |
Falcon | LoRA | ✔️ | language modeling, text generation |
CodeGen | | Single card | text generation |
MPT | Single card |