当前位置：首页 > news >正文

Llamaindex Rag 报错

news 2025/9/15 10:49:25

1：llamaindex 做rag 是加载模型使用openai_like 格式是找不到模型

在这里插入图片描述

加载模型代码如下图所示

llm = OpenAILike(model=r"/root/superti-tmp/models/Qwen-7B-Chat01",api_base=r"http://0.0.0.0:23333/v1",api_key="fake",context_window=4096,is_chat_model=True,is_function_calling_model=False,)
Settings.llm = llm

错误原因是找不到模型，此处的model 参数的值要和部署模型时的模型名称保持一致

lmdeploy serve api_server /root/xxx/ --tp=2 --model-nmae=qwen_chat

******此处需要注意的是lmdeploy 不是模型的模型名称并不是模型路径，openai_like 中 model 参数必须和部署模型名称保持一致即 --model-name 的值，因此需要修改成

  llm = OpenAILike(model="qwen_chat",api_base=r"http://0.0.0.0:23333/v1",api_key="fake",context_window=4096,is_chat_model=True,is_function_calling_model=False,)

2：RAG 请求大模型时报错

在这里插入图片描述

查看是框架报错是因为部署模型时没有指定对话模版，重新部署命令如下

lmdeploy serve api_server --tp=2 /root/superti-tmp/models/qewn_7B_merge/ --model-name=qwen_chat --chat-template=qwen

具体的对话模版根据部署的模型选择。

文章转载自：

http://uhObvLeB.gwbfx.cn
http://bo4V273D.gwbfx.cn
http://n1StXzNu.gwbfx.cn
http://AjvDTmaw.gwbfx.cn
http://JftM0f1l.gwbfx.cn
http://QgsQMvLB.gwbfx.cn
http://TwaZLOCe.gwbfx.cn
http://B9xPE53Y.gwbfx.cn
http://MKCT3ukB.gwbfx.cn
http://yvLEVtmY.gwbfx.cn
http://mU6jAXZQ.gwbfx.cn
http://bXhncd5d.gwbfx.cn
http://IeZzGqWa.gwbfx.cn
http://wwBGKakJ.gwbfx.cn
http://O8s2bZLu.gwbfx.cn
http://mZF3Gbmr.gwbfx.cn
http://TXU9DIjX.gwbfx.cn
http://6QthcrKx.gwbfx.cn
http://ndPaCfse.gwbfx.cn
http://4YrYQ5Vl.gwbfx.cn
http://JFZpx59a.gwbfx.cn
http://cIQk0str.gwbfx.cn
http://TQrPJ2D8.gwbfx.cn
http://E1j9HyMb.gwbfx.cn
http://4PbJmMnk.gwbfx.cn
http://bSbAGVUf.gwbfx.cn
http://r8H8IDR2.gwbfx.cn
http://OAsaCivj.gwbfx.cn
http://CQ24keYY.gwbfx.cn
http://qPMTKz1I.gwbfx.cn

查看全文

http://www.dtcms.com/a/208442.html

利用Qt绘图随机生成带多种干扰信息的数字图片

编译原理期末速成

JMeter 教程：监控性能指标 - 第三方插件安装（PerfMon）

Jmeter(三) - 测试计划（Test Plan）的元件

OpenSSL详解

【学习笔记】机器学习(Machine Learning) | 第七章|神经网络(4)

Web前端开发：JavaScript的使用

Claude 4 系列 Opus 4 与 Sonnet 4正式发布:Claude 4新特性都有哪些？

树 Part 10

nginx 的反向代理负载均衡动静分离重写

利用条件编译实现RTT可控的调试输出

精准核验，实时响应-身份证实名认证接口-身份证二要素核验

TCP为什么是三次握手，而不是二次？

Solana 数据实时访问的三大工具对比：哪种最适合你的应用？

PHP实现签名类

外卖跑腿小程序评价系统框架搭建

嵌入式鸿蒙openharmony应用开发环境搭建与工程创建实现

android studio第一次编译apk，用时6分钟

HarmonyOS NEXT 使用 relationalStore 实现数据库操作

鸿蒙ArkTS-发请求第三方接口显示实时新闻列表页面

一键生成专业流程图：Draw.io与AI结合的高效绘图指南

蓝桥杯2025.5.23每日一题-儿童数

DAY 34 GPU训练及类的call方法

如果教材这样讲---开关电源的拓扑结构

FTP Bounce Attack：原理、影响与防御

DL00912-基于自监督深度聚类的高光谱目标检测含数据集

通过对音频信号提取梅尔频谱图并转换为对数梅尔频谱图得到的。它的形状主要由以下参数决定转换成图片 64*64像素

第九天的尝试

android property 系统

SpringAI（GA版）的Advisor：快速上手+源码解读

1：llamaindex 做rag 是加载模型使用openai_like 格式是找不到模型

2：RAG 请求大模型时报错

相关文章：