
Deploying the local models bge-large-zh-v1.5 and bge-reranker-v2-m3 with Xinference on Ubuntu

bge-large-zh-v1.5

Download the model to a local directory:

modelscope download --model BAAI/bge-large-zh-v1.5 --local_dir ./bge-large-zh-v1.5

Define the custom embedding model in custom-bge-large-zh-v1.5.json (replace model_uri with the directory the model was downloaded to):

{
    "model_name": "custom-bge-large-zh-v1.5",
    "dimensions": 1024,
    "max_tokens": 512,
    "language": ["zh"],
    "model_id": "BAAI/bge-large-zh-v1.5",
    "model_uri": "/path/to/bge-large-zh-v1.5"
}
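Since model_uri must point at the downloaded directory, the spec file can also be generated instead of written by hand. The following is a minimal sketch, not part of the original steps: the helper names build_embedding_spec and write_spec are hypothetical, and the field values are copied from the JSON above.

```python
import json
from pathlib import Path


def build_embedding_spec(model_dir: str) -> dict:
    """Build the custom embedding spec shown above for a local model directory."""
    return {
        "model_name": "custom-bge-large-zh-v1.5",
        "dimensions": 1024,
        "max_tokens": 512,
        "language": ["zh"],
        "model_id": "BAAI/bge-large-zh-v1.5",
        # resolve() turns a relative path such as ./bge-large-zh-v1.5
        # into an absolute one, so the spec works from any directory.
        "model_uri": str(Path(model_dir).resolve()),
    }


def write_spec(spec: dict, out_file: str) -> None:
    """Write the spec as indented JSON."""
    text = json.dumps(spec, indent=4, ensure_ascii=False)
    Path(out_file).write_text(text + "\n", encoding="utf-8")
```

Calling write_spec(build_embedding_spec("./bge-large-zh-v1.5"), "custom-bge-large-zh-v1.5.json") would produce a file equivalent to the one above, with model_uri filled in.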

Register the custom model:

xinference register --model-type embedding --file custom-bge-large-zh-v1.5.json --persist

Launch the custom model:

xinference launch --model-name custom-bge-large-zh-v1.5 --model-type embedding
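Once the model is launched, a quick request against the OpenAI-compatible embeddings route can confirm that it responds and returns 1024-dimensional vectors. This is a sketch under assumptions: the endpoint http://localhost:9999 is the port used later in this post, and build_embedding_request and embed are hypothetical helper names.

```python
import json
import urllib.request
from typing import List

ENDPOINT = "http://localhost:9999"  # assumption: the server port used in this post


def build_embedding_request(endpoint: str, model: str, text: str) -> urllib.request.Request:
    """Prepare a POST to /v1/embeddings for a single input string."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(endpoint: str, model: str, text: str) -> List[float]:
    """Send the request and return the first embedding vector."""
    request = build_embedding_request(endpoint, model, text)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["data"][0]["embedding"]
```

With the server running, len(embed(ENDPOINT, "custom-bge-large-zh-v1.5", "你好")) should match the dimensions value declared in the spec.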

bge-reranker-v2-m3

Download the model to a local directory:

modelscope download --model AI-ModelScope/bge-reranker-v2-m3 --local_dir ./bge-reranker-v2-m3

Define the custom rerank model in custom-bge-reranker-v2-m3.json:

{
    "model_name": "custom-bge-reranker-v2-m3",
    "type": "normal",
    "language": ["en", "zh", "multilingual"],
    "model_id": "BAAI/bge-reranker-v2-m3",
    "model_uri": "/path/to/bge-reranker-v2-m3"
}

Register the custom model:

xinference register --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist

Registration fails with the following error:

Traceback (most recent call last):
  File "//env/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "//env/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "//env/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "//env/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 407, in register_model
    client.register_model(
  File "//env/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1188, in register_model
    raise RuntimeError(
RuntimeError: Failed to register model, detail: Not Found

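Passing --endpoint explicitly succeeds, because this Xinference instance is deployed on port 9999 and the CLI does not target that port unless told to. The same flag applies to the embedding commands above when the server is not on the CLI's default port: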

xinference register --endpoint http://localhost:9999 --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
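To avoid forgetting the flag, the endpoint can be resolved once and passed to every command. The sketch below rests on assumptions about the CLI that are worth checking against the installed version: that it falls back to http://127.0.0.1:9997 when no endpoint is given, and that it honors an XINFERENCE_ENDPOINT environment variable. The helper names resolve_endpoint and register_command are hypothetical.

```python
import os
from typing import List, Mapping, Optional

DEFAULT_ENDPOINT = "http://127.0.0.1:9997"  # assumption: the CLI's default
ENV_VAR = "XINFERENCE_ENDPOINT"  # assumption: environment override read by the CLI


def resolve_endpoint(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the endpoint: explicit value, then environment, then default."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    return env.get(ENV_VAR, DEFAULT_ENDPOINT)


def register_command(model_type: str, spec_file: str, endpoint: str) -> List[str]:
    """Build the register command from this post with an explicit endpoint."""
    return [
        "xinference", "register",
        "--endpoint", endpoint,
        "--model-type", model_type,
        "--file", spec_file,
        "--persist",
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True), so a server on a non-default port such as 9999 is always addressed explicitly.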

Launch the custom model:

xinference launch --model-type rerank --model-name custom-bge-reranker-v2-m3 --endpoint http://localhost:9999
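After launch, the reranker can be exercised through the /v1/rerank route. The following sketch assumes the response carries a results list whose items have index and relevance_score fields; build_rerank_request, rank_documents and rerank are hypothetical helper names.

```python
import json
import urllib.request
from typing import List, Tuple


def build_rerank_request(
    endpoint: str, model: str, query: str, documents: List[str]
) -> urllib.request.Request:
    """Prepare a POST to /v1/rerank for one query and a list of documents."""
    body = json.dumps(
        {"model": model, "query": query, "documents": documents}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/v1/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rank_documents(results: List[dict], documents: List[str]) -> List[Tuple[str, float]]:
    """Pair each document with its score, highest score first."""
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ordered]


def rerank(endpoint: str, model: str, query: str, documents: List[str]) -> List[Tuple[str, float]]:
    """Call the server and return documents ordered by relevance."""
    request = build_rerank_request(endpoint, model, query, documents)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return rank_documents(payload["results"], documents)
```

For example, rerank("http://localhost:9999", "custom-bge-reranker-v2-m3", query, docs) would return the documents sorted by the model's relevance score.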

Verify that both models loaded successfully; the response lists every running model:

curl http://localhost:9999/v1/models
{"object":"list","data":[{"id":"custom-bge-large-zh-v1.5","object":"model","created":0,"owned_by":"xinference","model_type":"embedding","address":"0.0.0.0:39987","accelerators":[],"model_name":"custom-bge-large-zh-v1.5","dimensions":1024,"max_tokens":512,"language":["zh"],"model_revision":null,"replica":1},{"id":"custom-bge-reranker-v2-m3","object":"model","created":0,"owned_by":"xinference","model_type":"rerank","address":"0.0.0.0:44611","accelerators":[],"type":"normal","model_name":"custom-bge-reranker-v2-m3","language":["en","zh","multilingual"],"model_revision":null,"replica":1}]}
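The single-line JSON is hard to scan by eye, so the check can be scripted. This is a minimal sketch that relies only on the id and model_type fields visible in the output above; fetch_models, running_models and missing_models are hypothetical helper names.

```python
import json
import urllib.request
from typing import Dict, List


def fetch_models(endpoint: str) -> dict:
    """GET /v1/models and return the decoded JSON payload."""
    with urllib.request.urlopen(f"{endpoint}/v1/models") as response:
        return json.load(response)


def running_models(payload: dict) -> Dict[str, str]:
    """Map each running model id to its model_type."""
    return {m["id"]: m["model_type"] for m in payload.get("data", [])}


def missing_models(payload: dict, expected: Dict[str, str]) -> List[str]:
    """Return expected model ids that are absent or have the wrong type."""
    running = running_models(payload)
    return [name for name, kind in expected.items() if running.get(name) != kind]
```

With expected set to {"custom-bge-large-zh-v1.5": "embedding", "custom-bge-reranker-v2-m3": "rerank"}, an empty list from missing_models(fetch_models("http://localhost:9999"), expected) means both models are up.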
