
Say Goodbye to Complex Configuration: DIY Your Own Chatbot in 60 Minutes with Milvus, RustFS, and Vibe Coding

news 2025/10/10 10:57:28

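As a full-stack engineer, I am always looking for more efficient ways to build intelligent applications.

This article shows how to combine Milvus, RustFS, and Vibe Coding to build, in a short time, a conversational bot with long-term memory.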

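1. Why Choose This Tech Stack?
1.1 Core Value of Each Component
2. Environment Setup: Up and Running in 10 Minutes
2.1 One-Command Deployment with Docker Compose
2.2 Python Environment Setup
3. Building the Knowledge Base: Giving the Chatbot Long-Term Memory
3.1 Document Loading and Vectorization
3.2 Creating the Milvus Collection
3.3 Knowledge Base Ingestion Pipeline
4. The RAG Engine: The Core of Intelligent Q&A
4.1 Retrieval-Augmented Generation (RAG) Architecture
4.2 Conversation Memory Management
5. Backend API: A Quick FastAPI Implementation
5.1 Building an Efficient Web API
6. Frontend: A Modern Chat Interface with Next.js
6.1 Rapid Frontend Development with Vibe Coding
6.2 Modern Styling
7. Deployment and Optimization
7.1 Performance Optimization Tips
7.2 Production Deployment
8. Summary and Outlook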

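1. Why Choose This Tech Stack?

In AI application development, choosing the right underlying infrastructure is critical. After validating it across several projects, I settled on the following combination: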

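Technology	Strengths	Role in the chatbot	Compared with alternatives (author's estimates)
Milvus	Purpose-built for vectors; millisecond-level retrieval over hundreds of millions of vectors	Stores conversation memory and the knowledge base	About 70% cheaper than Pinecone, 5x more stable than Chroma
RustFS	S3-compatible, high-performance, lightweight, and secure	Stores documents and multimedia assets	About 60% less memory than MinIO, about 90% cheaper than AWS S3
Vibe Coding	Rapid prototyping with AI-assisted development	Speeds up frontend and API development	About 3x development efficiency, about 50% less code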

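1.1 Core Value of Each Component

Milvus: an open-source vector database optimized for AI workloads. It supports:

High-performance similarity search: millisecond-level responses built on index algorithms such as HNSW
Elastic scaling: handles anywhere from tens of thousands to billions of vectors
Multimodal support: stores vectors derived from images, audio, and other data, not only text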
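
To make "similarity search" concrete, here is a minimal pure-Python sketch of the exact, brute-force search that an index such as HNSW approximates. It is not how Milvus works internally; the function names and the toy vectors are hypothetical and chosen only for illustration.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_k(query, vectors, k=3):
    """Return the k (distance, index) pairs closest to query, nearest first."""
    scored = ((l2_distance(query, vec), idx) for idx, vec in enumerate(vectors))
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real embeddings have thousands of dimensions.
    toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
    for distance, idx in brute_force_top_k([1.0, 1.0], toy_vectors, k=2):
        print(idx, round(distance, 3))
```

This scan compares the query with every stored vector, so its cost grows linearly with the collection size. An approximate index trades a small amount of recall for a much smaller number of comparisons.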

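RustFS: an object store that is fully compatible with the S3 protocol. Its advantages include:

Performance: 4K random-read IOPS reaches 1.58M, more than 40% faster than traditional solutions
Cost: self-hosted deployment costs about 90% less than public-cloud object storage
Lightweight and secure: written in Rust and shipped as a single binary under 100 MB

Vibe Coding: an efficient development methodology that emphasizes:

Rapid iteration: built on AI code generation and component-based thinking
User experience first: quickly build intuitive frontend interfaces
Automated operations: infrastructure as code and one-command deployment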

2. Environment Setup: Up and Running in 10 Minutes

2.1 One-Command Deployment with Docker Compose

Create a docker-compose.yml file that brings together all the required services:

version: '3.8'

services:
  # etcd: metadata store used by Milvus
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.18
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://etcd:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3

  # RustFS object storage (replaces MinIO)
  rustfs:
    container_name: milvus-rustfs
    image: rustfs/rustfs:1.0.0-alpha.58
    environment:
      - RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3
      - RUSTFS_ADDRESS=0.0.0.0:9000
      - RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
      - RUSTFS_CONSOLE_ENABLE=true
      - RUSTFS_ACCESS_KEY=rustfsadmin
      - RUSTFS_SECRET_KEY=rustfsadmin
    ports:
      - "9000:9000"  # S3 API port
      - "9001:9001"  # Console port
    volumes:
      - rustfs_data_0:/data/rustfs0
      - rustfs_data_1:/data/rustfs1
      - rustfs_data_2:/data/rustfs2
      - rustfs_data_3:/data/rustfs3
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "sh", "-c", "curl -f http://localhost:9000/health && curl -f http://localhost:9001/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  # Milvus vector database
  milvus-standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.6.0
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: rustfs:9000  # Use RustFS as the storage backend
      # Milvus's object-storage credentials must match RUSTFS_ACCESS_KEY and
      # RUSTFS_SECRET_KEY above. This file leaves Milvus on its defaults, so
      # set them in your Milvus configuration if the connection is rejected.
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "rustfs"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      start_period: 90s
      timeout: 20s
      retries: 3

  # Attu: web-based management UI for Milvus
  attu:
    container_name: milvus-attu
    image: zilliz/attu:v2.6
    environment:
      - MILVUS_URL=milvus-standalone:19530
    ports:
      - "8000:3000"
    restart: unless-stopped

volumes:
  rustfs_data_0:
  rustfs_data_1:
  rustfs_data_2:
  rustfs_data_3:

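Start all services:

docker compose up -d

Verify that the services are running:

docker ps

You should see four containers running, with these ports:

Milvus: 19530 (API), 9091 (health check)
RustFS: 9000 (S3 API), 9001 (console)
Attu: 8000 (web UI)
etcd: 2379 (internal communication)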
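
docker ps reports container state, not whether the published ports accept connections. As a complement, the following sketch probes each host port listed above with a plain TCP connection. etcd is left out because the Compose file does not publish port 2379 to the host. The function names are hypothetical helpers, not part of any of the tools involved.

```python
import socket

# Host ports published by the Compose file in section 2.1.
EXPECTED_PORTS = {
    "Milvus API": 19530,
    "Milvus health check": 9091,
    "RustFS S3 API": 9000,
    "RustFS console": 9001,
    "Attu": 8000,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", ports=EXPECTED_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'open' if ok else 'unreachable'}")
```

An open port only shows that something is listening. Milvus can take a minute or more after startup before it serves requests, so a failed first query does not necessarily mean the deployment is broken.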

2.2 Python Environment Setup

Create a virtual environment and install the required dependencies:

python -m venv chatbot-env
source chatbot-env/bin/activate  # Linux/macOS
# or: chatbot-env\Scripts\activate  # Windows

pip install pymilvus==2.6.0
pip install openai==1.3.0
pip install boto3==1.28.0  # Used to connect to RustFS
pip install fastapi==0.104.0
pip install uvicorn==0.24.0
pip install python-multipart==0.0.6
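
boto3 is installed for talking to RustFS, but the article does not show it in use. The sketch below is one possible way to upload source documents through the S3 API. It assumes the endpoint and credentials from the Compose file in section 2.1; the bucket name chatbot-docs, the environment variable names read by os.getenv, and the helper functions are all hypothetical.

```python
import os

# Assumptions: endpoint and credentials mirror the Compose file in section 2.1.
# The bucket name is a hypothetical example.
RUSTFS_ENDPOINT = os.getenv("RUSTFS_ENDPOINT", "http://localhost:9000")
RUSTFS_ACCESS_KEY = os.getenv("RUSTFS_ACCESS_KEY", "rustfsadmin")
RUSTFS_SECRET_KEY = os.getenv("RUSTFS_SECRET_KEY", "rustfsadmin")
DOCS_BUCKET = "chatbot-docs"


def make_s3_client():
    """Create a boto3 S3 client that points at RustFS instead of AWS."""
    import boto3  # imported lazily so the helpers below work without boto3

    return boto3.client(
        "s3",
        endpoint_url=RUSTFS_ENDPOINT,
        aws_access_key_id=RUSTFS_ACCESS_KEY,
        aws_secret_access_key=RUSTFS_SECRET_KEY,
        region_name="us-east-1",
    )


def ensure_bucket(client, bucket=DOCS_BUCKET):
    """Create the bucket if it is not already listed."""
    existing = {b["Name"] for b in client.list_buckets().get("Buckets", [])}
    if bucket not in existing:
        client.create_bucket(Bucket=bucket)


def object_key_for(file_path, root_dir):
    """Build an object key from the file path relative to root_dir."""
    return os.path.relpath(file_path, root_dir).replace(os.sep, "/")


def upload_document(client, file_path, root_dir, bucket=DOCS_BUCKET):
    """Upload one file and return the object key it was stored under."""
    key = object_key_for(file_path, root_dir)
    client.upload_file(file_path, bucket, key)
    return key


if __name__ == "__main__":
    s3 = make_s3_client()
    ensure_bucket(s3)
    print(upload_document(s3, os.path.join("docs", "intro.md"), "docs"))
```

Keeping the original files in object storage and only the chunks and vectors in Milvus means the knowledge base can be rebuilt later, for example after changing the embedding model.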

3. Building the Knowledge Base: Giving the Chatbot Long-Term Memory

3.1 Document Loading and Vectorization

First, the knowledge documents have to be converted into vectors and stored in Milvus. The following code shows how to process Markdown documents:

import os
import glob

from openai import OpenAI
from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    connections,
    utility,
)


def load_markdown_files(folder_path):
    """Load every Markdown file under folder_path (recursively)."""
    files = glob.glob(os.path.join(folder_path, "**", "*.md"), recursive=True)
    docs = []
    for file_path in files:
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
        docs.append({
            "file_path": file_path,
            "content": content,
            "file_name": os.path.basename(file_path),
        })
    return docs


def split_into_chunks(text, max_length=500):
    """Split long text into chunks suitable for embedding.

    Lines are accumulated until adding the next one would reach max_length
    characters. A single line longer than max_length becomes its own chunk.
    """
    chunks = []
    current_chunk = []
    current_length = 0
    for line in text.split("\n"):
        line_length = len(line)
        if current_length + line_length < max_length:
            current_chunk.append(line)
            current_length += line_length
        else:
            if current_chunk:
                chunks.append(" ".join(current_chunk))
            current_chunk = [line]
            current_length = line_length
    if current_chunk:
        chunks.append(" ".join(current_chunk))
    return chunks


def get_embedding(text, model="text-embedding-3-large"):
    """Embed a string or a list of strings with the OpenAI API.

    Always returns a list of vectors, one per input.
    """
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    response = client.embeddings.create(model=model, input=text)
    return [data.embedding for data in response.data]
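
The line-based splitter cuts at hard boundaries, so a sentence that straddles two chunks may lose its context. A common refinement, which the original code does not use, is a sliding window in which consecutive chunks share some characters. The sketch below is a character-based variant; the function name and the default overlap of 50 are assumptions made for illustration.

```python
def split_with_overlap(text, max_length=500, overlap=50):
    """Split text into windows of at most max_length characters.

    Each window starts (max_length - overlap) characters after the previous
    one, so neighbouring windows share `overlap` characters.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")

    step = max_length - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in split_with_overlap("abcdefghij", max_length=4, overlap=1):
        print(chunk)
```

Overlap increases the number of chunks and therefore the embedding cost, so it is worth measuring retrieval quality before adopting it.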

3.2 Creating the Milvus Collection

Define the structure of the vector database and build an index to speed up retrieval:

from pymilvus import (
    Collection,
    CollectionSchema,
    DataType,
    FieldSchema,
    connections,
    utility,
)


def create_milvus_collection():
    """Create the Milvus collection that stores the document vectors."""
    connections.connect(alias="default", host="localhost", port="19530")

    collection_name = "knowledge_base"

    # Drop the collection if it already exists (this deletes its data)
    if utility.has_collection(collection_name):
        utility.drop_collection(collection_name)

    # Field definitions
    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
        FieldSchema(name="file_name", dtype=DataType.VARCHAR, max_length=500),
        FieldSchema(name="content", dtype=DataType.VARCHAR, max_length=10000),
        # 3072 is the output dimension of text-embedding-3-large
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=3072),
    ]
    schema = CollectionSchema(fields, description="Knowledge base document collection")
    collection = Collection(name=collection_name, schema=schema)

    # Build an HNSW index to speed up retrieval
    index_params = {
        "index_type": "HNSW",
        "metric_type": "L2",
        "params": {"M": 8, "efConstruction": 64},
    }
    collection.create_index("embedding", index_params)
    return collection
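
The choice of dim=3072 has a direct storage cost. The sketch below works through the arithmetic for float32 vectors and adds a rough estimate of the HNSW neighbour links. The link estimate assumes up to 2·M four-byte links per vector on the bottom layer and ignores the upper layers and any per-vector overhead, so treat it as an order-of-magnitude figure rather than what Milvus will actually report.

```python
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def raw_vector_bytes(num_vectors, dim=3072):
    """Bytes needed to hold the float32 vectors themselves."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def hnsw_link_bytes(num_vectors, m=8, bytes_per_link=4):
    """Rough bottom-layer graph size, assuming up to 2*M links per vector."""
    return num_vectors * 2 * m * bytes_per_link


def to_gib(num_bytes):
    """Convert bytes to GiB, rounded to two decimals."""
    return round(num_bytes / GIB, 2)


if __name__ == "__main__":
    n = 100_000  # hypothetical number of chunks
    print("vectors:", to_gib(raw_vector_bytes(n)), "GiB")
    print("links:  ", to_gib(hnsw_link_bytes(n)), "GiB")
```

One vector takes 3072 × 4 = 12,288 bytes, so 100,000 chunks need about 1.14 GiB for the vectors alone. At this dimension the vectors dominate and the graph links are small by comparison.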

3.3 Knowledge Base Ingestion Pipeline

The complete flow for processing documents and storing them in the vector database:

def build_knowledge_base(docs_folder_path):
    """Full pipeline: load documents, chunk, embed, and insert into Milvus."""
    # 1. Load documents
    print("Loading Markdown documents...")
    documents = load_markdown_files(docs_folder_path)
    print(f"Loaded {len(documents)} documents")

    # 2. Create the Milvus collection
    collection = create_milvus_collection()

    all_chunks = []
    all_embeddings = []

    # 3. Split every document into chunks
    for doc in documents:
        chunks = split_into_chunks(doc["content"])
        print(f"Document {doc['file_name']} split into {len(chunks)} chunks")
        for chunk in chunks:
            if not chunk.strip():
                continue  # the embeddings API rejects empty input
            all_chunks.append({"file_name": doc["file_name"], "content": chunk})

    # 4. Generate embeddings in batches (fewer API calls)
    batch_size = 10  # small batches keep each request modest in size
    for i in range(0, len(all_chunks), batch_size):
        batch_chunks = all_chunks[i:i + batch_size]
        batch_texts = [chunk["content"] for chunk in batch_chunks]
        print(f"Embedding chunks {i + 1}-{i + len(batch_chunks)}/{len(all_chunks)}")
        embeddings = get_embedding(batch_texts)
        all_embeddings.extend(embeddings)

    # 5. Insert into Milvus, column by column (the auto-generated id is omitted)
    entities = [
        [chunk["file_name"] for chunk in all_chunks],
        [chunk["content"] for chunk in all_chunks],
        all_embeddings,
    ]
    collection.insert(entities)
    collection.flush()

    print(f"Knowledge base built: {len(all_chunks)} chunks stored")
    return collection
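
The pipeline above holds every embedding in memory and sends them to Milvus in a single insert call, which can become a problem for large document sets. The sketch below shows one alternative that embeds and inserts batch by batch. embed_fn and insert_fn are hypothetical injected callables (for example get_embedding and collection.insert), which also makes the loop easy to test without a running database.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def ingest_in_batches(chunks, embed_fn, insert_fn, batch_size=10):
    """Embed and insert chunk dicts one batch at a time.

    chunks:    list of {"file_name": ..., "content": ...} dicts
    embed_fn:  callable taking a list of texts, returning a list of vectors
    insert_fn: callable taking [file_names, contents, embeddings] columns
    Returns the number of rows handed to insert_fn.
    """
    total = 0
    for batch in iter_batches(chunks, batch_size):
        embeddings = embed_fn([chunk["content"] for chunk in batch])
        if len(embeddings) != len(batch):
            raise RuntimeError("embedding count does not match batch size")
        insert_fn([
            [chunk["file_name"] for chunk in batch],
            [chunk["content"] for chunk in batch],
            embeddings,
        ])
        total += len(batch)
    return total


if __name__ == "__main__":
    demo_chunks = [{"file_name": "a.md", "content": f"chunk {i}"} for i in range(25)]
    fake_embed = lambda texts: [[float(len(t))] for t in texts]  # stand-in embedder
    written = ingest_in_batches(demo_chunks, fake_embed, lambda cols: None)
    print(written)
```

With this shape, a failure partway through leaves the earlier batches stored, so a retry only needs to resume from the failed batch.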

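4. The RAG Engine: The Core of Intelligent Q&A

4.1 Retrieval-Augmented Generation (RAG) Architecture

A RAG system combines the strengths of a retriever and a generator, which helps keep answers grounded in retrieved knowledge rather than in the model's guesses.

import os

from openai import OpenAI
from pymilvus import Collection, connections


class RAGEngine:
    def __init__(self, milvus_host="localhost", milvus_port="19530"):
        connections.connect(alias="default", host=milvus_host, port=milvus_port)
        self.collection = Collection("knowledge_base")
        self.collection.load()

    def retrieve_similar_docs(self, query, top_k=3):
        """Retrieve the documents most relevant to the query."""
        # Convert the query into a vector
        query_embedding = get_embedding([query])[0]

        # Search Milvus for similar chunks. HNSW takes "ef" at search time
        # (it must be at least top_k); "nprobe" applies to IVF indexes.
        search_params = {"metric_type": "L2", "params": {"ef": 64}}
        results = self.collection.search(
            data=[query_embedding],
            anns_field="embedding",
            param=search_params,
            limit=top_k,
            output_fields=["file_name", "content"],
        )

        # Extract the content of the matching chunks
        relevant_docs = []
        for hits in results:
            for hit in hits:
                relevant_docs.append({
                    "file_name": hit.entity.get("file_name"),
                    "content": hit.entity.get("content"),
                    "score": hit.distance,  # L2 distance: smaller is closer
                })
        return relevant_docs

    def generate_answer(self, query, context_docs):
        """Generate an answer based on the retrieved context."""
        client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

        # Build the context
        context = "\n\n".join(doc["content"] for doc in context_docs)

        # Build the prompt
        prompt = f"""Answer the user's question based on the context below. If the context is not sufficient to answer the question, say so honestly.

Context:
{context}

User question: {query}

Provide an accurate and useful answer based on the context:"""

        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "You are a professional assistant. Answer the user's question accurately based on the provided context."},
                {"role": "user", "content": prompt},
            ],
            temperature=0.1,  # A low temperature keeps answers stable
        )
        return response.choices[0].message.content

    def query(self, question, top_k=3):
        """Full RAG query flow."""
        # 1. Retrieve relevant documents
        relevant_docs = self.retrieve_similar_docs(question, top_k)
        # 2. Generate the answer
        answer = self.generate_answer(question, relevant_docs)
        return {"answer": answer, "sources": relevant_docs}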
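
The context string is built by joining every retrieved chunk, which gives the model no way to tell chunks apart and places no bound on prompt size. One possible refinement is to label each chunk with its source and stop at a character budget. This is a sketch only: build_context and the 3000-character default are hypothetical, and a production version would more likely count tokens than characters.

```python
def build_context(docs, max_chars=3000):
    """Join retrieved chunks, labelled by source, within a character budget.

    docs is expected in relevance order. The first chunk is always kept,
    even if it alone exceeds the budget, so the context is never empty
    when at least one chunk was retrieved.
    """
    parts = []
    used = 0
    for number, doc in enumerate(docs, start=1):
        block = f"[{number}] ({doc['file_name']})\n{doc['content']}"
        if parts and used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


if __name__ == "__main__":
    retrieved = [
        {"file_name": "setup.md", "content": "Run docker compose up -d."},
        {"file_name": "faq.md", "content": "Attu listens on port 8000."},
    ]
    print(build_context(retrieved))
```

The numeric labels also let the system prompt ask the model to cite which chunk supports each statement, which pairs naturally with the sources list returned by query.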

4.2 Conversation Memory Management

To support multi-turn conversations, the chatbot needs to manage its conversation history:

class ConversationManager:
    def __init__(self, max_history=10):
        self.conversation_history = []
        self.max_history = max_history

    def add_message(self, role, content):
        """Append a message to the conversation."""
        self.conversation_history.append({"role": role, "content": content})
        # Keep at most max_history rounds (one user and one assistant message each)
        max_messages = self.max_history * 2
        if len(self.conversation_history) > max_messages:
            self.conversation_history = self.conversation_history[-max_messages:]

    def get_conversation_context(self):
        """Return a copy of the current conversation history."""
        return self.conversation_history.copy()

    def clear_history(self):
        """Clear the conversation history."""
        self.conversation_history = []
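
The trimming logic above re-slices the list whenever it grows past the limit. The standard library's collections.deque with maxlen gives the same bounded behaviour without manual slicing. The class below is a hypothetical alternative sketch, not a drop-in replacement for ConversationManager.

```python
from collections import deque


class BoundedHistory:
    """Keeps only the most recent messages, discarding the oldest first."""

    def __init__(self, max_turns=10):
        # One turn is a user message plus an assistant message.
        self._messages = deque(maxlen=max_turns * 2)

    def add_message(self, role, content):
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        self._messages.append({"role": role, "content": content})

    def as_list(self):
        """Return the retained messages, oldest first."""
        return list(self._messages)

    def clear(self):
        self._messages.clear()


if __name__ == "__main__":
    history = BoundedHistory(max_turns=1)
    history.add_message("user", "What is Milvus?")
    history.add_message("assistant", "A vector database.")
    history.add_message("user", "And RustFS?")
    print(history.as_list())
```

Either version keeps history only in process memory, so it is lost on restart. Persisting it, for example in Milvus alongside the knowledge base, is what would make the memory long-term.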

5. Backend API: A Quick FastAPI Implementation

5.1 Building an Efficient Web API

Use FastAPI to build a RESTful interface that the frontend can call:

from typing import Optional

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import uvicorn

app = FastAPI(title="Chatbot API", version="1.0.0")

# Allow cross-origin requests (restrict allow_origins in production)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)


# Data models
class ChatRequest(BaseModel):
    message: str
    conversation_id: Optional[str] = None


class ChatResponse(BaseModel):
    answer: str
    sources: list
    conversation_id: str


# Global instances
rag_engine = RAGEngine()
conversation_managers = {}  # Conversation history keyed by conversation ID


def get_conversation_manager(conversation_id):
    """Get or create the conversation manager for a conversation ID."""
    if conversation_id not in conversation_managers:
        conversation_managers[conversation_id] = ConversationManager()
    return conversation_managers[conversation_id]


def _build_contextual_query(current_query, conversation_history):
    """Combine recent conversation history with the current query."""
    # The last entry is the current user message, so look at what precedes it
    previous = conversation_history[:-1]
    if not previous:
        return current_query

    # Use the last two rounds (user + assistant) as background
    recent_history = previous[-4:]
    context = "Previous conversation:"
    for msg in recent_history:
        role = "User" if msg["role"] == "user" else "Assistant"
        context += f"\n{role}: {msg['content']}"
    return f"{context}\n\nGiven the conversation above, the current question is: {current_query}"


@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    try:
        conversation_id = request.conversation_id or "default"
        conversation_mgr = get_conversation_manager(conversation_id)

        # Add the user message to the history
        conversation_mgr.add_message("user", request.message)

        # Use the conversation context to enrich retrieval
        conversation_context = conversation_mgr.get_conversation_context()
        contextual_query = _build_contextual_query(request.message, conversation_context)

        # Get the answer from the RAG engine
        result = rag_engine.query(contextual_query)

        # Add the assistant's answer to the history
        conversation_mgr.add_message("assistant", result["answer"])

        return ChatResponse(
            answer=result["answer"],
            sources=result["sources"],
            conversation_id=conversation_id,
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/health")
async def health_check():
    return {"status": "healthy", "service": "chatbot-api"}


if __name__ == "__main__":
    # Note: Attu in the Compose file also binds host port 8000; change one
    # of the two if both run on the same host.
    uvicorn.run(app, host="0.0.0.0", port=8000)
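
Before wiring up a frontend, the /chat endpoint can be exercised from a short script. The sketch below uses only the standard library and assumes the API is reachable at http://localhost:8000, as in the uvicorn call above. build_chat_request and post_chat are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(message, conversation_id="default",
                       base_url="http://localhost:8000"):
    """Build the POST request for /chat without sending it."""
    payload = {"message": message, "conversation_id": conversation_id}
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_chat(message, conversation_id="default",
              base_url="http://localhost:8000", timeout=60):
    """Send a chat message and return the decoded JSON response."""
    request = build_chat_request(message, conversation_id, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    reply = post_chat("How do I start the services?")
    print(reply["answer"])
    for source in reply["sources"]:
        print("-", source["file_name"])
```

Reusing the same conversation_id across calls is what lets the server attach earlier turns to later questions. A non-2xx response, such as the 500 raised by the endpoint on failure, surfaces here as urllib.error.HTTPError.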

6. Frontend: A Modern Chat Interface with Next.js

6.1 Rapid Frontend Development with Vibe Coding

Vibe Coding emphasizes rapid prototyping. The key code is:

import React, { useState, useRef, useEffect } from 'react';

export default function ChatbotInterface() {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);
  const messagesEndRef = useRef(null);

  // Keep the newest message in view
  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  };

  useEffect(scrollToBottom, [messages]);

  const sendMessage = async () => {
    const text = input.trim();
    if (!text || loading) return;

    const userMessage = { role: 'user', content: text };
    setMessages(prev => [...prev, userMessage]);
    setInput('');
    setLoading(true);

    try {
      const response = await fetch('http://localhost:8000/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          message: text,
          conversation_id: 'default'
        }),
      });
      // fetch does not reject on HTTP errors, so check the status explicitly
      if (!response.ok) throw new Error(`HTTP ${response.status}`);

      const data = await response.json();
      const botMessage = {
        role: 'assistant',
        content: data.answer,
        sources: data.sources
      };
      setMessages(prev => [...prev, botMessage]);
    } catch (error) {
      console.error('Error:', error);
      const errorMessage = {
        role: 'assistant',
        content: 'Sorry, I cannot answer right now. Please try again later.'
      };
      setMessages(prev => [...prev, errorMessage]);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="chatbot-container">
      <div className="messages-container">
        {messages.map((msg, index) => (
          <div key={index} className={`message ${msg.role}`}>
            <div className="message-content">
              {msg.content}
              {msg.sources && (
                <div className="sources">
                  <small>Sources: {msg.sources.map(s => s.file_name).join(', ')}</small>
                </div>
              )}
            </div>
          </div>
        ))}
        {loading && (
          <div className="message assistant">
            <div className="message-content">
              <div className="typing-indicator">
                <span></span><span></span><span></span>
              </div>
            </div>
          </div>
        )}
        <div ref={messagesEndRef} />
      </div>
      <div className="input-container">
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && sendMessage()}
          placeholder="Type your question..."
          disabled={loading}
        />
        <button onClick={sendMessage} disabled={loading}>Send</button>
      </div>
    </div>
  );
}

6.2 Modern Styling

Use CSS-in-JS (styled-jsx, which ships with Next.js) to style the chat interface. Place this block inside the JSX returned by the component:

<style jsx>{`
  .chatbot-container {
    max-width: 800px;
    margin: 0 auto;
    height: 100vh;
    display: flex;
    flex-direction: column;
    background: #f5f5f5;
  }
  .messages-container {
    flex: 1;
    overflow-y: auto;
    padding: 20px;
  }
  .message {
    margin: 10px 0;
    display: flex;
  }
  .message.user {
    justify-content: flex-end;
  }
  .message.assistant {
    justify-content: flex-start;
  }
  .message-content {
    max-width: 70%;
    padding: 12px 16px;
    border-radius: 18px;
    word-wrap: break-word;
  }
  .message.user .message-content {
    background: #007aff;
    color: white;
  }
  .message.assistant .message-content {
    background: white;
    color: #333;
    border: 1px solid #ddd;
  }
  .input-container {
    display: flex;
    padding: 20px;
    background: white;
    border-top: 1px solid #ddd;
  }
  .input-container input {
    flex: 1;
    padding: 12px;
    border: 1px solid #ddd;
    border-radius: 24px;
    margin-right: 10px;
  }
  .input-container button {
    padding: 12px 24px;
    background: #007aff;
    color: white;
    border: none;
    border-radius: 24px;
    cursor: pointer;
  }
  .typing-indicator {
    display: flex;
    align-items: center;
  }
  .typing-indicator span {
    height: 8px;
    width: 8px;
    background: #999;
    border-radius: 50%;
    margin: 0 2px;
    animation: bounce 1.3s infinite ease-in-out;
  }
  .typing-indicator span:nth-child(1) { animation-delay: -0.32s; }
  .typing-indicator span:nth-child(2) { animation-delay: -0.16s; }
  @keyframes bounce {
    0%, 80%, 100% { transform: scale(0); }
    40% { transform: scale(1); }
  }
`}</style>