Agent Workflow: Selfie Processing and Six-Panel Grid Image Generation
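Overview
This article walks through a complete agent workflow that processes a user's selfie and produces a six-panel grid image. The system consists of the following functional modules:
- Selfie upload and preprocessing
- Matching the selfie against a reference image library (similarity of at least 80%) and generating prompt keywords from the best match
- Generating 6 new images that share one style but differ in pose, angle, or action
- Compositing the generated images into a six-panel grid
- Saving the results to a designated folder
Python is the main development language, combined with several computer vision and deep learning libraries.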
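Table of Contents
- System Architecture Design
- Environment Setup and Dependencies
- Selfie Upload and Preprocessing Module
- Image Feature Extraction and Similarity Calculation
- Prompt Generation Module
- Diverse Image Generation Module
- Six-Panel Grid Compositing Module
- End-to-End Workflow Integration
- Performance Optimization and Error Handling
- Testing and Validation
- Deployment
- Summary and Outlook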
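1. System Architecture Design
The system is modular: each functional module is self-contained, and the modules cooperate through a layered architecture, shown below: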
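User interface layer
        │
        ▼
API layer (RESTful API)
        │
        ▼
Business logic layer
├── Image upload and preprocessing module
├── Feature extraction and similarity calculation module
├── Prompt generation module
├── Diverse image generation module
└── Six-panel grid compositing module
        │
        ▼
Data storage layer
├── Original image storage
├── Intermediate result storage
└── Final result storage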
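System workflow:
- The user uploads a selfie through the web interface
- The system preprocesses the photo and assesses its quality
- Image features are extracted and compared against the reference image library
- A prompt is generated from the best-matching reference
- The prompt is used to generate 6 images that share one style but differ from each other
- The 6 images are composited into a six-panel grid
- The result is saved and a download link is returned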
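The steps above form a linear pipeline in which any stage can fail. The following minimal sketch shows one way to chain such stages using the `(result, error)` return convention that the modules in this article follow. The stage functions here (`preprocess`, `match_reference`, `build_prompt`) are hypothetical stand-ins, not the real modules.

```python
from typing import Any, Callable, List, Optional, Tuple

# Each stage takes the current payload and returns (new_payload, error).
Stage = Callable[[Any], Tuple[Any, Optional[str]]]


def run_pipeline(stages: List[Tuple[str, Stage]], payload: Any) -> Tuple[Any, Optional[str]]:
    """Run stages in order and stop at the first stage that reports an error."""
    for name, stage in stages:
        payload, error = stage(payload)
        if error is not None:
            return None, f"{name} failed: {error}"
    return payload, None


# Hypothetical stand-ins for the real modules, for illustration only.
def preprocess(path: str):
    return {"path": path, "quality": 85}, None


def match_reference(state: dict):
    if state["quality"] < 50:
        return None, "image quality too low"
    return {**state, "reference_id": "ref_123", "similarity": 0.85}, None


def build_prompt(state: dict):
    return {**state, "prompt": "A professional portrait"}, None


STAGES = [
    ("preprocess", preprocess),
    ("match_reference", match_reference),
    ("build_prompt", build_prompt),
]

result, error = run_pipeline(STAGES, "uploads/selfie.jpg")
print(result, error)
```

Stopping at the first failure keeps the expensive image generation step from running on input that an earlier stage has already rejected.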
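2. Environment Setup and Dependencies
2.1 Required Python Libraries
# Core: Python >= 3.8 (interpreter version, not a pip package)

# Image processing
opencv-python >= 4.5.0
Pillow >= 9.1.0  # Image.Resampling, used by the code in this article, was added in 9.1.0
scikit-image >= 0.18.0

# Deep learning frameworks
torch >= 1.8.0
torchvision >= 0.9.0
transformers >= 4.0.0
diffusers >= 0.10.0

# Feature extraction and similarity
face-recognition >= 1.3.0
scikit-learn >= 0.24.0

# Utilities
numpy >= 1.20.0
matplotlib >= 3.3.0
seaborn >= 0.11.0

# Web framework
flask >= 2.0.0
flask-restful >= 0.3.0

# File handling
werkzeug >= 2.0.0
python-multipart >= 0.0.5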
2.2 Environment Installation Script
#!/bin/bash
# Create and activate a virtual environment
python -m venv photo_agent_env
source photo_agent_env/bin/activate

# Install base dependencies
pip install --upgrade pip
pip install opencv-python Pillow scikit-image
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113  # CUDA 11.3
pip install transformers diffusers
# face-recognition depends on dlib, which is compiled at install time (needs CMake and a C++ compiler)
pip install face-recognition scikit-learn
pip install numpy matplotlib seaborn
pip install flask flask-restful werkzeug python-multipart

# Create the required directory structure
mkdir -p uploads processed references generated grids logs
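After running the installation script, it can help to confirm which of the listed packages are actually present. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8) and reports installed versions; it does not compare them against the minimum versions above, and the `REQUIRED` list is simply copied from section 2.1.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names as listed in section 2.1.
REQUIRED = [
    "opencv-python", "Pillow", "scikit-image", "torch", "torchvision",
    "transformers", "diffusers", "face-recognition", "scikit-learn",
    "numpy", "flask", "flask-restful",
]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:20s} {version or 'MISSING'}")
```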
3. Selfie Upload and Preprocessing Module
3.1 Photo Upload API
from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename
import os
import uuid
from datetime import datetime

app = Flask(__name__)

# Configuration
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16 MB upload limit
app.config['UPLOAD_FOLDER'] = 'uploads/'
app.config['ALLOWED_EXTENSIONS'] = {'png', 'jpg', 'jpeg', 'bmp', 'tiff'}


def allowed_file(filename):
    """Return True if the filename carries an allowed extension."""
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in app.config['ALLOWED_EXTENSIONS']


@app.route('/upload', methods=['POST'])
def upload_file():
    """API endpoint for uploading a user's selfie."""
    # Check that the request contains a file part
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400
    file = request.files['file']

    # Check that a file was actually selected
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400

    # Check the file type
    if file and allowed_file(file.filename):
        # Build a unique filename: UUID + timestamp + sanitized original name
        filename = secure_filename(file.filename)
        unique_filename = f"{uuid.uuid4().hex}_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}"
        filepath = os.path.join(app.config['UPLOAD_FOLDER'], unique_filename)

        # Make sure the target folder exists, then save the file
        os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
        file.save(filepath)

        # Return the file information
        return jsonify({
            'message': 'File uploaded successfully',
            'filename': unique_filename,
            'filepath': filepath
        }), 200
    else:
        return jsonify({'error': 'File type not allowed'}), 400
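The extension check in `allowed_file` only looks at the filename, which the client controls. As a complementary check, the sketch below inspects the first bytes of the upload for the well-known signatures of the accepted formats. The helper name `sniff_image_type` is an assumption introduced for illustration and is not part of the original code.

```python
from typing import Optional

# (prefix, label) pairs for the formats accepted by ALLOWED_EXTENSIONS.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
]


def sniff_image_type(header: bytes) -> Optional[str]:
    """Return the image type suggested by the leading bytes, or None if unknown."""
    for prefix, label in _SIGNATURES:
        if header.startswith(prefix):
            return label
    return None


print(sniff_image_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(sniff_image_type(b"#!/bin/sh\n"))
```

Inside `upload_file`, one possible use is to read a few bytes from `file.stream`, call `sniff_image_type`, rewind with `file.stream.seek(0)`, and reject the upload when the result is `None`. A signature match is still only a hint; fully decoding the image, as the preprocessing step does, remains the stronger validation.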
3.2 Image Preprocessing Class
import cv2
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter
import logging


class ImagePreprocessor:
    """Checks the quality of an uploaded image and preprocesses it."""

    def __init__(self):
        self.logger = logging.getLogger(__name__)

    def load_image(self, image_path):
        """Load an image file; return None on failure."""
        try:
            return Image.open(image_path)
        except Exception as e:
            self.logger.error(f"Error loading image: {e}")
            return None

    def check_image_quality(self, image):
        """Return a quality score (0-100) and a list of detected issues."""
        quality_score = 100  # Start from a perfect score and subtract penalties
        issues = []

        # Convert to OpenCV's BGR layout
        cv_image = np.array(image.convert('RGB'))
        cv_image = cv_image[:, :, ::-1].copy()  # RGB to BGR

        # Check image dimensions
        height, width = cv_image.shape[:2]
        if height < 300 or width < 300:
            quality_score -= 30
            issues.append("Image dimensions too small")

        # Check blur: variance of the Laplacian, computed on the grayscale image
        gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
        blur_value = cv2.Laplacian(gray, cv2.CV_64F).var()
        if blur_value < 100:
            quality_score -= 30
            issues.append("Image is too blurry")

        # Check brightness: mean of the V channel in HSV
        hsv = cv2.cvtColor(cv_image, cv2.COLOR_BGR2HSV)
        brightness = np.mean(hsv[:, :, 2])
        if brightness < 50:
            quality_score -= 20
            issues.append("Image is too dark")
        elif brightness > 200:
            quality_score -= 20
            issues.append("Image is overexposed")

        # Check contrast: standard deviation of pixel values
        contrast = np.std(cv_image)
        if contrast < 40:
            quality_score -= 20
            issues.append("Low contrast")

        return max(quality_score, 0), issues

    def preprocess_image(self, image_path, output_path=None):
        """Resize and, if needed, enhance the image.

        Always returns a 3-tuple (image, quality_score, issues) so that callers
        can unpack the result; image is None when loading fails.
        """
        image = self.load_image(image_path)
        if image is None:
            return None, 0, ["Cannot load image"]

        # JPEG cannot store alpha or palette modes, so normalize to RGB first
        image = image.convert('RGB')

        # Check quality
        quality_score, issues = self.check_image_quality(image)
        self.logger.info(f"Image quality score: {quality_score}, Issues: {issues}")

        # Try to enhance images that fall below the quality threshold
        if quality_score < 70:
            self.logger.info("Attempting to enhance low quality image")
            image = self.enhance_image(image)

        # Resize in place, preserving the aspect ratio
        max_size = (1024, 1024)
        image.thumbnail(max_size, Image.Resampling.LANCZOS)

        # Save the processed image
        if output_path:
            image.save(output_path, quality=95)
        return image, quality_score, issues

    def enhance_image(self, image):
        """Apply mild contrast, sharpness, and denoising adjustments."""
        image = ImageEnhance.Contrast(image).enhance(1.2)
        image = ImageEnhance.Sharpness(image).enhance(1.1)
        # Light denoising
        image = image.filter(ImageFilter.MedianFilter(3))
        return image
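To make the scoring arithmetic in `check_image_quality` easier to follow, the sketch below restates the same thresholds and penalties as a pure function over already-measured values. The function name `score_quality` and its parameters are introduced here for illustration; the measurements themselves (Laplacian variance, mean V channel, pixel standard deviation) would still come from OpenCV as in the class above.

```python
from typing import List, Tuple


def score_quality(width: int, height: int, blur: float,
                  brightness: float, contrast: float) -> Tuple[int, List[str]]:
    """Apply the penalty table from check_image_quality to measured values."""
    score = 100
    issues: List[str] = []
    if height < 300 or width < 300:
        score -= 30
        issues.append("Image dimensions too small")
    if blur < 100:
        score -= 30
        issues.append("Image is too blurry")
    if brightness < 50:
        score -= 20
        issues.append("Image is too dark")
    elif brightness > 200:
        score -= 20
        issues.append("Image is overexposed")
    if contrast < 40:
        score -= 20
        issues.append("Low contrast")
    return max(score, 0), issues


# A sharp, well-lit photo keeps the full score.
print(score_quality(1024, 768, blur=250.0, brightness=120.0, contrast=55.0))
# A blurry, dark photo loses 30 + 20 points and falls below the enhancement threshold of 70.
print(score_quality(640, 480, blur=60.0, brightness=40.0, contrast=55.0))
```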
4. Image Feature Extraction and Similarity Calculation
4.1 Feature Extraction Module
import cv2
import face_recognition
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


class FeatureExtractor:
    """Extracts face and style features from images."""

    def __init__(self):
        self.reference_features = self.load_reference_features()

    def load_reference_features(self):
        """Load precomputed features of the reference images."""
        # Placeholder: a real system would load these from a database or file
        # and would need full reference library management.
        return {}

    def extract_face_features(self, image_path):
        """Extract a face encoding; returns (encoding, error)."""
        try:
            image = face_recognition.load_image_file(image_path)

            # Detect faces
            face_locations = face_recognition.face_locations(image)
            if len(face_locations) == 0:
                return None, "No face detected"

            # Encode the faces and use the first detected one
            face_encodings = face_recognition.face_encodings(image, face_locations)
            return face_encodings[0], None
        except Exception as e:
            return None, f"Error extracting face features: {e}"

    def extract_style_features(self, image_path):
        """Extract style features (HSV color histograms); returns (features, error)."""
        try:
            image = cv2.imread(image_path)
            if image is None:
                return None, "Cannot read image"

            # Convert to the HSV color space
            hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

            # Compute a 50-bin histogram per channel
            hist_h = cv2.calcHist([hsv], [0], None, [50], [0, 180])
            hist_s = cv2.calcHist([hsv], [1], None, [50], [0, 256])
            hist_v = cv2.calcHist([hsv], [2], None, [50], [0, 256])

            # Normalize the histograms to [0, 1]
            cv2.normalize(hist_h, hist_h, 0, 1, cv2.NORM_MINMAX)
            cv2.normalize(hist_s, hist_s, 0, 1, cv2.NORM_MINMAX)
            cv2.normalize(hist_v, hist_v, 0, 1, cv2.NORM_MINMAX)

            # Concatenate into a single 150-dimensional vector
            style_features = np.concatenate([hist_h.flatten(), hist_s.flatten(), hist_v.flatten()])
            return style_features, None
        except Exception as e:
            return None, f"Error extracting style features: {e}"

    def calculate_similarity(self, features1, features2, feature_type="face"):
        """Return (similarity in [0, 1], error) for two feature vectors."""
        if features1 is None or features2 is None:
            return 0, "Invalid features"
        try:
            if feature_type == "face":
                # Face encodings: 1 minus the Euclidean distance (not cosine similarity)
                similarity = 1 - np.linalg.norm(features1 - features2)
            else:
                # Style features: cosine similarity
                similarity = cosine_similarity(features1.reshape(1, -1), features2.reshape(1, -1))[0][0]
            return max(0, min(1, similarity)), None
        except Exception as e:
            return 0, f"Error calculating similarity: {e}"

    def find_most_similar_reference(self, user_image_path, min_similarity=0.8):
        """Find the reference image most similar to the user image.

        Returns (reference_id, similarity) on success, or (None, message) on failure.
        """
        # Extract features from the user image
        face_features, face_error = self.extract_face_features(user_image_path)
        style_features, style_error = self.extract_style_features(user_image_path)
        if face_features is None and style_features is None:
            return None, "Cannot extract any features from user image"

        weights = {"face": 0.7, "style": 0.3}  # Weighting of the two similarities
        best_match = None
        best_similarity = 0

        # Iterate over the reference library
        for ref_id, ref_data in self.reference_features.items():
            face_sim = 0
            style_sim = 0

            # Face similarity, if both sides have a face encoding
            if face_features is not None and "face" in ref_data:
                face_sim, _ = self.calculate_similarity(face_features, ref_data["face"], "face")

            # Style similarity, if both sides have style features
            if style_features is not None and "style" in ref_data:
                style_sim, _ = self.calculate_similarity(style_features, ref_data["style"], "style")

            # Combined similarity
            if face_features is not None and style_features is not None:
                combined_sim = weights["face"] * face_sim + weights["style"] * style_sim
            elif face_features is not None:
                combined_sim = face_sim
            else:
                combined_sim = style_sim

            # Track the best match
            if combined_sim > best_similarity:
                best_similarity = combined_sim
                best_match = ref_id

        # Enforce the minimum similarity requirement
        if best_similarity < min_similarity:
            return None, f"No reference image found with similarity >= {min_similarity}. Best was {best_similarity:.2f}"
        return best_match, best_similarity
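The weighted combination is easier to reason about with concrete numbers. The sketch below reproduces the arithmetic in plain Python with toy two-dimensional vectors (real face encodings from `face_recognition` have 128 dimensions); the helper names are introduced here for illustration only.

```python
import math
from typing import Sequence


def face_similarity(enc_a: Sequence[float], enc_b: Sequence[float]) -> float:
    """1 minus the Euclidean distance, clamped to [0, 1]."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(enc_a, enc_b)))
    return max(0.0, min(1.0, 1.0 - distance))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_similarity(face_sim: float, style_sim: float,
                        w_face: float = 0.7, w_style: float = 0.3) -> float:
    """Weighted sum used when both feature types are available."""
    return w_face * face_sim + w_style * style_sim


# Toy vectors: the face encodings are 0.5 apart, the style vectors are identical.
face_sim = face_similarity([0.0, 0.0], [0.3, 0.4])
style_sim = cosine([1.0, 0.0], [1.0, 0.0])
total = combined_similarity(face_sim, style_sim)
print(f"face={face_sim:.2f} style={style_sim:.2f} combined={total:.2f}")
print("passes 0.8 threshold:", total >= 0.8)
```

With these weights, even a perfect style match requires a face distance of roughly 0.29 or less to reach 0.8. For comparison, `face_recognition.compare_faces` uses a default tolerance of 0.6, so the 0.8 threshold is considerably stricter than "same person" and may need tuning against a real reference library.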
5. Prompt Generation Module
5.1 Prompt Generator
import json
import random


class PromptGenerator:
    """Builds a text prompt from the matched reference image."""

    def __init__(self, reference_db_path="reference_database.json"):
        self.reference_db = self.load_reference_database(reference_db_path)
        self.style_keywords = self.load_style_keywords()

    def load_reference_database(self, db_path):
        """Load the reference image database; return {} if it is unavailable."""
        try:
            with open(db_path, 'r', encoding='utf-8') as f:
                return json.load(f)
        except FileNotFoundError:
            return {}
        except Exception as e:
            print(f"Error loading reference database: {e}")
            return {}

    def load_style_keywords(self):
        """Load the style keyword library."""
        # Could be extended to load a richer library from a file or an API
        return {
            "photography_styles": [
                "portrait photography", "fashion photography", "editorial photography",
                "fine art photography", "documentary photography", "street photography"
            ],
            "lighting_styles": [
                "natural lighting", "studio lighting", "soft lighting", "dramatic lighting",
                "golden hour", "blue hour", "rim lighting", "backlighting"
            ],
            "composition_styles": [
                "close-up", "medium shot", "full body", "head and shoulders",
                "rule of thirds", "centered composition", "dynamic angle"
            ],
            "mood_keywords": [
                "serene", "joyful", "mysterious", "confident", "thoughtful",
                "powerful", "elegant", "playful", "romantic"
            ],
            "detail_keywords": [
                "sharp focus", "bokeh", "high detail", "cinematic", "film grain",
                "vintage", "modern", "minimalist", "textured"
            ]
        }

    def generate_prompt(self, reference_id, user_features=None, similarity=0.8):
        """Generate a prompt from a reference image; returns (prompt, error)."""
        if reference_id not in self.reference_db:
            return "A professional portrait photography", "Reference not found"
        ref_data = self.reference_db[reference_id]

        # Base prompt
        base_prompt = ref_data.get("base_prompt", "A professional portrait")

        # Scale by similarity: 0.8 or higher maps to 1.0, lower values scale down linearly
        similarity_factor = min(1.0, similarity / 0.8)

        # Style and detail elements
        style_prompt = self._add_style_elements(ref_data, similarity_factor)
        detail_prompt = self._add_detail_elements(similarity_factor)

        # Combine and clean up (either part may be empty)
        full_prompt = f"{base_prompt}, {style_prompt}, {detail_prompt}"
        full_prompt = self._clean_prompt(full_prompt)
        return full_prompt, None

    def _add_style_elements(self, ref_data, similarity_factor):
        """Collect style elements for the prompt."""
        styles = []

        # Style taken from the reference data
        if "style" in ref_data:
            styles.append(ref_data["style"])

        # Random style elements for diversity; a higher similarity factor
        # raises the probability that each keyword category is added
        if random.random() < 0.7 * similarity_factor:
            styles.append(random.choice(self.style_keywords["photography_styles"]))
        if random.random() < 0.6 * similarity_factor:
            styles.append(random.choice(self.style_keywords["lighting_styles"]))
        if random.random() < 0.5 * similarity_factor:
            styles.append(random.choice(self.style_keywords["composition_styles"]))
        if random.random() < 0.4 * similarity_factor:
            styles.append(random.choice(self.style_keywords["mood_keywords"]))
        return ", ".join(styles)

    def _add_detail_elements(self, similarity_factor):
        """Collect detail keywords for the prompt."""
        details = []
        # The similarity factor controls how many detail slots are tried
        num_details = int(3 * similarity_factor) + 1
        for _ in range(num_details):
            if random.random() < 0.8:
                details.append(random.choice(self.style_keywords["detail_keywords"]))
        return ", ".join(details)

    def _clean_prompt(self, prompt):
        """Collapse whitespace and drop empty comma-separated segments."""
        prompt = " ".join(prompt.split())
        # Splitting on commas also handles "a, , b", which a plain ",," replace would miss
        parts = [part.strip() for part in prompt.split(",")]
        return ", ".join(part for part in parts if part)
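The article does not show the contents of `reference_database.json`. Judging from the keys that `generate_prompt` and `_add_style_elements` read (`base_prompt` and `style`), a plausible layout is sketched below; the entry ID and prompt text are hypothetical, and `validate_reference_db` is a helper introduced here for illustration.

```python
import json
from typing import List

# Hypothetical entry; only "base_prompt" and "style" are read by PromptGenerator.
reference_db = {
    "ref_123": {
        "base_prompt": "A professional portrait of a young woman",
        "style": "soft natural lighting, shallow depth of field",
    }
}


def validate_reference_db(db: dict) -> List[str]:
    """Return a list of problems found in the database (empty if it looks usable)."""
    problems: List[str] = []
    for ref_id, entry in db.items():
        if not isinstance(entry, dict):
            problems.append(f"{ref_id}: entry must be an object")
            continue
        base = entry.get("base_prompt")
        if not isinstance(base, str) or not base.strip():
            problems.append(f"{ref_id}: missing base_prompt")
        if "style" in entry and not isinstance(entry["style"], str):
            problems.append(f"{ref_id}: style must be a string")
    return problems


# Round-trip through JSON to mimic what load_reference_database would read.
loaded = json.loads(json.dumps(reference_db, ensure_ascii=False))
print(validate_reference_db(loaded))
```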
6. Diverse Image Generation Module
6.1 Diffusion-Based Image Generation
import random
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler


class ImageGenerator:
    """Generates diverse images with a diffusion model."""

    def __init__(self, model_id="stabilityai/stable-diffusion-2-1"):
        self.model_id = model_id
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = None
        self.load_model()

    def load_model(self):
        """Load the pretrained diffusion model."""
        try:
            print(f"Loading model {self.model_id} on {self.device}...")
            self.pipeline = StableDiffusionPipeline.from_pretrained(
                self.model_id,
                torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
                safety_checker=None,  # No safety checker is loaded; reconsider for public deployments
            )
            # DPM-Solver reaches good quality in fewer sampling steps
            self.pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
                self.pipeline.scheduler.config)
            if self.device == "cuda":
                self.pipeline = self.pipeline.to("cuda")
            # Attention slicing reduces peak memory usage
            self.pipeline.enable_attention_slicing()
            print("Model loaded successfully")
        except Exception as e:
            print(f"Error loading model: {e}")
            self.pipeline = None

    def generate_images(self, prompt, num_images=6, guidance_scale=7.5,
                        num_inference_steps=25, seed=None, variations=None):
        """Generate num_images images; always returns (images, seeds, error)."""
        if self.pipeline is None:
            return None, None, "Model not loaded"
        try:
            all_images = []
            all_seeds = []
            for i in range(num_images):
                generator = torch.Generator(device=self.device)
                if seed is not None:
                    # Deterministic per-image seeds derived from the base seed
                    generator.manual_seed(seed + i * 1000)
                else:
                    # A fresh Generator starts from a fixed default seed, so draw a random one
                    generator.seed()

                # The first image uses the prompt as is; variants may get a small modifier
                image_prompt = prompt if i == 0 else self._modify_prompt(prompt, i)

                # The pipeline already runs in the dtype chosen at load time,
                # so no torch.autocast wrapper is needed
                result = self.pipeline(
                    image_prompt,
                    guidance_scale=guidance_scale,
                    num_inference_steps=num_inference_steps,
                    generator=generator,
                    num_images_per_prompt=1
                )
                all_images.append(result.images[0])
                all_seeds.append(generator.initial_seed())
            return all_images, all_seeds, None
        except Exception as e:
            return None, None, f"Error generating images: {e}"

    def _modify_prompt(self, prompt, variation_index):
        """Slightly modify the prompt to diversify pose, angle, or action."""
        variations = [
            "slightly different pose", "different angle", "slight head turn",
            "subtle expression change", "slightly different lighting", "minor composition variation"
        ]
        # Cycle through the variations
        variation = variations[variation_index % len(variations)]
        # Apply the variation with 70% probability
        if random.random() < 0.7:
            return f"{prompt}, {variation}"
        return prompt

    def ensure_consistency(self, images, prompt, threshold=0.8):
        """Placeholder for a style consistency check across generated images."""
        # A real implementation could compare extracted features between images.
        # For now every image is accepted.
        return images, [True] * len(images), "Consistency check not implemented"
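Reproducibility of the six images depends on how per-image seeds and prompt variants are derived. The sketch below isolates that logic without loading any model. Two details are assumptions made for illustration: the fallback base seed drawn from `random.SystemRandom` when none is given, and the deterministic variant selection (the class above applies each variation only with 70% probability).

```python
import random
from typing import List, Optional

# The same variation phrases that _modify_prompt cycles through.
VARIATIONS = [
    "slightly different pose", "different angle", "slight head turn",
    "subtle expression change", "slightly different lighting", "minor composition variation",
]


def derive_seeds(base_seed: Optional[int], num_images: int = 6) -> List[int]:
    """Derive one seed per image as base_seed + i * 1000."""
    if base_seed is None:
        # Assumption: pick a random base so the run can still be replayed from the returned seeds
        base_seed = random.SystemRandom().randrange(2 ** 32)
    return [base_seed + i * 1000 for i in range(num_images)]


def variant_prompt(prompt: str, index: int) -> str:
    """Return the prompt for image `index`; image 0 keeps the original prompt."""
    if index == 0:
        return prompt
    return f"{prompt}, {VARIATIONS[index % len(VARIATIONS)]}"


seeds = derive_seeds(42)
for i, s in enumerate(seeds):
    print(s, "|", variant_prompt("A professional portrait", i))
```

Storing the derived seeds together with the exact prompt variants in the task metadata makes it possible to regenerate any single panel later.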
7. Six-Panel Grid Compositing Module
7.1 Grid Compositor
from PIL import Image, ImageDraw, ImageFont


class GridCompositor:
    """Composites 6 images into a six-panel grid layout."""

    def __init__(self, grid_size=(3, 2), output_size=(1200, 800)):
        self.grid_size = grid_size  # (cols, rows)
        self.output_size = output_size
        self.cell_padding = 10  # Spacing between cells and around the border

    def create_grid(self, images, background_color=(255, 255, 255)):
        """Create the grid image; returns (image, error)."""
        if len(images) != 6:
            return None, f"Expected 6 images, got {len(images)}"
        try:
            total_width, total_height = self.output_size

            # Subtract the outer margins and the gaps between cells
            available_width = total_width - (self.grid_size[0] + 1) * self.cell_padding
            available_height = total_height - (self.grid_size[1] + 1) * self.cell_padding
            cell_width = available_width // self.grid_size[0]
            cell_height = available_height // self.grid_size[1]

            # Create the canvas
            grid_image = Image.new('RGB', self.output_size, color=background_color)

            # Place each image in its cell, row by row
            for i, img in enumerate(images):
                img = self._resize_image(img, (cell_width, cell_height))
                row = i // self.grid_size[0]
                col = i % self.grid_size[0]
                x = self.cell_padding + col * (cell_width + self.cell_padding)
                y = self.cell_padding + row * (cell_height + self.cell_padding)
                grid_image.paste(img, (x, y))
            return grid_image, None
        except Exception as e:
            return None, f"Error creating grid: {e}"

    def _resize_image(self, image, target_size):
        """Fit the image into target_size, preserving aspect ratio and centering it."""
        original_width, original_height = image.size
        target_width, target_height = target_size

        # Scale factor that keeps the whole image inside the cell
        ratio = min(target_width / original_width, target_height / original_height)
        new_width = int(original_width * ratio)
        new_height = int(original_height * ratio)
        resized_image = image.resize((new_width, new_height), Image.Resampling.LANCZOS)

        # Center the resized image on a white cell-sized canvas
        result_image = Image.new('RGB', target_size, (255, 255, 255))
        x = (target_width - new_width) // 2
        y = (target_height - new_height) // 2
        result_image.paste(resized_image, (x, y))
        return result_image

    def add_border_and_text(self, grid_image, title=None, border_width=5, border_color=(0, 0, 0)):
        """Add a border and an optional title; returns (image, error)."""
        try:
            width, height = grid_image.size
            bordered_image = Image.new(
                'RGB', (width + 2 * border_width, height + 2 * border_width), border_color)
            bordered_image.paste(grid_image, (border_width, border_width))

            if title:
                bordered_image = self._add_title(bordered_image, title)
            return bordered_image, None
        except Exception as e:
            return None, f"Error adding border and text: {e}"

    def _add_title(self, image, title):
        """Add a title strip above the image."""
        try:
            # Create a taller canvas to hold the title
            width, height = image.size
            title_height = 50
            titled_image = Image.new('RGB', (width, height + title_height), (255, 255, 255))
            titled_image.paste(image, (0, title_height))

            draw = ImageDraw.Draw(titled_image)
            # Try to load a TrueType font; fall back to the default font
            try:
                font = ImageFont.truetype("arial.ttf", 30)
            except IOError:
                font = ImageFont.load_default()

            # Center the text horizontally and vertically within the title strip
            text_width = draw.textlength(title, font=font)
            text_x = (width - text_width) // 2
            text_y = (title_height - 30) // 2
            draw.text((text_x, text_y), title, fill=(0, 0, 0), font=font)
            return titled_image
        except Exception as e:
            print(f"Error adding title: {e}")
            return image
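The cell geometry in `create_grid` can be checked by hand. The sketch below repeats the same integer arithmetic for the default settings (1200x800 canvas, 3 columns by 2 rows, 10 px padding) and returns the bounding box of every cell; `cell_boxes` is a helper name introduced here for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def cell_boxes(output_size: Tuple[int, int] = (1200, 800),
               grid_size: Tuple[int, int] = (3, 2),
               padding: int = 10) -> List[Box]:
    """Compute the cell rectangles with the same arithmetic as create_grid."""
    cols, rows = grid_size
    cell_w = (output_size[0] - (cols + 1) * padding) // cols
    cell_h = (output_size[1] - (rows + 1) * padding) // rows
    boxes: List[Box] = []
    for i in range(cols * rows):
        row, col = divmod(i, cols)
        x = padding + col * (cell_w + padding)
        y = padding + row * (cell_h + padding)
        boxes.append((x, y, x + cell_w, y + cell_h))
    return boxes


boxes = cell_boxes()
for box in boxes:
    print(box)
```

With the defaults, each cell is 386x385 px: (1200 - 4 * 10) // 3 = 386 and (800 - 3 * 10) // 2 = 385. Because of the integer division, the right margin ends up 2 px wider than the left one, which is usually invisible but worth knowing when pixel-exact layouts matter.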
8. End-to-End Workflow Integration
8.1 Main Workflow Controller
import os
import glob
import json
from datetime import datetime


class PhotoAgent:
    """Main workflow controller that ties all modules together."""

    def __init__(self, config_path="config.json"):
        self.config = self.load_config(config_path)
        self.preprocessor = ImagePreprocessor()
        self.feature_extractor = FeatureExtractor()
        self.prompt_generator = PromptGenerator()
        self.image_generator = ImageGenerator()
        # JSON has no tuple type, so normalize output_size before passing it on
        self.grid_compositor = GridCompositor(output_size=tuple(self.config["output_size"]))
        # Create the required directories
        self._create_directories()

    def load_config(self, config_path):
        """Load the config file, falling back to defaults."""
        default_config = {
            "upload_dir": "uploads",
            "processed_dir": "processed",
            "generated_dir": "generated",
            "grids_dir": "grids",
            "min_similarity": 0.8,
            "num_images": 6,
            "output_size": (1200, 800)
        }
        try:
            with open(config_path, 'r') as f:
                user_config = json.load(f)
            default_config.update(user_config)
        except FileNotFoundError:
            print(f"Config file {config_path} not found, using default config")
        return default_config

    def _create_directories(self):
        """Create the required directories."""
        for key in ("upload_dir", "processed_dir", "generated_dir", "grids_dir"):
            os.makedirs(self.config[key], exist_ok=True)

    def process_user_photo(self, uploaded_file_path):
        """Run the complete workflow for one uploaded photo; returns (result, error)."""
        # Build a task ID from the timestamp and the file name
        task_id = datetime.now().strftime("%Y%m%d_%H%M%S") + "_" + os.path.basename(uploaded_file_path)
        task_id = task_id.replace(".", "_")
        print(f"Starting processing task: {task_id}")

        # 1. Preprocess the user photo
        processed_path = os.path.join(self.config["processed_dir"], f"processed_{task_id}.jpg")
        outcome = self.preprocessor.preprocess_image(uploaded_file_path, processed_path)
        if outcome is None or outcome[0] is None:
            return None, "Failed to preprocess image"
        processed_image, quality_score, issues = outcome
        print(f"Image preprocessed. Quality score: {quality_score}, Issues: {issues}")

        # 2. Find the most similar reference image
        #    (on failure the second element is an error message, not a number)
        reference_id, similarity = self.feature_extractor.find_most_similar_reference(
            processed_path, self.config["min_similarity"])
        if reference_id is None:
            return None, f"Failed to find similar reference: {similarity}"
        similarity = float(similarity)  # NumPy scalars are not JSON serializable
        print(f"Found similar reference: {reference_id} with similarity: {similarity:.2f}")

        # 3. Generate the prompt
        prompt, error = self.prompt_generator.generate_prompt(reference_id, None, similarity)
        if error:
            return None, f"Failed to generate prompt: {error}"
        print(f"Generated prompt: {prompt}")

        # 4. Generate the diverse images
        images, seeds, error = self.image_generator.generate_images(prompt, self.config["num_images"])
        if error:
            return None, f"Failed to generate images: {error}"
        print(f"Generated {len(images)} images")

        # 5. Save the generated images
        generated_paths = []
        for i, img in enumerate(images):
            img_path = os.path.join(self.config["generated_dir"], f"gen_{task_id}_{i}.jpg")
            img.save(img_path, quality=95)
            generated_paths.append(img_path)

        # 6. Create the six-panel grid (the output size is set on the compositor;
        #    the second parameter of create_grid is the background color)
        grid_image, error = self.grid_compositor.create_grid(images)
        if error:
            return None, f"Failed to create grid: {error}"

        # Add a title and border
        title = f"Photo Variations - {datetime.now().strftime('%Y-%m-%d %H:%M')}"
        final_grid, error = self.grid_compositor.add_border_and_text(grid_image, title)
        if error:
            return None, f"Failed to add border and text: {error}"

        # Save the grid
        grid_path = os.path.join(self.config["grids_dir"], f"grid_{task_id}.jpg")
        final_grid.save(grid_path, quality=95)
        print(f"Grid saved to: {grid_path}")

        # 7. Return the result
        result = {
            "task_id": task_id,
            "original_image": uploaded_file_path,
            "processed_image": processed_path,
            "reference_id": reference_id,
            "similarity": similarity,
            "prompt": prompt,
            "generated_images": generated_paths,
            "grid_image": grid_path,
            "quality_score": quality_score,
            "issues": issues,
            "seeds": seeds
        }

        # Save the task metadata next to the grid
        metadata_path = os.path.join(self.config["grids_dir"], f"metadata_{task_id}.json")
        with open(metadata_path, 'w') as f:
            json.dump(result, f, indent=2)
        return result, None

    def cleanup_task(self, task_id, keep_grid=True):
        """Remove temporary files produced by a task; returns (ok, error)."""
        try:
            # Delete processed images
            processed_pattern = os.path.join(self.config["processed_dir"], f"*{task_id}*")
            for f in glob.glob(processed_pattern):
                os.remove(f)

            # Delete generated images
            generated_pattern = os.path.join(self.config["generated_dir"], f"*{task_id}*")
            for f in glob.glob(generated_pattern):
                os.remove(f)

            # Delete the metadata file
            metadata_path = os.path.join(self.config["grids_dir"], f"metadata_{task_id}.json")
            if os.path.exists(metadata_path):
                os.remove(metadata_path)

            # Optionally delete the grid image
            if not keep_grid:
                grid_path = os.path.join(self.config["grids_dir"], f"grid_{task_id}.jpg")
                if os.path.exists(grid_path):
                    os.remove(grid_path)

            print(f"Cleaned up files for task: {task_id}")
            return True, None
        except Exception as e:
            return False, f"Error during cleanup: {e}"
8.2 Web API
import json
import os
from flask import Flask, request, send_file
from flask_restful import Api, Resource

# Initialize the Flask app and the API
app = Flask(__name__)
api = Api(app)

# Global PhotoAgent instance
photo_agent = PhotoAgent()


class UploadPhoto(Resource):
    def post(self):
        # Placeholder: reuse the upload logic from section 3.1 here
        return {'error': 'Upload not implemented in this resource'}, 501


class ProcessPhoto(Resource):
    def post(self):
        # Read the uploaded file information from the JSON body
        data = request.get_json(silent=True) or {}
        if 'filepath' not in data:
            return {'error': 'No filepath provided'}, 400
        # Note: in production, verify that this path lies inside the upload folder
        filepath = data['filepath']

        # Process the photo
        result, error = photo_agent.process_user_photo(filepath)
        if error:
            return {'error': error}, 500
        return {'message': 'Processing completed', 'result': result}, 200


class GetResult(Resource):
    def get(self, task_id):
        # Look up the task metadata
        metadata_path = os.path.join(photo_agent.config["grids_dir"], f"metadata_{task_id}.json")
        if not os.path.exists(metadata_path):
            return {'error': 'Task not found'}, 404
        with open(metadata_path, 'r') as f:
            result = json.load(f)
        return {'result': result}, 200


class DownloadGrid(Resource):
    def get(self, task_id):
        # Serve the six-panel grid as a download
        grid_path = os.path.join(photo_agent.config["grids_dir"], f"grid_{task_id}.jpg")
        if not os.path.exists(grid_path):
            return {'error': 'Grid not found'}, 404
        return send_file(grid_path, as_attachment=True)


# Register the API routes
api.add_resource(UploadPhoto, '/upload')
api.add_resource(ProcessPhoto, '/process')
api.add_resource(GetResult, '/result/<string:task_id>')
api.add_resource(DownloadGrid, '/download/<string:task_id>')

if __name__ == '__main__':
    # debug=True enables the interactive debugger and must not be exposed on 0.0.0.0
    app.run(debug=False, host='0.0.0.0', port=5000)
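The `/process` endpoint receives a file path from the client, so the server should confirm that the path stays inside the upload folder before reading it. The following sketch shows one common way to do that with the standard library; `resolve_upload_path` is a hypothetical helper, not part of the original API.

```python
import os


def resolve_upload_path(upload_dir: str, requested: str) -> str:
    """Return the absolute path of `requested` if it lies inside `upload_dir`.

    Raises ValueError for paths that escape the upload directory, for example
    via "..", an absolute path elsewhere, or a symlink pointing outside.
    """
    base = os.path.realpath(upload_dir)
    candidate = os.path.realpath(requested)
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError("path is outside the upload directory")
    return candidate


print(resolve_upload_path("uploads", "uploads/selfie.jpg"))
try:
    resolve_upload_path("uploads", "uploads/../config.json")
except ValueError as exc:
    print("rejected:", exc)
```

In `ProcessPhoto.post`, the result of this check could replace the raw `filepath` value, with a 400 response returned when `ValueError` is raised.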
9. Performance Optimization and Error Handling
9.1 Performance Optimization Strategies
import gc
import torch


class PerformanceOptimizer:
    """Collects optimization strategies for the workflow."""

    def __init__(self):
        self.cache = {}

    def enable_model_caching(self, pipeline):
        """Enable model caching to reduce load time (not implemented)."""
        pass

    def optimize_memory_usage(self):
        """Release unused memory."""
        # Clear the GPU cache
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        # Run Python garbage collection
        gc.collect()

    def batch_processing(self, tasks):
        """Process tasks in batches for better throughput (not implemented)."""
        pass

    def adaptive_quality_settings(self, image_size, complexity):
        """Choose generation settings based on image size and complexity."""
        # Smaller or simpler images can use fewer inference steps
        if image_size[0] * image_size[1] < 500000:  # Under 0.5 MP
            return {"num_inference_steps": 20, "guidance_scale": 7.0}
        elif complexity == "low":
            return {"num_inference_steps": 22, "guidance_scale": 7.2}
        else:
            return {"num_inference_steps": 25, "guidance_scale": 7.5}
9.2 Error Handling and Logging
import logging
import os
import shutil
import traceback
from logging.handlers import RotatingFileHandler
from PIL import Image


def setup_logging():
    """Configure logging to a rotating file and to the console."""
    # Create the log directory
    os.makedirs("logs", exist_ok=True)

    # Configure the root logger
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[
            RotatingFileHandler("logs/photo_agent.log", maxBytes=10 * 1024 * 1024, backupCount=5),
            logging.StreamHandler()
        ]
    )


class ErrorHandler:
    """Provides a unified error handling mechanism."""

    def __init__(self):
        self.logger = logging.getLogger(__name__)

    def handle_exception(self, e, context=""):
        """Log an exception with its traceback and return the message."""
        error_msg = f"Error in {context}: {str(e)}"
        self.logger.error(error_msg)
        self.logger.error(traceback.format_exc())
        return error_msg

    def validate_image_file(self, filepath):
        """Check that the file is a valid image; returns (ok, error)."""
        try:
            with Image.open(filepath) as img:
                img.verify()
            return True, None
        except Exception as e:
            return False, f"Invalid image file: {e}"

    def check_disk_space(self, required_mb=100):
        """Check free disk space; returns (ok, free space in MB or error message)."""
        try:
            # shutil.disk_usage works on Windows as well, unlike os.statvfs
            free_space = shutil.disk_usage('.').free / 1024 / 1024  # MB
            return free_space >= required_mb, free_space
        except Exception as e:
            return False, f"Error checking disk space: {e}"
10. Testing and Validation
10.1 Unit Tests
import os
import unittest
from unittest.mock import Mock, patch
from PIL import Image

# Assumes PhotoAgent, FeatureExtractor, and ImageGenerator from the sections above are in scope


class TestPhotoAgent(unittest.TestCase):
    """Unit tests for PhotoAgent."""

    def setUp(self):
        """Set up the test environment."""
        # Skip the expensive model load; generation is mocked where needed
        with patch.object(ImageGenerator, 'load_model'):
            self.agent = PhotoAgent()
        # Create the test image so the preprocessing test has real input
        os.makedirs("test_data", exist_ok=True)
        self.test_image_path = "test_data/test_photo.jpg"
        Image.new('RGB', (400, 400), color=(73, 109, 137)).save(self.test_image_path)

    def test_image_preprocessing(self):
        """Test image preprocessing."""
        result, quality, issues = self.agent.preprocessor.preprocess_image(self.test_image_path)
        self.assertIsNotNone(result)
        self.assertIsInstance(quality, (int, float))
        self.assertIsInstance(issues, list)

    @patch.object(FeatureExtractor, 'find_most_similar_reference')
    def test_reference_matching(self, mock_find):
        """Test reference image matching."""
        # Mocked return value
        mock_find.return_value = ("ref_123", 0.85)
        ref_id, similarity = self.agent.feature_extractor.find_most_similar_reference(self.test_image_path)
        self.assertEqual(ref_id, "ref_123")
        self.assertAlmostEqual(similarity, 0.85, places=2)

    def test_prompt_generation(self):
        """Test prompt generation."""
        # Inject a reference entry; with an empty database generate_prompt reports an error
        self.agent.prompt_generator.reference_db = {
            "ref_123": {"base_prompt": "A professional portrait", "style": "soft lighting"}
        }
        prompt, error = self.agent.prompt_generator.generate_prompt("ref_123", None, 0.85)
        self.assertIsNotNone(prompt)
        self.assertIsInstance(prompt, str)
        self.assertGreater(len(prompt), 10)
        self.assertIsNone(error)

    @patch.object(ImageGenerator, 'generate_images')
    def test_image_generation(self, mock_generate):
        """Test image generation."""
        # Mocked return value
        mock_images = [Image.new('RGB', (100, 100)) for _ in range(6)]
        mock_seeds = [123, 456, 789, 101112, 131415, 161718]
        mock_generate.return_value = (mock_images, mock_seeds, None)
        images, seeds, error = self.agent.image_generator.generate_images("test prompt", 6)
        self.assertEqual(len(images), 6)
        self.assertEqual(len(seeds), 6)
        self.assertIsNone(error)

    def test_grid_creation(self):
        """Test six-panel grid creation."""
        test_images = [Image.new('RGB', (200, 200)) for _ in range(6)]
        grid, error = self.agent.grid_compositor.create_grid(test_images)
        self.assertIsNotNone(grid)
        self.assertIsNone(error)
        self.assertEqual(grid.size, self.agent.grid_compositor.output_size)


if __name__ == '__main__':
    # Run the tests
    unittest.main()
10.2 Integration Test
import glob
import os
import unittest
from unittest.mock import Mock, patch
from PIL import Image


class IntegrationTest(unittest.TestCase):
    """Integration test covering the complete workflow."""

    def setUp(self):
        # Skip the expensive model load; generate_images is mocked below
        with patch.object(ImageGenerator, 'load_model'):
            self.agent = PhotoAgent()
        # The mocked matcher returns "ref_123", so the prompt generator must know it
        self.agent.prompt_generator.reference_db = {
            "ref_123": {"base_prompt": "A professional portrait"}
        }
        # Create the test image
        os.makedirs("test_data", exist_ok=True)
        self.test_image = Image.new('RGB', (400, 400), color=(73, 109, 137))
        self.test_path = "test_data/test_integration.jpg"
        self.test_image.save(self.test_path)

    @patch.multiple(FeatureExtractor,
                    find_most_similar_reference=Mock(return_value=("ref_123", 0.85)))
    @patch.multiple(ImageGenerator,
                    generate_images=Mock(return_value=(
                        [Image.new('RGB', (200, 200)) for _ in range(6)],
                        [123, 456, 789, 101112, 131415, 161718],
                        None)))
    def test_full_process(self):
        """Test the complete processing workflow."""
        result, error = self.agent.process_user_photo(self.test_path)
        self.assertIsNotNone(result)
        self.assertIsNone(error)
        self.assertEqual(result['reference_id'], "ref_123")
        self.assertAlmostEqual(result['similarity'], 0.85, places=2)
        self.assertEqual(len(result['generated_images']), 6)
        self.assertIn('grid_image', result)

    def tearDown(self):
        # Remove the test image
        if os.path.exists(self.test_path):
            os.remove(self.test_path)
        # Remove any files the workflow produced
        for pattern in ("processed/*test_integration*",
                        "generated/*test_integration*",
                        "grids/*test_integration*"):
            for f in glob.glob(pattern):
                os.remove(f)
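11. Deployment
11.1 Docker Containerized Deployment
# Dockerfile
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# Set the working directory
WORKDIR /app

# Install system dependencies
# (libgl1 and libglib2.0-0 for OpenCV; build-essential and cmake to compile dlib for face-recognition)
RUN apt-get update && apt-get install -y \
    libgl1 \
    libglib2.0-0 \
    build-essential \
    cmake \
    && rm -rf /var/lib/apt/lists/*

# Copy the requirements file
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Create the required directories
RUN mkdir -p uploads processed generated grids logs

# Expose the port
EXPOSE 5000

# Set environment variables (FLASK_ENV is ignored by Flask >= 2.3)
ENV FLASK_APP=app.py
ENV FLASK_ENV=production

# Start the application (flask run is the development server;
# use a WSGI server for real production traffic)
CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]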
11.2 Kubernetes Deployment Configuration
# photo-agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: photo-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: photo-agent
  template:
    metadata:
      labels:
        app: photo-agent
    spec:
      containers:
        - name: photo-agent
          image: photo-agent:latest
          ports:
            - containerPort: 5000
          resources:
            requests:
              memory: "4Gi"
              cpu: "1000m"
              nvidia.com/gpu: 1
            limits:
              memory: "8Gi"
              cpu: "2000m"
              nvidia.com/gpu: 1
          # subPath gives each directory its own folder inside the shared volume;
          # without it, all four mount points would show the same contents
          volumeMounts:
            - name: storage
              mountPath: /app/uploads
              subPath: uploads
            - name: storage
              mountPath: /app/processed
              subPath: processed
            - name: storage
              mountPath: /app/generated
              subPath: generated
            - name: storage
              mountPath: /app/grids
              subPath: grids
      volumes:
        - name: storage
          persistentVolumeClaim:
            # With 2 replicas, the claim needs an access mode that allows shared mounting
            claimName: photo-agent-storage
---
apiVersion: v1
kind: Service
metadata:
  name: photo-agent-service
spec:
  selector:
    app: photo-agent
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
  type: LoadBalancer
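12. Summary and Outlook
This article described a Python-based agent workflow that covers the full path from selfie upload to six-panel grid generation. The system is modular and consists of five core modules: image preprocessing, feature extraction, prompt generation, image generation, and grid compositing.
12.1 Technical Summary
- Image processing: OpenCV and PIL are used for image quality assessment and preprocessing
- Feature extraction: face recognition and style features (color histograms) are combined to compute similarity
- Prompt engineering: text prompts are built from the matched reference image and the similarity score
- Image generation: diffusion models such as Stable Diffusion generate the diverse images
- Result compositing: the generated images are composited into a six-panel grid and offered for download
12.2 Future Improvements
- Model optimization: explore more efficient image generation models to reduce compute requirements
- Personalized recommendations: tune prompt generation and image style based on the user's past preferences
- Real-time processing: optimize the workflow toward near-real-time image generation
- Mobile support: build a mobile app for a more convenient user experience
- Multimodal support: extend the system to video input and 3D model generation
12.3 Practical Applications
The system has a range of potential applications, including:
- Photography enthusiasts creating varied portrait photos
- E-commerce platforms generating multi-angle displays of product models
- Social media users creating personalized avatar sets
- Inspiration for artistic creation and design
With continued optimization and extension, the system could become a useful tool for digital content creation, offering efficient, high-quality image generation.
Note: the code examples in this article need to be adapted to the actual environment and requirements. Model loading and inference in particular require sufficient hardware resources, especially a GPU. Building and managing the reference image database is another aspect that needs careful attention in a real deployment.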