
Agent Workflow: Selfie Processing and Six-Panel Grid Image Generation

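Overview

This article walks through a complete agent workflow that processes a user's selfie and generates a six-panel grid image. The system consists of the following functional modules:

  1. Selfie upload and preprocessing
  2. Generation of prompt keywords from a closely matching reference image (similarity of at least 80%)
  3. Generation of 6 new images in a consistent style but with different poses, angles, or actions
  4. Composition of the generated images into a six-panel grid
  5. Saving the results to a designated folder

The implementation uses Python together with several computer vision and deep learning libraries.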

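Table of Contents

  1. System Architecture
  2. Environment Setup and Dependencies
  3. Selfie Upload and Preprocessing Module
  4. Image Feature Extraction and Similarity Computation
  5. Prompt Generation Module
  6. Diverse Image Generation Module
  7. Six-Panel Grid Composition Module
  8. End-to-End Integration
  9. Performance Optimization and Error Handling
  10. Testing and Validation
  11. Deployment
  12. Summary and Outlook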

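1. System Architecture

The system is modular: each functional module is independent, and the modules are chained together by the business logic layer. The overall architecture is:

User interface layer
        │
        ▼
API layer (RESTful API)
        │
        ▼
Business logic layer
├── Image upload and preprocessing module
├── Feature extraction and similarity module
├── Prompt generation module
├── Diverse image generation module
└── Six-panel grid composition module
        │
        ▼
Data storage layer
├── Original image storage
├── Intermediate result storage
└── Final result storage

System workflow:

  1. The user uploads a selfie through the web interface
  2. The system preprocesses the photo and assesses its quality
  3. Image features are extracted and compared against the reference image library
  4. A prompt is generated from the best-matching reference
  5. The prompt is used to generate 6 images that share a style but differ from one another
  6. The 6 images are composed into a six-panel grid
  7. The result is saved and a download link is provided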
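The workflow steps above can be sketched as a chain of stages that share a context and stop at the first failure. This is a minimal illustration, not the article's implementation: `run_pipeline` and the three stand-in stage functions are hypothetical names, and the hard-coded values (quality score 85, similarity 0.72) exist only to show the early exit.

```python
from typing import Callable, List, Optional, Tuple

# Each stage reads and updates the shared context and returns an error message or None.
Stage = Callable[[dict], Optional[str]]


def run_pipeline(context: dict, stages: List[Tuple[str, Stage]]) -> Tuple[dict, Optional[str]]:
    """Run the named stages in order and stop at the first one that reports an error."""
    for name, stage in stages:
        error = stage(context)
        if error is not None:
            return context, f"{name} failed: {error}"
        context.setdefault("completed", []).append(name)
    return context, None


# Hypothetical stand-in stages; real ones would call the modules described in this article.
def preprocess(ctx: dict) -> Optional[str]:
    ctx["quality_score"] = 85
    return None


def match_reference(ctx: dict) -> Optional[str]:
    ctx["similarity"] = 0.72
    # The workflow requires a reference with similarity of at least 0.8
    return None if ctx["similarity"] >= 0.8 else "no reference above 0.8"


def generate_prompt(ctx: dict) -> Optional[str]:
    ctx["prompt"] = "A professional portrait"
    return None


STAGES = [
    ("preprocess", preprocess),
    ("match_reference", match_reference),
    ("generate_prompt", generate_prompt),
]

context, error = run_pipeline({"upload": "uploads/selfie.jpg"}, STAGES)
print(error)  # match_reference failed: no reference above 0.8
```

Because the matching stage fails, the prompt stage never runs, which mirrors how a missing reference ends the workflow before any image is generated.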

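2. Environment Setup and Dependencies

2.1 Required Python Libraries

# Core (interpreter requirement, not a pip package)
# python >= 3.8

# Image processing
opencv-python >= 4.5.0
Pillow >= 9.1.0  # Image.Resampling, used in the code below, was added in Pillow 9.1
scikit-image >= 0.18.0

# Deep learning frameworks
torch >= 1.8.0
torchvision >= 0.9.0
transformers >= 4.0.0
diffusers >= 0.10.0

# Feature extraction and similarity
face-recognition >= 1.3.0
scikit-learn >= 0.24.0

# Utilities
numpy >= 1.20.0
matplotlib >= 3.3.0
seaborn >= 0.11.0

# Web framework
flask >= 2.0.0
flask-restful >= 0.3.0

# File handling
werkzeug >= 2.0.0
python-multipart >= 0.0.5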

2.2 Environment Setup Script

#!/bin/bash
# Create and activate a virtual environment
python -m venv photo_agent_env
source photo_agent_env/bin/activate

# Install dependencies
pip install --upgrade pip
pip install opencv-python Pillow scikit-image
pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113  # CUDA 11.3
pip install transformers diffusers
pip install face-recognition scikit-learn
pip install numpy matplotlib seaborn
pip install flask flask-restful werkzeug python-multipart

# Create the required directory structure
mkdir -p uploads processed references generated grids logs

3. Selfie Upload and Preprocessing Module

3.1 Photo Upload API

from flask import Flask, request, jsonify
from werkzeug.utils import secure_filename
import os
import uuid
from datetime import datetime

app = Flask(__name__)

# Configuration
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16 MB limit
app.config['UPLOAD_FOLDER'] = 'uploads/'
app.config['ALLOWED_EXTENSIONS'] = {'png', 'jpg', 'jpeg', 'bmp', 'tiff'}


def allowed_file(filename):
    """Return True if the filename has an allowed extension."""
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in app.config['ALLOWED_EXTENSIONS']


@app.route('/upload', methods=['POST'])
def upload_file():
    """API endpoint for uploading a user's selfie."""
    # Check that the request contains a file part
    if 'file' not in request.files:
        return jsonify({'error': 'No file part'}), 400
    file = request.files['file']
    # Check that a file was actually selected
    if file.filename == '':
        return jsonify({'error': 'No selected file'}), 400
    # Check the file type
    if file and allowed_file(file.filename):
        # Build a unique filename
        filename = secure_filename(file.filename)
        unique_filename = f"{uuid.uuid4().hex}_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{filename}"
        # Make sure the upload folder exists before saving
        os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
        filepath = os.path.join(app.config['UPLOAD_FOLDER'], unique_filename)
        # Save the file
        file.save(filepath)
        # Return the file information
        return jsonify({
            'message': 'File uploaded successfully',
            'filename': unique_filename,
            'filepath': filepath
        }), 200
    else:
        return jsonify({'error': 'File type not allowed'}), 400
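To make the extension check concrete, here is a small standalone sketch that applies the same `rsplit` rule to a few filenames. The sample names are made up for illustration, and the set is copied out of the Flask config so the snippet runs without Flask.

```python
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "bmp", "tiff"}


def allowed_file(filename: str) -> bool:
    """Accept a name only if the text after its last dot is an allowed extension."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


for name in ["selfie.JPG", "archive.tar.gz", "photo.jpg.exe", "noextension", ".png"]:
    print(f"{name!r}: {allowed_file(name)}")
# 'selfie.JPG': True
# 'archive.tar.gz': False
# 'photo.jpg.exe': False
# 'noextension': False
# '.png': True
```

Only the last extension counts, and a bare `.png` passes. The check looks at the name alone, not at the file's bytes, so opening the file with an image library is still needed to confirm that the upload is a real image.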

3.2 Image Preprocessing Class

import cv2
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter
import logging


class ImagePreprocessor:
    """Checks the quality of uploaded images and preprocesses them."""

    def __init__(self):
        self.logger = logging.getLogger(__name__)

    def load_image(self, image_path):
        """Load an image file; return None on failure."""
        try:
            return Image.open(image_path)
        except Exception as e:
            self.logger.error(f"Error loading image: {e}")
            return None

    def check_image_quality(self, image):
        """Return a quality score (0-100) and a list of detected issues."""
        quality_score = 100  # starting score
        issues = []
        # Convert to OpenCV format
        cv_image = np.array(image.convert('RGB'))
        cv_image = cv_image[:, :, ::-1].copy()  # RGB to BGR
        # Check dimensions
        height, width = cv_image.shape[:2]
        if height < 300 or width < 300:
            quality_score -= 30
            issues.append("Image dimensions too small")
        # Check blur using the variance of the Laplacian
        blur_value = cv2.Laplacian(cv_image, cv2.CV_64F).var()
        if blur_value < 100:
            quality_score -= 30
            issues.append("Image is too blurry")
        # Check brightness (mean of the HSV value channel)
        hsv = cv2.cvtColor(cv_image, cv2.COLOR_BGR2HSV)
        brightness = np.mean(hsv[:, :, 2])
        if brightness < 50:
            quality_score -= 20
            issues.append("Image is too dark")
        elif brightness > 200:
            quality_score -= 20
            issues.append("Image is overexposed")
        # Check contrast (standard deviation of pixel values)
        contrast = np.std(cv_image)
        if contrast < 40:
            quality_score -= 20
            issues.append("Low contrast")
        return max(quality_score, 0), issues

    def preprocess_image(self, image_path, output_path=None):
        """Resize and, if needed, enhance an image.

        Returns (image, quality_score, issues); image is None if loading failed.
        """
        image = self.load_image(image_path)
        if image is None:
            # Keep the three-value shape so callers can always unpack the result
            return None, 0, ["Failed to load image"]
        # Normalize to RGB: JPEG output cannot store alpha, and palette images cannot be filtered
        image = image.convert('RGB')
        # Check quality
        quality_score, issues = self.check_image_quality(image)
        self.logger.info(f"Image quality score: {quality_score}, Issues: {issues}")
        # Try to enhance the image if quality is below the threshold
        if quality_score < 70:
            self.logger.info("Attempting to enhance low quality image")
            image = self.enhance_image(image)
        # Resize in place, preserving the aspect ratio
        max_size = (1024, 1024)
        image.thumbnail(max_size, Image.Resampling.LANCZOS)
        # Save the processed image
        if output_path:
            image.save(output_path, quality=95)
        return image, quality_score, issues

    def enhance_image(self, image):
        """Apply mild contrast, sharpness, and denoising adjustments."""
        # Boost contrast
        image = ImageEnhance.Contrast(image).enhance(1.2)
        # Boost sharpness
        image = ImageEnhance.Sharpness(image).enhance(1.1)
        # Light denoising
        image = image.filter(ImageFilter.MedianFilter(3))
        return image
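The scoring rule in `check_image_quality` is plain arithmetic on four measurements, so it can be worked through without any image library. The sketch below takes the measurements as inputs and applies the same thresholds and penalties; `score_quality` is a hypothetical helper, and the sample measurements are invented for illustration.

```python
def score_quality(width, height, blur_variance, brightness, contrast):
    """Apply the penalty table: start at 100 and subtract for each detected issue."""
    score, issues = 100, []
    if height < 300 or width < 300:
        score -= 30
        issues.append("Image dimensions too small")
    if blur_variance < 100:
        score -= 30
        issues.append("Image is too blurry")
    if brightness < 50:
        score -= 20
        issues.append("Image is too dark")
    elif brightness > 200:
        score -= 20
        issues.append("Image is overexposed")
    if contrast < 40:
        score -= 20
        issues.append("Low contrast")
    return max(score, 0), issues


# A sharp-enough size, but blurry and flat: 100 - 30 - 20 = 50
print(score_quality(640, 480, blur_variance=60, brightness=120, contrast=35))
# (50, ['Image is too blurry', 'Low contrast'])

# Every check fails: 100 - 30 - 30 - 20 - 20 = 0
print(score_quality(200, 200, blur_variance=10, brightness=30, contrast=10))
# (0, ['Image dimensions too small', 'Image is too blurry', 'Image is too dark', 'Low contrast'])
```

With the enhancement threshold at 70, two penalties are enough to trigger enhancement only if at least one of them is a 30-point penalty: two 20-point penalties leave a score of 60, which also falls below 70, while a single 20-point penalty (score 80) does not.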

4. Image Feature Extraction and Similarity Computation

4.1 Feature Extraction Module

import cv2
import face_recognition
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


class FeatureExtractor:
    """Extracts face and style features from images."""

    def __init__(self):
        self.reference_features = self.load_reference_features()

    def load_reference_features(self):
        """Load the precomputed features of the reference images."""
        # These should be loaded from a database or file.
        # Simplified here; a real application needs full reference library management.
        return {}

    def extract_face_features(self, image_path):
        """Extract a face encoding. Returns (encoding, error)."""
        try:
            # Load the image
            image = face_recognition.load_image_file(image_path)
            # Detect faces
            face_locations = face_recognition.face_locations(image)
            if len(face_locations) == 0:
                return None, "No face detected"
            # Compute face encodings
            face_encodings = face_recognition.face_encodings(image, face_locations)
            # Use the first detected face
            return face_encodings[0], None
        except Exception as e:
            return None, f"Error extracting face features: {e}"

    def extract_style_features(self, image_path):
        """Extract style features (HSV color histograms). Returns (features, error)."""
        try:
            image = cv2.imread(image_path)
            if image is None:
                return None, "Cannot read image"
            # Convert to the HSV color space
            hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
            # Compute one 50-bin histogram per channel
            hist_h = cv2.calcHist([hsv], [0], None, [50], [0, 180])
            hist_s = cv2.calcHist([hsv], [1], None, [50], [0, 256])
            hist_v = cv2.calcHist([hsv], [2], None, [50], [0, 256])
            # Normalize the histograms to [0, 1]
            cv2.normalize(hist_h, hist_h, 0, 1, cv2.NORM_MINMAX)
            cv2.normalize(hist_s, hist_s, 0, 1, cv2.NORM_MINMAX)
            cv2.normalize(hist_v, hist_v, 0, 1, cv2.NORM_MINMAX)
            # Concatenate into a single 150-dimensional feature vector
            style_features = np.concatenate([hist_h.flatten(), hist_s.flatten(), hist_v.flatten()])
            return style_features, None
        except Exception as e:
            return None, f"Error extracting style features: {e}"

    def calculate_similarity(self, features1, features2, feature_type="face"):
        """Compute the similarity of two feature vectors. Returns (similarity, error)."""
        if features1 is None or features2 is None:
            return 0, "Invalid features"
        try:
            if feature_type == "face":
                # Face encodings: 1 minus the Euclidean distance (not cosine similarity)
                similarity = 1 - np.linalg.norm(features1 - features2)
            else:
                # Style features: cosine similarity
                similarity = cosine_similarity(features1.reshape(1, -1), features2.reshape(1, -1))[0][0]
            # Clamp to [0, 1]
            return float(max(0, min(1, similarity))), None
        except Exception as e:
            return 0, f"Error calculating similarity: {e}"

    def find_most_similar_reference(self, user_image_path, min_similarity=0.8):
        """Find the reference image most similar to the user's image.

        Returns (reference_id, similarity) on success or (None, message) on failure.
        """
        # Extract features from the user's image
        face_features, face_error = self.extract_face_features(user_image_path)
        style_features, style_error = self.extract_style_features(user_image_path)
        if face_features is None and style_features is None:
            return None, "Cannot extract any features from user image"
        weights = {"face": 0.7, "style": 0.3}  # weighting of the two scores
        best_match = None
        best_similarity = 0
        # Iterate over the reference library
        for ref_id, ref_data in self.reference_features.items():
            face_sim = 0
            style_sim = 0
            # Face similarity, if available
            if face_features is not None and "face" in ref_data:
                face_sim, _ = self.calculate_similarity(face_features, ref_data["face"], "face")
            # Style similarity, if available
            if style_features is not None and "style" in ref_data:
                style_sim, _ = self.calculate_similarity(style_features, ref_data["style"], "style")
            # Combined similarity
            if face_features is not None and style_features is not None:
                combined_sim = weights["face"] * face_sim + weights["style"] * style_sim
            elif face_features is not None:
                combined_sim = face_sim
            else:
                combined_sim = style_sim
            # Track the best match
            if combined_sim > best_similarity:
                best_similarity = combined_sim
                best_match = ref_id
        # Enforce the minimum similarity requirement
        if best_similarity < min_similarity:
            return None, f"No reference image found with similarity >= {min_similarity}. Best was {best_similarity:.2f}"
        return best_match, best_similarity
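A worked example helps show how the two similarity measures and the 0.7/0.3 weighting interact with the 0.8 threshold. The sketch below uses pure Python and tiny two-dimensional vectors chosen so the numbers are easy to verify; real face encodings from face_recognition have 128 dimensions, and the helper names here are illustrative assumptions.

```python
import math


def face_similarity(a, b):
    """1 minus the Euclidean distance, clamped to [0, 1]."""
    return max(0.0, min(1.0, 1.0 - math.dist(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_similarity(face_sim, style_sim, w_face=0.7, w_style=0.3):
    """Weighted sum used when both feature types are available."""
    return w_face * face_sim + w_style * style_sim


face_sim = face_similarity([0.0, 0.0], [0.09, 0.12])      # distance 0.15, similarity about 0.85
style_sim = cosine_similarity([1.0, 0.0], [0.6, 0.8])      # about 0.6
combined = combined_similarity(face_sim, style_sim)         # 0.7*0.85 + 0.3*0.6, about 0.775

print(round(face_sim, 3), round(style_sim, 3), round(combined, 3))
print("accepted" if combined >= 0.8 else "rejected")
```

In this example a close face match (0.85) is still rejected because the style score drags the weighted sum below 0.8. For scale, face_recognition's `compare_faces` treats a distance of up to 0.6 as a match by default, which corresponds to a face similarity of only 0.4 under the 1 minus distance formula, so the 0.8 threshold is considerably stricter than the library's default notion of a match.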

5. Prompt Generation Module

5.1 Prompt Generator

import json
import random


class PromptGenerator:
    """Builds text prompts from the matched reference image."""

    def __init__(self, reference_db_path="reference_database.json"):
        self.reference_db = self.load_reference_database(reference_db_path)
        self.style_keywords = self.load_style_keywords()

    def load_reference_database(self, db_path):
        """Load the reference image database; return an empty dict if unavailable."""
        try:
            with open(db_path, 'r', encoding='utf-8') as f:
                return json.load(f)
        except FileNotFoundError:
            return {}
        except Exception as e:
            print(f"Error loading reference database: {e}")
            return {}

    def load_style_keywords(self):
        """Load the style keyword library."""
        # This could be extended to load a richer keyword library from a file or API
        return {
            "photography_styles": [
                "portrait photography", "fashion photography", "editorial photography",
                "fine art photography", "documentary photography", "street photography"],
            "lighting_styles": [
                "natural lighting", "studio lighting", "soft lighting", "dramatic lighting",
                "golden hour", "blue hour", "rim lighting", "backlighting"],
            "composition_styles": [
                "close-up", "medium shot", "full body", "head and shoulders",
                "rule of thirds", "centered composition", "dynamic angle"],
            "mood_keywords": [
                "serene", "joyful", "mysterious", "confident", "thoughtful",
                "powerful", "elegant", "playful", "romantic"],
            "detail_keywords": [
                "sharp focus", "bokeh", "high detail", "cinematic", "film grain",
                "vintage", "modern", "minimalist", "textured"]
        }

    def generate_prompt(self, reference_id, user_features=None, similarity=0.8):
        """Generate a prompt from a reference image. Returns (prompt, error)."""
        if reference_id not in self.reference_db:
            return "A professional portrait photography", "Reference not found"
        ref_data = self.reference_db[reference_id]
        # Base prompt
        base_prompt = ref_data.get("base_prompt", "A professional portrait")
        # Scale by similarity: 0.8 or higher maps to 1.0 (the factor is capped at 1.0)
        similarity_factor = min(1.0, similarity / 0.8)
        # Add style elements
        style_prompt = self._add_style_elements(ref_data, similarity_factor)
        # Add detail enhancers
        detail_prompt = self._add_detail_elements(similarity_factor)
        # Combine and clean up (drops empty segments and stray commas)
        full_prompt = self._clean_prompt(f"{base_prompt}, {style_prompt}, {detail_prompt}")
        return full_prompt, None

    def _add_style_elements(self, ref_data, similarity_factor):
        """Collect style elements for the prompt."""
        styles = []
        # Style taken from the reference data
        if "style" in ref_data:
            styles.append(ref_data["style"])
        # Random style elements for variety; a higher similarity factor makes each more likely
        if random.random() < 0.7 * similarity_factor:
            styles.append(random.choice(self.style_keywords["photography_styles"]))
        if random.random() < 0.6 * similarity_factor:
            styles.append(random.choice(self.style_keywords["lighting_styles"]))
        if random.random() < 0.5 * similarity_factor:
            styles.append(random.choice(self.style_keywords["composition_styles"]))
        if random.random() < 0.4 * similarity_factor:
            styles.append(random.choice(self.style_keywords["mood_keywords"]))
        return ", ".join(styles)

    def _add_detail_elements(self, similarity_factor):
        """Collect detail keywords for the prompt."""
        details = []
        # The number of attempts grows with the similarity factor
        num_details = int(3 * similarity_factor) + 1
        for _ in range(num_details):
            if random.random() < 0.8:
                details.append(random.choice(self.style_keywords["detail_keywords"]))
        return ", ".join(details)

    def _clean_prompt(self, prompt):
        """Normalize whitespace and drop empty comma-separated segments."""
        # Splitting on commas also handles ", ," (comma, space, comma),
        # which a plain ",," replacement would miss
        parts = [" ".join(part.split()) for part in prompt.split(",")]
        return ", ".join(part for part in parts if part)

6. Diverse Image Generation Module

6.1 Diffusion-Based Image Generation

import random

import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler


class ImageGenerator:
    """Generates diverse images with a diffusion model."""

    def __init__(self, model_id="stabilityai/stable-diffusion-2-1"):
        self.model_id = model_id
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = None
        self.load_model()

    def load_model(self):
        """Load the pretrained diffusion model."""
        try:
            print(f"Loading model {self.model_id} on {self.device}...")
            self.pipeline = StableDiffusionPipeline.from_pretrained(
                self.model_id,
                torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
                safety_checker=None,  # safety checker disabled to speed things up
            )
            # Use DPM-Solver for faster sampling
            self.pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
                self.pipeline.scheduler.config)
            if self.device == "cuda":
                self.pipeline = self.pipeline.to("cuda")
            # Enable attention slicing to reduce memory use
            self.pipeline.enable_attention_slicing()
            print("Model loaded successfully")
        except Exception as e:
            print(f"Error loading model: {e}")
            self.pipeline = None

    def generate_images(self, prompt, num_images=6, guidance_scale=7.5,
                        num_inference_steps=25, seed=None, variations=None):
        """Generate diverse images. Always returns (images, seeds, error)."""
        if self.pipeline is None:
            return None, None, "Model not loaded"
        try:
            # A fresh torch.Generator starts from a fixed default seed, so draw a
            # random base seed when none is given; otherwise every image would
            # share the same seed
            if seed is None:
                seed = random.randint(0, 2**31 - 1)
            images, seeds = [], []
            for i in range(num_images):
                # Image 0 uses the base seed and the unmodified prompt; the
                # variants use offset seeds and slightly modified prompts
                image_seed = seed + i * 1000
                generator = torch.Generator(device=self.device).manual_seed(image_seed)
                image_prompt = prompt if i == 0 else self._modify_prompt(prompt, i)
                # The pipeline runs in the dtype chosen at load time,
                # so no autocast context is needed
                result = self.pipeline(
                    image_prompt,
                    guidance_scale=guidance_scale,
                    num_inference_steps=num_inference_steps,
                    generator=generator,
                    num_images_per_prompt=1,
                )
                images.append(result.images[0])
                seeds.append(image_seed)
            return images, seeds, None
        except Exception as e:
            return None, None, f"Error generating images: {e}"

    def _modify_prompt(self, prompt, variation_index):
        """Slightly modify the prompt to diversify the output."""
        # Variations in pose, angle, and action
        variations = [
            "slightly different pose", "different angle", "slight head turn",
            "subtle expression change", "slightly different lighting",
            "minor composition variation"]
        # Pick a variation by index (cycling through the list)
        variation = variations[variation_index % len(variations)]
        # Append it with 70% probability
        if random.random() < 0.7:
            return f"{prompt}, {variation}"
        else:
            return prompt

    def ensure_consistency(self, images, prompt, threshold=0.8):
        """Check that the generated images are stylistically consistent."""
        # A consistency check could be implemented here, for example by
        # extracting features and comparing similarities.
        # Simplified: return all images (a real application needs an actual check)
        return images, [True] * len(images), "Consistency check not implemented"
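The per-image seeds and prompt variations follow a simple pattern that can be checked without loading a model. The sketch below reproduces only that bookkeeping: `derive_seeds` and `variation_for` are hypothetical helpers, and the random fallback for a missing base seed is an assumption made so that each image still gets a distinct seed.

```python
import random

VARIATIONS = [
    "slightly different pose", "different angle", "slight head turn",
    "subtle expression change", "slightly different lighting",
    "minor composition variation",
]


def derive_seeds(base_seed=None, num_images=6, stride=1000):
    """Return one seed per image: base, base + 1000, base + 2000, ..."""
    if base_seed is None:
        # Assumption: pick a random base so repeated runs differ
        base_seed = random.randint(0, 2**31 - 1)
    return [base_seed + i * stride for i in range(num_images)]


def variation_for(index):
    """Variation phrase chosen for the image at this index (cycling through the list)."""
    return VARIATIONS[index % len(VARIATIONS)]


print(derive_seeds(42))
# [42, 1042, 2042, 3042, 4042, 5042]

# Image 0 keeps the unmodified prompt, so only indices 1..5 pick a variation
print([variation_for(i) for i in range(1, 6)])
# ['different angle', 'slight head turn', 'subtle expression change',
#  'slightly different lighting', 'minor composition variation']
```

With six images, the first phrase in the list ("slightly different pose") is never selected, because index 0 is reserved for the base image. Recording the seeds alongside the results makes any single image reproducible later.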

7. Six-Panel Grid Composition Module

7.1 Grid Compositor

from PIL import Image, ImageDraw, ImageFont


class GridCompositor:
    """Composes 6 images into a six-panel grid layout."""

    def __init__(self, grid_size=(3, 2), output_size=(1200, 800)):
        self.grid_size = grid_size  # (cols, rows)
        self.output_size = output_size
        self.cell_padding = 10  # spacing between cells

    def create_grid(self, images, background_color=(255, 255, 255)):
        """Create the grid image. Returns (image, error)."""
        if len(images) != 6:
            return None, f"Expected 6 images, got {len(images)}"
        try:
            # Compute the size of each cell
            total_width = self.output_size[0]
            total_height = self.output_size[1]
            # Subtract the outer margins and the gaps between cells
            available_width = total_width - (self.grid_size[0] + 1) * self.cell_padding
            available_height = total_height - (self.grid_size[1] + 1) * self.cell_padding
            cell_width = available_width // self.grid_size[0]
            cell_height = available_height // self.grid_size[1]
            # Create the canvas
            grid_image = Image.new('RGB', self.output_size, color=background_color)
            # Place the images on the grid
            for i, img in enumerate(images):
                # Fit the image into its cell
                img = self._resize_image(img, (cell_width, cell_height))
                # Compute the cell position
                row = i // self.grid_size[0]
                col = i % self.grid_size[0]
                x = self.cell_padding + col * (cell_width + self.cell_padding)
                y = self.cell_padding + row * (cell_height + self.cell_padding)
                # Paste the image
                grid_image.paste(img, (x, y))
            return grid_image, None
        except Exception as e:
            return None, f"Error creating grid: {e}"

    def _resize_image(self, image, target_size):
        """Resize an image to fit target_size, preserving the aspect ratio."""
        original_width, original_height = image.size
        target_width, target_height = target_size
        # Scale factor that preserves the aspect ratio
        ratio = min(target_width / original_width, target_height / original_height)
        new_width = int(original_width * ratio)
        new_height = int(original_height * ratio)
        resized_image = image.resize((new_width, new_height), Image.Resampling.LANCZOS)
        # Center the resized image on a white canvas of the target size
        result_image = Image.new('RGB', target_size, (255, 255, 255))
        x = (target_width - new_width) // 2
        y = (target_height - new_height) // 2
        result_image.paste(resized_image, (x, y))
        return result_image

    def add_border_and_text(self, grid_image, title=None, border_width=5, border_color=(0, 0, 0)):
        """Add a border and an optional title. Returns (image, error)."""
        try:
            # Add the border
            width, height = grid_image.size
            bordered_image = Image.new(
                'RGB', (width + 2 * border_width, height + 2 * border_width), border_color)
            bordered_image.paste(grid_image, (border_width, border_width))
            # Add the title, if any
            if title:
                bordered_image = self._add_title(bordered_image, title)
            return bordered_image, None
        except Exception as e:
            return None, f"Error adding border and text: {e}"

    def _add_title(self, image, title):
        """Add a title strip above the image."""
        try:
            # Create a taller canvas to hold the title
            width, height = image.size
            title_height = 50
            titled_image = Image.new('RGB', (width, height + title_height), (255, 255, 255))
            # Paste the original image below the title strip
            titled_image.paste(image, (0, title_height))
            draw = ImageDraw.Draw(titled_image)
            # Try to load a TrueType font; fall back to the default font
            try:
                font = ImageFont.truetype("arial.ttf", 30)
            except IOError:
                font = ImageFont.load_default()
            # Center the text horizontally
            text_width = draw.textlength(title, font=font)
            text_x = (width - text_width) // 2
            text_y = (title_height - 30) // 2
            draw.text((text_x, text_y), title, fill=(0, 0, 0), font=font)
            return titled_image
        except Exception as e:
            print(f"Error adding title: {e}")
            return image
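The cell geometry is easy to verify by hand. The following sketch computes the cell size and the top-left corner of each cell for the default 3 by 2 grid on a 1200 by 800 canvas with 10 px padding; `grid_layout` is a hypothetical helper that repeats only the arithmetic, with no image handling.

```python
def grid_layout(output_size=(1200, 800), grid_size=(3, 2), padding=10):
    """Return the cell size and the top-left corner of every cell, row by row."""
    cols, rows = grid_size
    # Padding appears on both outer edges and between cells: cols + 1 gaps per row
    cell_w = (output_size[0] - (cols + 1) * padding) // cols
    cell_h = (output_size[1] - (rows + 1) * padding) // rows
    positions = []
    for i in range(cols * rows):
        row, col = divmod(i, cols)
        positions.append((padding + col * (cell_w + padding),
                          padding + row * (cell_h + padding)))
    return (cell_w, cell_h), positions


cell, positions = grid_layout()
print(cell)       # (386, 385)
print(positions)  # [(10, 10), (406, 10), (802, 10), (10, 405), (406, 405), (802, 405)]
```

The width works out as (1200 - 4 * 10) // 3 = 386 and the height as (800 - 3 * 10) // 2 = 385. Integer division leaves 2 px unused horizontally, so the right margin is 12 px rather than 10 px (802 + 386 = 1188).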

8. End-to-End Integration

8.1 Main Workflow Controller

import glob
import json
import os
from datetime import datetime


class PhotoAgent:
    """Main workflow controller that ties all modules together."""

    def __init__(self, config_path="config.json"):
        self.config = self.load_config(config_path)
        self.preprocessor = ImagePreprocessor()
        self.feature_extractor = FeatureExtractor()
        self.prompt_generator = PromptGenerator()
        self.image_generator = ImageGenerator()
        # Pass the configured output size to the compositor (JSON yields a list, so convert)
        self.grid_compositor = GridCompositor(output_size=tuple(self.config["output_size"]))
        # Create the required directories
        self._create_directories()

    def load_config(self, config_path):
        """Load the config file, falling back to defaults."""
        default_config = {
            "upload_dir": "uploads",
            "processed_dir": "processed",
            "generated_dir": "generated",
            "grids_dir": "grids",
            "min_similarity": 0.8,
            "num_images": 6,
            "output_size": (1200, 800)
        }
        try:
            with open(config_path, 'r') as f:
                user_config = json.load(f)
            default_config.update(user_config)
        except FileNotFoundError:
            print(f"Config file {config_path} not found, using default config")
        return default_config

    def _create_directories(self):
        """Create the required directories."""
        for key in ("upload_dir", "processed_dir", "generated_dir", "grids_dir"):
            os.makedirs(self.config[key], exist_ok=True)

    def process_user_photo(self, uploaded_file_path):
        """Run the full workflow on an uploaded photo. Returns (result, error)."""
        # Build a unique task ID
        task_id = datetime.now().strftime("%Y%m%d_%H%M%S") + "_" + os.path.basename(uploaded_file_path)
        task_id = task_id.replace(".", "_")
        print(f"Starting processing task: {task_id}")

        # 1. Preprocess the user's photo
        processed_path = os.path.join(self.config["processed_dir"], f"processed_{task_id}.jpg")
        preprocessed = self.preprocessor.preprocess_image(uploaded_file_path, processed_path)
        # Guard against a failed load before unpacking
        if preprocessed is None or preprocessed[0] is None:
            return None, "Failed to preprocess image"
        processed_image, quality_score, issues = preprocessed
        print(f"Image preprocessed. Quality score: {quality_score}, Issues: {issues}")

        # 2. Find the most similar reference image
        # (on failure, the second value is an error message rather than a score)
        reference_id, similarity = self.feature_extractor.find_most_similar_reference(
            processed_path, self.config["min_similarity"])
        if reference_id is None:
            return None, f"Failed to find similar reference: {similarity}"
        print(f"Found similar reference: {reference_id} with similarity: {similarity:.2f}")

        # 3. Generate the prompt
        prompt, error = self.prompt_generator.generate_prompt(reference_id, None, similarity)
        if error:
            return None, f"Failed to generate prompt: {error}"
        print(f"Generated prompt: {prompt}")

        # 4. Generate diverse images
        # (the error message is always the last element of the returned tuple)
        generated = self.image_generator.generate_images(prompt, self.config["num_images"])
        images, error = generated[0], generated[-1]
        if error:
            return None, f"Failed to generate images: {error}"
        seeds = generated[1]
        print(f"Generated {len(images)} images")

        # 5. Save the generated images
        generated_paths = []
        for i, img in enumerate(images):
            img_path = os.path.join(self.config["generated_dir"], f"gen_{task_id}_{i}.jpg")
            img.save(img_path, quality=95)
            generated_paths.append(img_path)

        # 6. Create the grid (the output size is set on the compositor, not passed here)
        grid_image, error = self.grid_compositor.create_grid(images)
        if error:
            return None, f"Failed to create grid: {error}"
        # Add a title and border
        title = f"Photo Variations - {datetime.now().strftime('%Y-%m-%d %H:%M')}"
        final_grid, error = self.grid_compositor.add_border_and_text(grid_image, title)
        if error:
            return None, f"Failed to add border and text: {error}"
        # Save the grid
        grid_path = os.path.join(self.config["grids_dir"], f"grid_{task_id}.jpg")
        final_grid.save(grid_path, quality=95)
        print(f"Grid saved to: {grid_path}")

        # 7. Return the result
        result = {
            "task_id": task_id,
            "original_image": uploaded_file_path,
            "processed_image": processed_path,
            "reference_id": reference_id,
            "similarity": float(similarity),
            "prompt": prompt,
            "generated_images": generated_paths,
            "grid_image": grid_path,
            "quality_score": quality_score,
            "issues": issues,
            "seeds": seeds
        }
        # Save the task metadata
        metadata_path = os.path.join(self.config["grids_dir"], f"metadata_{task_id}.json")
        with open(metadata_path, 'w') as f:
            json.dump(result, f, indent=2)
        return result, None

    def cleanup_task(self, task_id, keep_grid=True):
        """Remove the temporary files produced by a task. Returns (ok, error)."""
        try:
            # Delete the processed image
            processed_pattern = os.path.join(self.config["processed_dir"], f"*{task_id}*")
            for f in glob.glob(processed_pattern):
                os.remove(f)
            # Delete the generated images
            generated_pattern = os.path.join(self.config["generated_dir"], f"*{task_id}*")
            for f in glob.glob(generated_pattern):
                os.remove(f)
            # Delete the metadata file
            metadata_path = os.path.join(self.config["grids_dir"], f"metadata_{task_id}.json")
            if os.path.exists(metadata_path):
                os.remove(metadata_path)
            # Optionally delete the grid image
            if not keep_grid:
                grid_path = os.path.join(self.config["grids_dir"], f"grid_{task_id}.jpg")
                if os.path.exists(grid_path):
                    os.remove(grid_path)
            print(f"Cleaned up files for task: {task_id}")
            return True, None
        except Exception as e:
            return False, f"Error during cleanup: {e}"

8.2 Web API

import json
import os

from flask import Flask, request, send_file
from flask_restful import Api, Resource

# Initialize the Flask app and the API
app = Flask(__name__)
api = Api(app)

# Global PhotoAgent instance
photo_agent = PhotoAgent()


class UploadPhoto(Resource):
    def post(self):
        # Placeholder: the file upload logic from section 3.1 goes here
        return {'error': 'Upload not implemented in this resource'}, 501


class ProcessPhoto(Resource):
    def post(self):
        # Read the uploaded file's path from the JSON body
        data = request.get_json(silent=True) or {}
        if 'filepath' not in data:
            return {'error': 'No filepath provided'}, 400
        # Note: this path comes from the client; in production, restrict it
        # to the upload directory before using it
        filepath = data['filepath']
        # Process the photo
        result, error = photo_agent.process_user_photo(filepath)
        if error:
            return {'error': error}, 500
        return {'message': 'Processing completed', 'result': result}, 200


class GetResult(Resource):
    def get(self, task_id):
        # Look up the task's metadata
        metadata_path = os.path.join(photo_agent.config["grids_dir"], f"metadata_{task_id}.json")
        if not os.path.exists(metadata_path):
            return {'error': 'Task not found'}, 404
        with open(metadata_path, 'r') as f:
            result = json.load(f)
        return {'result': result}, 200


class DownloadGrid(Resource):
    def get(self, task_id):
        # Serve the grid image as a download
        grid_path = os.path.join(photo_agent.config["grids_dir"], f"grid_{task_id}.jpg")
        if not os.path.exists(grid_path):
            return {'error': 'Grid not found'}, 404
        return send_file(grid_path, as_attachment=True)


# Register the API routes
api.add_resource(UploadPhoto, '/upload')
api.add_resource(ProcessPhoto, '/process')
api.add_resource(GetResult, '/result/<string:task_id>')
api.add_resource(DownloadGrid, '/download/<string:task_id>')

if __name__ == '__main__':
    # debug=True would expose the interactive debugger on all interfaces, so keep it off
    app.run(debug=False, host='0.0.0.0', port=5000)
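Because the `/process` endpoint receives a file path from the client, it is worth checking that the path stays inside the upload directory before opening it. The following is one possible sketch using only the standard library; `is_inside_directory` is a hypothetical helper, and the directory name `uploads` is taken from the default configuration.

```python
import os


def is_inside_directory(directory: str, path: str) -> bool:
    """Return True if path resolves to a location inside directory."""
    base = os.path.realpath(directory)
    target = os.path.realpath(path)
    try:
        # commonpath compares whole path components, so "uploads_evil"
        # is not mistaken for a child of "uploads"
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives
        return False


for candidate in ["uploads/selfie.jpg", "uploads/../config.json",
                  "uploads_evil/selfie.jpg", "/etc/passwd"]:
    print(candidate, is_inside_directory("uploads", candidate))
# uploads/selfie.jpg True
# uploads/../config.json False
# uploads_evil/selfie.jpg False
# /etc/passwd False
```

A request handler could return a 400 response when this check fails. An alternative design is to accept only the generated filename from the upload response and join it to the upload directory on the server side.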

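9. Performance Optimization and Error Handling

9.1 Performance Optimization Strategies

import gc

import torch


class PerformanceOptimizer:
    """Collects performance optimization strategies."""

    def __init__(self):
        self.cache = {}

    def enable_model_caching(self, pipeline):
        """Enable model caching to reduce load time."""
        # Model caching logic could be implemented here
        pass

    def optimize_memory_usage(self):
        """Free memory that is no longer needed."""
        # Clear the GPU cache
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        # Run Python garbage collection
        gc.collect()

    def batch_processing(self, tasks):
        """Process tasks in batches to improve throughput."""
        # Batch processing logic could be implemented here
        pass

    def adaptive_quality_settings(self, image_size, complexity):
        """Pick generation settings based on image size and complexity."""
        # Smaller or less complex images can use fewer inference steps
        if image_size[0] * image_size[1] < 500000:  # under 0.5 MP
            return {"num_inference_steps": 20, "guidance_scale": 7.0}
        elif complexity == "low":
            return {"num_inference_steps": 22, "guidance_scale": 7.2}
        else:
            return {"num_inference_steps": 25, "guidance_scale": 7.5}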

9.2 Error Handling and Logging

import logging
import os
import shutil
import traceback
from logging.handlers import RotatingFileHandler

from PIL import Image


def setup_logging():
    """Configure logging to a rotating file and the console."""
    # Create the log directory
    os.makedirs("logs", exist_ok=True)
    # Configure the root logger
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[
            RotatingFileHandler("logs/photo_agent.log", maxBytes=10*1024*1024, backupCount=5),
            logging.StreamHandler()
        ]
    )


class ErrorHandler:
    """Provides a uniform error handling mechanism."""

    def __init__(self):
        self.logger = logging.getLogger(__name__)

    def handle_exception(self, e, context=""):
        """Log an exception with its traceback and return the error message."""
        error_msg = f"Error in {context}: {str(e)}"
        self.logger.error(error_msg)
        self.logger.error(traceback.format_exc())
        return error_msg

    def validate_image_file(self, filepath):
        """Check that a file is a valid image. Returns (ok, error)."""
        try:
            with Image.open(filepath) as img:
                img.verify()
            return True, None
        except Exception as e:
            return False, f"Invalid image file: {e}"

    def check_disk_space(self, required_mb=100):
        """Return (enough_space, free_mb), or (False, message) on error."""
        try:
            # shutil.disk_usage works on both Unix and Windows (os.statvfs is Unix-only)
            free_space = shutil.disk_usage('.').free / 1024 / 1024  # MB
            return free_space >= required_mb, free_space
        except Exception as e:
            return False, f"Error checking disk space: {e}"

10. Testing and Validation

10.1 Unit Tests

import os
import unittest
from unittest.mock import patch

from PIL import Image

# PhotoAgent, FeatureExtractor, and ImageGenerator are the classes defined in the sections above


class TestPhotoAgent(unittest.TestCase):
    """Unit tests for PhotoAgent."""

    def setUp(self):
        """Set up the test environment."""
        # Skip loading the diffusion model; these tests do not need it
        patcher = patch.object(ImageGenerator, 'load_model')
        patcher.start()
        self.addCleanup(patcher.stop)
        self.agent = PhotoAgent()
        # Register a reference entry so prompt generation can find "ref_123"
        self.agent.prompt_generator.reference_db["ref_123"] = {
            "base_prompt": "A professional portrait", "style": "studio portrait"}
        # Create the test image
        os.makedirs("test_data", exist_ok=True)
        self.test_image_path = "test_data/test_photo.jpg"
        Image.new('RGB', (400, 400), color=(73, 109, 137)).save(self.test_image_path)

    def test_image_preprocessing(self):
        """Test image preprocessing."""
        result, quality, issues = self.agent.preprocessor.preprocess_image(self.test_image_path)
        self.assertIsNotNone(result)
        self.assertIsInstance(quality, (int, float))
        self.assertIsInstance(issues, list)

    @patch.object(FeatureExtractor, 'find_most_similar_reference')
    def test_reference_matching(self, mock_find):
        """Test reference image matching."""
        # Mocked return value
        mock_find.return_value = ("ref_123", 0.85)
        ref_id, similarity = self.agent.feature_extractor.find_most_similar_reference(self.test_image_path)
        self.assertEqual(ref_id, "ref_123")
        self.assertAlmostEqual(similarity, 0.85, places=2)

    def test_prompt_generation(self):
        """Test prompt generation."""
        prompt, error = self.agent.prompt_generator.generate_prompt("ref_123", None, 0.85)
        self.assertIsNotNone(prompt)
        self.assertIsInstance(prompt, str)
        self.assertGreater(len(prompt), 10)
        self.assertIsNone(error)

    @patch.object(ImageGenerator, 'generate_images')
    def test_image_generation(self, mock_generate):
        """Test image generation."""
        # Mocked return value
        mock_images = [Image.new('RGB', (100, 100)) for _ in range(6)]
        mock_seeds = [123, 456, 789, 101112, 131415, 161718]
        mock_generate.return_value = (mock_images, mock_seeds, None)
        images, seeds, error = self.agent.image_generator.generate_images("test prompt", 6)
        self.assertEqual(len(images), 6)
        self.assertEqual(len(seeds), 6)
        self.assertIsNone(error)

    def test_grid_creation(self):
        """Test grid creation."""
        test_images = [Image.new('RGB', (200, 200)) for _ in range(6)]
        grid, error = self.agent.grid_compositor.create_grid(test_images)
        self.assertIsNotNone(grid)
        self.assertIsNone(error)
        self.assertEqual(grid.size, self.agent.grid_compositor.output_size)


if __name__ == '__main__':
    # Create the test data directory
    os.makedirs("test_data", exist_ok=True)
    # Run the tests
    unittest.main()

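10.2 Integration Test

import glob
import os
import unittest
from unittest.mock import Mock, patch

from PIL import Image


class IntegrationTest(unittest.TestCase):
    """Integration test covering the full workflow."""

    def setUp(self):
        # Skip loading the diffusion model; generate_images is mocked in the test
        patcher = patch.object(ImageGenerator, 'load_model')
        patcher.start()
        self.addCleanup(patcher.stop)
        self.agent = PhotoAgent()
        # Register the reference entry that the mocked matcher returns
        self.agent.prompt_generator.reference_db["ref_123"] = {
            "base_prompt": "A professional portrait"}
        # Create the test image
        os.makedirs("test_data", exist_ok=True)
        self.test_image = Image.new('RGB', (400, 400), color=(73, 109, 137))
        self.test_path = "test_data/test_integration.jpg"
        self.test_image.save(self.test_path)

    @patch.multiple(
        FeatureExtractor,
        find_most_similar_reference=Mock(return_value=("ref_123", 0.85)))
    @patch.multiple(
        ImageGenerator,
        generate_images=Mock(return_value=(
            [Image.new('RGB', (200, 200)) for _ in range(6)],
            [123, 456, 789, 101112, 131415, 161718],
            None)))
    def test_full_process(self):
        """Test the complete processing workflow."""
        result, error = self.agent.process_user_photo(self.test_path)
        self.assertIsNotNone(result)
        self.assertIsNone(error)
        self.assertEqual(result['reference_id'], "ref_123")
        self.assertAlmostEqual(result['similarity'], 0.85, places=2)
        self.assertEqual(len(result['generated_images']), 6)
        self.assertIn('grid_image', result)

    def tearDown(self):
        # Remove the test file
        if os.path.exists(self.test_path):
            os.remove(self.test_path)
        # Remove any files the workflow produced
        for pattern in ("processed/*test_integration*",
                        "generated/*test_integration*",
                        "grids/*test_integration*"):
            for f in glob.glob(pattern):
                os.remove(f)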

11. Deployment

11.1 Containerized Deployment with Docker

# Dockerfile
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# Set the working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    libgl1 \
    libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

# Copy the requirements file
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Create the required directories
RUN mkdir -p uploads processed generated grids logs

# Expose the port
EXPOSE 5000

# Set environment variables
ENV FLASK_APP=app.py
ENV FLASK_ENV=production

# Start the application
CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]

11.2 Kubernetes Deployment Configuration

# photo-agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: photo-agent
spec:
  replicas: 2
  selector:
    matchLabels:
      app: photo-agent
  template:
    metadata:
      labels:
        app: photo-agent
    spec:
      containers:
      - name: photo-agent
        image: photo-agent:latest
        ports:
        - containerPort: 5000
        resources:
          requests:
            memory: "4Gi"
            cpu: "1000m"
            nvidia.com/gpu: 1
          limits:
            memory: "8Gi"
            cpu: "2000m"
            nvidia.com/gpu: 1
        volumeMounts:
        # One volume backs all four directories; subPath keeps their contents separate
        - name: storage
          mountPath: /app/uploads
          subPath: uploads
        - name: storage
          mountPath: /app/processed
          subPath: processed
        - name: storage
          mountPath: /app/generated
          subPath: generated
        - name: storage
          mountPath: /app/grids
          subPath: grids
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: photo-agent-storage
---
apiVersion: v1
kind: Service
metadata:
  name: photo-agent-service
spec:
  selector:
    app: photo-agent
  ports:
  - protocol: TCP
    port: 80
    targetPort: 5000
  type: LoadBalancer

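12. Summary and Outlook

This article described a Python-based agent workflow that covers the full path from selfie upload to six-panel grid generation. The system is modular, with core modules for image preprocessing, feature extraction, prompt generation, image generation, and grid composition.

12.1 Technical Summary

  1. Image processing: OpenCV and PIL are used for image quality assessment and preprocessing
  2. Feature extraction: face recognition and style features are combined to compute similarity
  3. Prompt engineering: text prompts are generated from the reference image and the similarity score
  4. Image generation: diffusion models such as Stable Diffusion produce the diverse images
  5. Result composition: the generated images are composed into a six-panel grid and offered for download

12.2 Directions for Improvement

  1. Model optimization: explore more efficient image generation models to reduce compute requirements
  2. Personalization: tune prompt generation and image style based on the user's past preferences
  3. Real-time processing: optimize the workflow for near-real-time image generation
  4. Mobile support: build a mobile app for a more convenient user experience
  5. Multimodal support: extend the system to video input and 3D model generation

12.3 Practical Applications

The system has a range of possible applications, including:

  • Photography hobbyists creating varied portrait photos
  • E-commerce platforms generating multi-angle model shots for products
  • Social media users creating personalized avatar sets
  • Inspiration for art and design work

With continued optimization and extension, the system could become a useful tool for digital content creation.


Note: the code examples in this article need to be adapted to your environment and requirements. Model loading and inference in particular require sufficient hardware resources, especially a GPU. Building and managing the reference image database is another part that needs careful attention in a real application.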
