广州企业网站哪家好广告精准推广平台
360图片搜索功能爬虫,根据搜索内容批量获取图片,很简单就能实现
结尾附实现代码
一、抓包与分析
直接在360搜索处点击回车即可抓到
GET请求包 https://image.so.com/i?q=天空&inact=0
q——搜索框输入的内容
搜索包响应结果 1
搜索包响应结果展示 2
在响应中,这一部分的的数据便是图片的下载地址和分辨率
可以看到搜索结果的相关数据
title——unicode编码后的标题
img——图片链接
width、hight——分辨率
二、实现代码
代码运行效果1
代码运行效果2
import requests
import json
import os
from bs4 import BeautifulSoupdef fetch_image_urls(url, headers):response = requests.get(url, headers=headers)if response.status_code != 200:print(f"请求失败,状态码:{response.status_code}")return []soup = BeautifulSoup(response.text, "html.parser")script_tag = soup.find("script", {"type": "text/data", "id": "commercialImages"})if not script_tag:print("未找到标签")return []try:image_data = json.loads(script_tag.string)return [img["qhimg_url"] for img in image_data]except json.JSONDecodeError:print("解析JSON数据失败")return []def download_images(image_urls, headers, folder="data"):if not os.path.exists(folder):os.makedirs(folder)for i, url in enumerate(image_urls):try:img_response = requests.get(url, headers=headers)if img_response.status_code == 200:with open(f"{folder}/image_{i}.jpg", "wb") as f:f.write(img_response.content)print(f"下载成功: image_{i}.jpg")else:print(f"下载失败: {url}")except Exception as e:print(f"下载错误: {e}")if __name__ == "__main__":url = "https://image.so.com/i?q=海洋&inact=0"headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36"}image_urls = fetch_image_urls(url, headers)if image_urls:print("图片下载链接:")for url in image_urls:print(url)download_images(image_urls, headers)