[A Go-To Tool for Python Scraping] Common requests Operations Explained, with a Hands-On Example

news 2025/10/8 12:26:14

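Hi everyone, Uncle Tang here. Today we're looking at requests, one of the most widely used HTTP libraries in Python. How important is it? If you want to make network requests in Python, requests is the first tool to reach for.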

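Contents

- 1. Introduction to the requests library
  - 1.1 What is requests?
  - 1.2 Why use requests?
- 2. Installing requests
- 3. Common requests operations
  - 3.1 GET requests
  - 3.2 POST requests
  - 3.3 Setting request headers
  - 3.4 Handling cookies
  - 3.5 File uploads
  - 3.6 Timeouts
  - 3.7 Persistent sessions
- 4. Hands-on example: fetching weather information
  - Goal
  - Code walkthrough
- 5. Summary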

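1. Introduction to the requests library

1.1 What is requests?

requests is a third-party HTTP library for Python. Its slogan is "HTTP for Humans". Compared with urllib from the standard library, its API is shorter and cleaner, so sending an HTTP request usually takes a single line.

1.2 Why use requests?

- Concise, readable code
- Response text is decoded automatically (the encoding is guessed from the HTTP headers and can be overridden via response.encoding)
- Supports connection keep-alive and sessions
- Built-in JSON decoder (response.json())
- Active community and thorough documentation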

2. Installing requests

Before you start, make sure the requests library is installed:

pip install requests
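
To confirm that the install landed in the interpreter you are actually using, a quick check like the following sketch can help. It only imports the package and prints its version string.

```python
import requests

# If this import succeeds, the package is available in the current environment.
# The version string is handy when comparing against the documentation.
print(requests.__version__)
```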

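3. Common requests operations

3.1 GET requests

Use cases: fetching web page content, calling API endpoints

import requests

# Basic GET request
response = requests.get('https://www.baidu.com')
print(response.text)  # the page's HTML, decoded to text

# GET request with query parameters
params = {'key1': 'value1', 'key2': 'value2'}
response = requests.get('https://httpbin.org/get', params=params)
print(response.url)  # the URL that was actually requested, including the query string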
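
To see how params ends up in the URL without touching the network, one option is to build a prepared request and inspect it. This is a minimal sketch: the second parameter set (q, tag) is made up for illustration and is not part of the original example.

```python
import requests

# Build the request without sending it, then look at the final URL.
params = {'key1': 'value1', 'key2': 'value2'}
prepared_simple = requests.Request(
    'GET', 'https://httpbin.org/get', params=params
).prepare()
print(prepared_simple.url)  # https://httpbin.org/get?key1=value1&key2=value2

# Values are URL-encoded for you, and a list value repeats the key.
# (The 'q' and 'tag' parameters are hypothetical.)
prepared_encoded = requests.Request(
    'GET', 'https://httpbin.org/get',
    params={'q': 'python requests', 'tag': ['a', 'b']},
).prepare()
print(prepared_encoded.url)  # https://httpbin.org/get?q=python+requests&tag=a&tag=b
```

Letting requests build the query string avoids hand-written string concatenation and the escaping mistakes that come with it.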

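3.2 POST requests

Use cases: submitting form data, uploading files, calling APIs that take parameters in the request body

# Basic POST request (data= sends the dict as form fields)
data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', data=data)
print(response.json())  # parse the response body as JSON

# Send JSON data: json= serializes the dict and sets Content-Type: application/json,
# so there is no need to import the json module
data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', json=data)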
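
The difference between data= and json= is easiest to see by comparing what each one puts on the wire. The sketch below prepares both requests without sending them and prints the resulting Content-Type header and body.

```python
import requests

url = 'https://httpbin.org/post'
payload = {'key': 'value'}

# Prepare (but do not send) the same payload in both styles.
form_req = requests.Request('POST', url, data=payload).prepare()
json_req = requests.Request('POST', url, json=payload).prepare()

print(form_req.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form_req.body)                     # key=value

print(json_req.headers['Content-Type'])  # application/json
print(json_req.body)                     # the JSON text (bytes or str, depending on the requests version)
```

Which one to use depends on what the server expects: HTML form handlers usually want the form encoding, while most JSON APIs want json=.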

3.3 Setting request headers

Use cases: imitating a browser, sending authentication information

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'application/json'
}
response = requests.get('https://httpbin.org/headers', headers=headers)
print(response.json())
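
Why bother overriding User-Agent? By default requests identifies itself as python-requests, which some sites treat differently from a browser. The sketch below prints the defaults and then shows how custom headers are merged on top of them. It prepares the request through a Session without sending it.

```python
import requests

# Headers that requests sends when you do not specify any.
default_headers = requests.utils.default_headers()
print(default_headers['User-Agent'])  # starts with "python-requests/"

custom = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
}

# A Session merges its default headers with the per-request ones.
with requests.Session() as session:
    prepared = session.prepare_request(
        requests.Request('GET', 'https://httpbin.org/headers', headers=custom)
    )

print(prepared.headers['User-Agent'])  # the custom value wins
print(prepared.headers['accept'])      # lookups are case-insensitive; the default Accept is kept
```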

3.4 Handling cookies

Use cases: staying logged in, tracking a session

# Read the cookies the server set
response = requests.get('https://www.example.com')
print(response.cookies)

# Send cookies with a request
cookies = dict(key1='value1')
response = requests.get('https://httpbin.org/cookies', cookies=cookies)
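
A plain dict sends the cookie to whatever URL you request. If you need cookies scoped to a domain and path, a RequestsCookieJar is one way to do it. In this sketch the cookie named other and the example.com domain are made up to show that a non-matching cookie is left out. Nothing is sent over the network.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
jar.set('key1', 'value1', domain='httpbin.org', path='/cookies')
jar.set('other', 'x', domain='example.com', path='/')  # hypothetical, different domain

# The jar behaves like a dict for simple lookups.
print(jar.get('key1'))                          # value1
print(requests.utils.dict_from_cookiejar(jar))  # plain dict with both cookies

# Only cookies matching the request's domain and path go into the Cookie header.
prepared = requests.Request(
    'GET', 'https://httpbin.org/cookies', cookies=jar
).prepare()
print(prepared.headers['Cookie'])  # key1=value1
```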

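3.5 File uploads

Use cases: uploading images, documents, and other files

# Open the file in a with-block so the handle is closed once the upload finishes
with open('test.jpg', 'rb') as f:
    files = {'file': f}
    response = requests.post('https://httpbin.org/post', files=files)
print(response.text)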
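
The files argument also accepts a (filename, file object, content type) tuple, which lets you control the file name and MIME type the server sees, and it can be combined with ordinary form fields via data=. The sketch below uses an in-memory placeholder instead of a real image, and the description field is a made-up example. The request is prepared but not sent.

```python
import io
import requests

# Placeholder bytes standing in for a real image file.
fake_image = io.BytesIO(b'placeholder bytes, not a real JPEG')

files = {'file': ('test.jpg', fake_image, 'image/jpeg')}
data = {'description': 'demo upload'}  # hypothetical extra form field

prepared = requests.Request(
    'POST', 'https://httpbin.org/post', files=files, data=data
).prepare()

# requests builds a multipart/form-data body and picks the boundary for you.
print(prepared.headers['Content-Type'])
print(len(prepared.body), 'bytes in the encoded body')
```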

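3.6 Timeouts

Use cases: keeping a request from hanging indefinitely when the server does not respond

try:
    # 3-second timeout: applies to connecting and to each wait for data, not to the whole download
    response = requests.get('https://www.example.com', timeout=3)
except requests.exceptions.Timeout:
    print("The request timed out!")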
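
timeout also accepts a (connect, read) tuple, and requests raises ConnectTimeout or ReadTimeout (both subclasses of Timeout) so you can tell the two cases apart. Here is a minimal sketch. The helper name fetch_text and its default values are assumptions for illustration, not part of the library.

```python
import requests


def fetch_text(url, connect_timeout=3.05, read_timeout=10):
    """Return the response body as text, or None if the request times out.

    fetch_text is a hypothetical helper; other errors (for example HTTP 4xx/5xx
    raised by raise_for_status) are deliberately left to propagate.
    """
    try:
        response = requests.get(url, timeout=(connect_timeout, read_timeout))
        response.raise_for_status()
    except requests.exceptions.ConnectTimeout:
        print("Could not establish a connection in time")
        return None
    except requests.exceptions.ReadTimeout:
        print("Connected, but the server took too long to send data")
        return None
    return response.text


if __name__ == "__main__":
    text = fetch_text('https://www.example.com')
    print("timed out" if text is None else f"received {len(text)} characters")
```

A short connect timeout with a longer read timeout is a common split: a server that cannot even be reached fails fast, while a slow but working endpoint gets more time.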

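3.7 Persistent sessions

Use cases: multi-request workflows that need to stay logged in

# A Session keeps cookies (and reuses connections) across requests
s = requests.Session()
s.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
response = s.get('https://httpbin.org/cookies')
print(response.text)  # shows the cookie set by the previous request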
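
A Session can also carry default headers, and it works as a context manager so its connections are released when you are done. The sketch below sets a header and a cookie on the session by hand and then prepares a request to show that both are attached automatically. The my-crawler/0.1 User-Agent is a made-up value, and no request is actually sent.

```python
import requests

with requests.Session() as s:
    # Defaults applied to every request made through this session.
    s.headers.update({'User-Agent': 'my-crawler/0.1'})  # hypothetical UA string
    s.cookies.set('sessioncookie', '123456789', domain='httpbin.org', path='/')

    # Prepare a request through the session to see what would be sent.
    prepared = s.prepare_request(
        requests.Request('GET', 'https://httpbin.org/cookies')
    )

print(prepared.headers['User-Agent'])  # my-crawler/0.1
print(prepared.headers['Cookie'])      # sessioncookie=123456789
```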

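4. Hands-on example: fetching weather information

Let's pull the pieces above together in a small hands-on example.

Goal

Use the AMap (高德) weather API to fetch the current weather for a given city.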

import requests

def get_weather(city_name):
    # 1. Prepare the request parameters
    api_key = 'YOUR_AMAP_API_KEY'  # apply for a key on the AMap (高德) Open Platform first
    base_url = 'https://restapi.amap.com/v3/weather/weatherInfo'

    params = {
        'key': api_key,
        'city': city_name,
        'extensions': 'base',  # 'base' returns the current (live) weather
        'output': 'JSON'
    }

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
    }

    try:
        # 2. Send the request
        response = requests.get(base_url, params=params, headers=headers, timeout=5)
        response.raise_for_status()  # raise an HTTPError for 4xx/5xx responses

        # 3. Parse the response
        data = response.json()

        # Make sure 'lives' is non-empty before indexing into it
        if data['status'] == '1' and data['infocode'] == '10000' and data.get('lives'):
            weather_info = data['lives'][0]
            print(f"City: {weather_info['city']}")
            print(f"Weather: {weather_info['weather']}")
            print(f"Temperature: {weather_info['temperature']}℃")
            print(f"Wind direction: {weather_info['winddirection']}")
            print(f"Wind force (level): {weather_info['windpower']}")
        else:
            print("Failed to get weather information:", data.get('info'))

    except requests.exceptions.RequestException as e:
        print("Request failed:", e)

# Example usage (if a city name is not accepted, try the city's adcode, e.g. '110000' for Beijing)
get_weather('北京')
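
One possible refinement is to separate parsing from printing, so the parsing logic can be checked without an API key or a network call. The sketch below assumes the response has the same fields the function above reads (status, infocode, lives). The helper name parse_live_weather and the sample payload are made up for illustration and are not real API output.

```python
def parse_live_weather(data):
    """Return the first live-weather record as a dict, or None if unavailable.

    parse_live_weather is a hypothetical helper; it expects the same fields
    that get_weather() reads from the decoded JSON response.
    """
    if data.get('status') != '1' or data.get('infocode') != '10000':
        return None
    lives = data.get('lives') or []
    if not lives:
        return None
    live = lives[0]
    fields = ('city', 'weather', 'temperature', 'winddirection', 'windpower')
    return {field: live.get(field) for field in fields}


# Made-up sample payload shaped like the fields used above (not real data).
sample = {
    'status': '1',
    'infocode': '10000',
    'info': 'OK',
    'lives': [{
        'city': 'Sample City',
        'weather': 'Sunny',
        'temperature': '20',
        'winddirection': 'North',
        'windpower': '3',
    }],
}

print(parse_live_weather(sample))
print(parse_live_weather({'status': '0', 'info': 'INVALID_USER_KEY'}))  # None
```

With this split, get_weather() would only need to send the request, pass response.json() to the parser, and print the result.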