
The Ultimate Guide to the Python Requests Library

news 2025/7/13 15:59:38

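I. History

Origins
Kenneth Reitz created the Requests library in 2011 as a replacement for urllib2 in the Python standard library, aiming for a simpler, more human-friendly HTTP client interface. Its design philosophy is "HTTP for Humans".
Key releases
- 2011: v0.2.0 released, with support for GET and POST.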
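- 2012: v1.0.0 released with a refactored API that includes Session objects and streaming downloads.
- 2013: v2.0.0 released. Python 3 support and the urllib3-based transport both predate this release.
- 2019: The project moved to the Python Software Foundation (PSF) organization on GitHub.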

II. Basic Methods and Core Features

import requests

# 1. GET request
response = requests.get(
    url="http://httpbin.org/get",
    params={"key": "value"},  # query parameters
    headers={"User-Agent": "demo"},  # request headers
    timeout=5,  # timeout in seconds
)
print(response.text)  # response body as text

# 2. POST request (form data)
response = requests.post(
    url="http://httpbin.org/post",
    data={"name": "Alice"},  # form-encoded body
)

# 3. POST request (JSON data)
response = requests.post(
    url="http://httpbin.org/post",
    json={"name": "Bob"},  # sets Content-Type to application/json automatically
)

# 4. File upload
with open("test.txt", "rb") as f:  # the file is closed once the upload finishes
    response = requests.post("http://httpbin.org/post", files={"file": f})

# 5. Handling the response
print(response.status_code)  # HTTP status code
print(response.headers)      # response headers
print(response.json())       # body parsed as JSON
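
Printing the status code does not stop a script when a request fails, so it can help to see how raise_for_status() behaves. The following is a minimal sketch rather than part of the original article: the describe helper is a hypothetical name, and the Response objects are built by hand only so that the example runs without a network connection.

```python
import requests


def describe(response):
    """Return 'status content-type', raising HTTPError for 4xx/5xx responses."""
    response.raise_for_status()
    content_type = response.headers.get("Content-Type", "unknown")
    return f"{response.status_code} {content_type}"


# Hand-built responses (illustration only; real ones come from requests.get etc.)
ok = requests.Response()
ok.status_code = 200
ok.headers["Content-Type"] = "application/json"

missing = requests.Response()
missing.status_code = 404
missing.url = "http://httpbin.org/status/404"

summary = describe(ok)

try:
    describe(missing)
    error = None
except requests.exceptions.HTTPError as exc:
    error = exc  # the failing response is available as exc.response
```

In real code the same try/except would wrap the call that performs the request, typically together with a timeout.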

III. How It Works

Underlying dependency
Requests is built on urllib3, which provides the low-level machinery: connection pooling, SSL/TLS verification, and retries.
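
The retry support mentioned above is not switched on for failed responses by default; one common way to enable it is to pass a urllib3 Retry object to an HTTPAdapter. This is a sketch under assumptions: the retry count, backoff factor, and status codes below are illustrative values, not recommendations from the original article.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Illustrative policy: up to 3 retries, exponential backoff,
# and retry on these gateway/server status codes.
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])

adapter = HTTPAdapter(max_retries=retry)

session = requests.Session()
session.mount("http://", adapter)   # use this adapter for all http:// URLs
session.mount("https://", adapter)  # and for all https:// URLs

# session.get("https://example.com", timeout=5) would now retry per the policy
```

By default, Retry only retries idempotent methods on status codes, so POST requests are not replayed unless the policy is changed explicitly.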

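Session objects
A Session reuses TCP connections across requests to the same host, which improves performance, and it keeps cookies between requests:

import requests

with requests.Session() as session:
    session.get("http://httpbin.org/cookies/set/sessioncookie/123")
    response = session.get("http://httpbin.org/cookies")  # the cookie is sent automatically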
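
Session state can also be inspected without touching the network. The sketch below sets a default header and a cookie on a Session and then prepares a request to see what would be sent; the X-Demo header name is made up for illustration.

```python
import requests

with requests.Session() as session:
    # Defaults stored on the session apply to every request it prepares
    session.headers.update({"X-Demo": "1"})
    session.cookies.set("sessioncookie", "123")

    # prepare_request merges session state into the outgoing request
    prepared = session.prepare_request(
        requests.Request("GET", "http://httpbin.org/cookies")
    )

    demo_header = prepared.headers["X-Demo"]
    cookie_header = prepared.headers["Cookie"]
```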

Request processing flow
- Build a PreparedRequest object.
- Send it through a transport adapter.
- Return a Response object that wraps the status code, headers, body, and so on.
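
The three steps above can be walked through by hand. This is a sketch that stops before the network call; the URL and parameters simply reuse the httpbin examples from earlier in the article.

```python
import requests
from requests.adapters import HTTPAdapter

# Step 1: describe the request, then turn it into a PreparedRequest
request = requests.Request(
    "GET",
    "http://httpbin.org/get",
    params={"key": "value"},
    headers={"User-Agent": "demo"},
)
prepared = request.prepare()

with requests.Session() as session:
    # Step 2: the session picks the adapter mounted for the URL prefix
    adapter = session.get_adapter(prepared.url)

    # Step 3 (needs network access, so it is left commented out):
    # response = session.send(prepared, timeout=5)
```

Calling request.prepare() directly ignores session-level cookies and headers; session.prepare_request(request) is the variant that merges them in.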

IV. Use Cases and Code Examples

1. Web scraping (with exception handling)

import requests
from bs4 import BeautifulSoup

try:
    response = requests.get(
        "https://example.com",
        headers={"User-Agent": "Mozilla/5.0"},  # mimic a browser
        timeout=10,
    )
    response.raise_for_status()  # raise on 4xx/5xx responses

    soup = BeautifulSoup(response.text, "html.parser")
    print(soup.title.text)
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

2. Working with a REST API

import requests

# Fetch a GitHub user's profile
url = "https://api.github.com/users/octocat"
response = requests.get(url, timeout=10)
if response.status_code == 200:
    data = response.json()
    print(f"User: {data['login']}, public repos: {data['public_repos']}")
else:
    print(f"Error: {response.status_code}")

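3. Streaming download of a large file

import requests

url = "https://example.com/large_file.zip"

# stream=True defers downloading the body until it is iterated
with requests.get(url, stream=True, timeout=10) as response:
    response.raise_for_status()
    with open("large_file.zip", "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:  # skip keep-alive chunks
                f.write(chunk)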

4. OAuth2 authentication

import requests

# Obtain an access token with the OAuth2 client-credentials grant
auth_url = "https://oauth.example.com/token"
data = {
    "grant_type": "client_credentials",
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_SECRET",
}
response = requests.post(auth_url, data=data, timeout=10)
response.raise_for_status()  # fail early if the token request was rejected
access_token = response.json()["access_token"]

# Call the API with the token
headers = {"Authorization": f"Bearer {access_token}"}
response = requests.get("https://api.example.com/data", headers=headers, timeout=10)

5. Proxy configuration

import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)

V. Advanced Features

Custom adapters
Control the size of the connection pool:

import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=100)
session.mount("https://", adapter)

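Request hooks
Attach callbacks that run when a response is received (response is the only hook Requests exposes):

import requests

def print_url(response, *args, **kwargs):
    print(f"Request URL: {response.url}")

requests.get("http://httpbin.org", hooks={"response": print_url})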

Disabling SSL verification
(For test environments only!)

requests.get("https://example.com", verify=False)
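
Outside of throwaway tests, pointing verify at a CA bundle is usually a safer choice than turning verification off. A minimal sketch, assuming the default certifi bundle that ships with Requests; for a private certificate authority the path would instead point at that CA's PEM file.

```python
import os

import requests
from requests import certs

# Path to the CA bundle Requests uses by default (provided by certifi)
ca_bundle = certs.where()
bundle_exists = os.path.isfile(ca_bundle)

session = requests.Session()
# A file path here means "verify against this bundle";
# it applies to every request made through the session.
session.verify = ca_bundle

# session.get("https://example.com", timeout=5) would verify against ca_bundle
```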

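VI. Strengths, Use Cases, and Alternatives

Strengths: a concise API, a rich feature set, and thorough documentation.
Typical uses: web crawlers, API testing, communication between microservices, and similar tasks.
Alternatives: httpx (supports both sync and async), aiohttp (an asyncio-based framework).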

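VII. Implementation Details in Depth

1. Connection pool management

Requests relies on urllib3 for efficient connection pooling, which avoids the cost of repeatedly opening new connections. Each Session object maintains its own connection pools.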

from requests.adapters import HTTPAdapter

session = requests.Session()
adapter = HTTPAdapter(pool_connections=5, pool_maxsize=20)
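
The point that each Session keeps its own pools can be checked without any network traffic. The sketch below is an illustration, not the continuation of the snippet above: it mounts a sized adapter on one session and confirms that a second session still uses a separate adapter, and therefore separate pools.

```python
import requests
from requests.adapters import HTTPAdapter

url = "https://example.com/"

with requests.Session() as first, requests.Session() as second:
    # pool_connections: how many per-host pools to cache
    # pool_maxsize: how many connections to keep in each pool
    first.mount("https://", HTTPAdapter(pool_connections=5, pool_maxsize=20))

    first_adapter = first.get_adapter(url)
    second_adapter = second.get_adapter(url)

    # Adapters (and the pools they own) are never shared between sessions
    shares_pool = first_adapter is second_adapter
```

Because pools live on the adapter, reusing one long-lived Session per target service is what lets connections actually be reused.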
