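A Comprehensive Guide to HTTPX: A Next-Generation Python HTTP Client
What Is HTTPX?
HTTPX is a full-featured HTTP client library for Python that offers both synchronous and asynchronous APIs. Its transport layer is provided by the httpcore package. It is a modern alternative to requests that keeps a familiar API and connection pooling while adding native support for HTTP/2 and asynchronous requests.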
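Key Features
- Sync and async APIs: a requests-style synchronous API and an asyncio-driven asynchronous API
- HTTP/2 support: enabled on a client with the http2=True parameter
- Type annotations: fully type-hinted, which improves editor and type-checker support
- Timeout handling: fine-grained control over the connect, read, write, and pool timeouts
- Connection pooling: client instances manage a connection pool automatically, which improves performance
- WebSockets: HTTPX has no built-in WebSocket client; WebSocket support comes from third-party packages
- Proxy support: HTTP and HTTPS proxies, plus SOCKS proxies with the optional httpx[socks] extra
- File uploads: multipart file uploads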
Installation
pip install httpx
For HTTP/2 support, install the optional extra:
pip install 'httpx[http2]'
Basic Usage
Synchronous Requests
import httpx

# GET request
response = httpx.get('https://httpbin.org/get')
print(response.status_code)
print(response.json())

# GET with query parameters
params = {'key1': 'value1', 'key2': 'value2'}
response = httpx.get('https://httpbin.org/get', params=params)

# POST request with a JSON body
data = {'key': 'value'}
response = httpx.post('https://httpbin.org/post', json=data)
Asynchronous Requests
import asyncio
import httpx

async def fetch_data():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://httpbin.org/get')
        return response.json()

# Run the async function
result = asyncio.run(fetch_data())
print(result)
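Advanced Features
1. Client Instances
Using Client or AsyncClient reuses connections across requests, which improves performance:
# Synchronous
with httpx.Client() as client:
    response = client.get('https://httpbin.org/get')

# Asynchronous (must run inside a coroutine)
async with httpx.AsyncClient() as client:
    response = await client.get('https://httpbin.org/get')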
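2. Timeouts
# 5 seconds to connect, and 10 seconds for each of the read, write, and pool phases.
# These are per-phase limits, not a total deadline for the whole request.
timeout = httpx.Timeout(10.0, connect=5.0)

# Set per request
response = httpx.get('https://httpbin.org/delay/5', timeout=timeout)

# Or pass a number of seconds, applied to every phase
response = httpx.get('https://httpbin.org/delay/5', timeout=10.0)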
3. Custom Headers
headers = {
    'User-Agent': 'MyApp/1.0',
    'Authorization': 'Bearer token123',
}
response = httpx.get('https://httpbin.org/headers', headers=headers)
4. Handling Responses
response = httpx.get('https://httpbin.org/get')

# Status code
print(response.status_code)
# Response headers
print(response.headers)
# Body as text
print(response.text)
# Body parsed as JSON
print(response.json())
# Body as raw bytes
print(response.content)
# Streaming response
with httpx.stream("GET", "https://www.example.com") as response:
    for chunk in response.iter_bytes():
        print(chunk)
5. File Uploads
# Single file (the with block closes the file once the upload finishes)
with open('report.xls', 'rb') as f:
    files = {'file': f}
    response = httpx.post('https://httpbin.org/post', files=files)

# Multiple files under the same field name
with open('image1.jpg', 'rb') as img1, open('image2.jpg', 'rb') as img2:
    files = [
        ('images', ('image1.jpg', img1, 'image/jpeg')),
        ('images', ('image2.jpg', img2, 'image/jpeg')),
    ]
    response = httpx.post('https://httpbin.org/post', files=files)
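6. Proxy Support
# Route each scheme through its own proxy. HTTPX 0.28 and later no longer accept
# the older proxies= dictionary argument; mounts= handles per-scheme routing.
mounts = {
    "http://": httpx.HTTPTransport(proxy="http://localhost:8030"),
    "https://": httpx.HTTPTransport(proxy="http://localhost:8031"),
}
with httpx.Client(mounts=mounts) as client:
    response = client.get("http://example.com")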
7. Retries
import httpx

def log_request(request):
    # Request hooks receive only the request object
    print(f"Sending {request.method} {request.url}")

# retries=3 re-attempts failed connection attempts only (for example, a connect
# error or connect timeout). It does not retry read timeouts or error status codes.
client = httpx.Client(
    event_hooks={"request": [log_request]},
    transport=httpx.HTTPTransport(retries=3),
)

try:
    response = client.get("https://example.com")
    response.raise_for_status()
except (httpx.HTTPStatusError, httpx.TimeoutException) as exc:
    print(f"Request failed: {exc}")
finally:
    client.close()
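Key Differences from Requests
- Async support: httpx supports async/await natively
- HTTP/2: httpx supports HTTP/2, while requests does not
- Type hints: httpx is fully type-annotated
- Performance: httpx can be more efficient in some workloads, particularly when issuing many concurrent requests
- API compatibility: the httpx API closely resembles that of requests, but it is not identical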
Best Practices
- Use context managers: this ensures resources are released properly
with httpx.Client() as client:
    response = client.get('https://example.com')
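- Reuse clients: create a single client instance for multiple requests
# Bad: every call sets up a new connection
def get_data():
    return httpx.get('https://api.example.com/data')

# Good: calls share one client and its connection pool
client = httpx.Client()
def get_data():
    return client.get('https://api.example.com/data')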
- Handle exceptions:
try:
    response = httpx.get('https://example.com')
    response.raise_for_status()
except httpx.HTTPStatusError as exc:
    print(f"Error response {exc.response.status_code}")
except httpx.RequestError as exc:
    print(f"Request failed: {exc}")
- Set timeouts: HTTPX applies a 5-second default, so set an explicit value that suits the service being called
response = httpx.get('https://example.com', timeout=10.0)
- Tune the connection pool: pooling is enabled by default, and its size can be adjusted
limits = httpx.Limits(max_keepalive_connections=5, max_connections=10)
client = httpx.Client(limits=limits)
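Performance Considerations
- Connection pooling: httpx clients pool connections by default, which avoids repeated connection setup
- HTTP/2: when many requests go to the same host, HTTP/2 can improve performance by multiplexing them over a single connection
- Streaming responses: for large files, stream the response to reduce memory use
with httpx.stream("GET", "https://example.com/large_file") as response:
    with open("large_file", "wb") as f:
        for chunk in response.iter_bytes():
            f.write(chunk)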
Frequently Asked Questions
1. How do I disable SSL verification?
client = httpx.Client(verify=False)  # Not recommended in production
2. How do I use basic authentication?
auth = ('username', 'password')
response = httpx.get('https://example.com', auth=auth)
3. How do I send form data?
data = {'key1': 'value1', 'key2': 'value2'}
response = httpx.post('https://httpbin.org/post', data=data)
4. How do I work with cookies?
# Send cookies
cookies = {"session_id": "abc123"}
response = httpx.get('https://example.com', cookies=cookies)

# Read cookies set by the server (.get returns None if the cookie is absent)
print(response.cookies.get('session_id'))
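Summary
HTTPX is a capable, modern HTTP client that suits everything from simple one-off requests to complex workloads. It combines the ease of use of requests with modern Python features such as async/await and type hints.
For new projects, especially those that need async support or HTTP/2, HTTPX is well worth considering over the older requests library.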