
The requests Module for Network Requests (Web Scraping) - 15

news 2025/7/13 15:44:40

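Contents

1.requests
2.Basic GET requests (headers and params parameters)
- 2.1 The most basic GET request can use the get method directly
- 2.2 Adding headers to a GET request
- 2.3 Adding params to a GET request
3.Basic POST requests (data parameter)
4.Proxies (proxies parameter)
5.Private proxy authentication (specific format) and web client authentication (auth parameter)
6.Cookies and Session
- 6.1 Cookies
- 6.2 Session
7.Handling HTTPS requests (SSL certificate verification)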

1.requests

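Requests covers the functionality that urllib2 offered, behind a simpler API. It supports HTTP keep-alive and connection pooling, cookie-based sessions, file uploads, automatic detection of the response content encoding, and automatic encoding of internationalized URLs and POST data.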

Under the hood, Requests is built on urllib3.
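
One way to see this relationship, as a minimal sketch: the transport adapter that a default Session mounts for HTTPS URLs holds a urllib3 PoolManager. The URL is a placeholder and no request is sent. The poolmanager attribute belongs to the current HTTPAdapter implementation and is inspected here only for illustration.

```python
import requests
import urllib3

# A Session mounts an HTTPAdapter for "http://" and "https://" by default.
session = requests.Session()

# Look up the adapter that would handle this placeholder URL; nothing is sent.
adapter = session.get_adapter("https://example.com/")

# The adapter delegates connection pooling to a urllib3 PoolManager.
uses_urllib3 = isinstance(adapter.poolmanager, urllib3.PoolManager)
print(type(adapter).__name__, uses_urllib3)

session.close()
```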

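Requests is thoroughly documented and covers most everyday HTTP needs. It historically supported both Python 2 and Python 3 (recent releases are Python 3 only), and it also runs on PyPy.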

https://requests.readthedocs.io/en/latest/index.html

Installing Requests

pip install requests

2.Basic GET requests (headers and params parameters)

2.1 The most basic GET request can use the get method directly

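import requests

# The most basic GET request
url = 'http://www.baidu.com'

# A timeout keeps the call from hanging indefinitely if the server does not respond
response = requests.get(url, timeout=10)

# Equivalent form: pass the HTTP method name to requests.request
response = requests.request("GET", url, timeout=10)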
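
To see what the two calls above have in common without touching the network, the request can be built and prepared but not sent. This is only a sketch: Request(...).prepare() is the public Requests API for constructing a PreparedRequest, and the variable name prepared is illustrative.

```python
import requests

url = 'http://www.baidu.com'

# Build the request object and prepare it; no connection is opened.
prepared = requests.Request("GET", url).prepare()

print(prepared.method)  # the HTTP verb that would be sent
print(prepared.url)     # the URL after normalization by Requests
```

When the request is actually sent, it is usually worth checking response.status_code, or calling response.raise_for_status() so that 4xx and 5xx responses raise an exception.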

2.2 Adding headers to a GET request

To send custom request headers, pass a dict as the headers parameter; its entries are added to the headers of the outgoing request.
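
Before the full example, a small offline sketch of how a custom header combines with the defaults Requests sends. It assumes a Session is used so that prepare_request can merge the session's default headers with ours. The User-Agent value my-crawler/0.1 is hypothetical, and nothing is sent over the network.

```python
import requests

url = 'http://www.douban.com'

# Hypothetical User-Agent value, used only for illustration
headers = {'User-Agent': 'my-crawler/0.1'}

session = requests.Session()

# prepare_request merges the session's default headers with ours; nothing is sent.
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))

print(prepared.headers['User-Agent'])         # our value replaces the default one
print('Accept-Encoding' in prepared.headers)  # defaults we did not override remain

session.close()
```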

import requests

url = 'http://www.douban.com'

# Present a browser User-Agent instead of the default python-requests one
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)

# View the response body; response.text is a str decoded using response.encoding
print(response.text)

# View the response body; response.content is the raw bytes
print(response.content)

# View the full URL
print