
Understanding *args and **kwargs in Python

news 2025/10/7 5:18:31

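In Python, *args and **kwargs are special syntax in a function's parameter list for accepting a variable number of arguments. They solve the problem of not knowing in advance how many arguments a function will receive, which makes functions more flexible. The core difference: *args handles positional arguments, while **kwargs handles keyword arguments. Both appear frequently in web scrapers and framework code.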

I. *args: collecting positional arguments into a tuple

1. Core definition and purpose

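Syntax: write *args in the function's parameter list (args is only the conventional name and can be replaced with *params, *urls, and so on; what matters is the leading *).
Purpose: packs every positional argument in the call that is not bound to a named parameter into a tuple, which the function body can iterate over or otherwise process.
In essence: it covers the case where a function must accept any number of positional arguments (for example, several URLs or several numbers).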
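The three points above can be checked without any network access. The following is a minimal sketch; the function names describe and total are illustrative and do not come from the article.

```python
def describe(*args):
    """Return the collected positional arguments and their count."""
    # args is always a tuple, even when no arguments are passed
    return args, len(args)


def total(first, *rest):
    """Add one required number and any number of extra ones."""
    # `first` binds the first positional argument; `rest` collects the others
    return first + sum(rest)


print(describe())            # ((), 0)
print(describe(1, "a", 3))   # ((1, 'a', 3), 3)
print(total(1, 2, 3))        # 6
```

Note that only the arguments left over after the named parameters are filled end up in the tuple, which is why total(1, 2, 3) sees rest == (2, 3).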

2. Basic usage (in a scraper)

Taking "crawl several URLs in one call" as the example, *args receives any number of URL strings:

import requests


def batch_crawl(*args):
    """Crawl several URLs; *args collects every positional argument (the URLs)."""
    print(f"Total URLs to crawl: {len(args)}")  # args is a tuple, so len() works
    for url in args:  # iterate over the tuple and fetch each URL in turn
        try:
            response = requests.get(
                url=url,
                headers={"User-Agent": "Mozilla/5.0..."},
                timeout=5,
            )
            print(f"Fetched {url}, status code: {response.status_code}")
        except Exception as e:
            print(f"Failed to fetch {url}: {e}")


# Call with three URLs (positional arguments; any number is allowed)
batch_crawl(
    "https://httpbin.org/get",
    "https://httpbin.org/ip",
    "https://httpbin.org/user-agent",
)

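Inside the function, args has the value ("https://httpbin.org/get", "https://httpbin.org/ip", "https://httpbin.org/user-agent"), which is a tuple.
If the call passes no positional arguments, as in batch_crawl(), then args is the empty tuple () and no error is raised.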

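3. Going further: unpacking at the call site

The * operator does more than collect arguments in a function definition. At the call site it unpacks arguments: it splits an iterable such as a list or tuple into individual positional arguments and passes them to the function.

Scraper example (passing a list of URLs in one go):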

# Prepare a list of URLs (the usual "crawl queue" in a scraper)
url_list = [
    "https://httpbin.org/get?page=1",
    "https://httpbin.org/get?page=2",
    "https://httpbin.org/get?page=3",
]

# Unpack the list with *: it becomes three positional arguments for batch_crawl
batch_crawl(*url_list)  # same as batch_crawl(url_list[0], url_list[1], url_list[2])
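To make the effect of the star visible without sending any requests, here is a small sketch. The helper show is hypothetical and simply returns whatever it collected, so the difference between passing a list and unpacking it can be compared directly.

```python
def show(*args):
    """Return the positional arguments exactly as collected."""
    return args


pages = [1, 2, 3]

print(show(*pages))                    # (1, 2, 3)
print(show(pages))                     # ([1, 2, 3],)  one argument: the list itself
print(show(0, *pages, 4))              # (0, 1, 2, 3, 4)
print(show(*(n * 2 for n in pages)))   # (2, 4, 6)  generators work too
print(show(*"ab"))                     # ('a', 'b')  a string unpacks per character
```

The second call is the usual mistake: without the star, a function such as batch_crawl would receive one argument (the whole list) and try to treat it as a single URL.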

II. **kwargs: collecting keyword arguments into a dict

1. Core definition and purpose

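Syntax: write **kwargs in the function's parameter list (kwargs is only the conventional name and can be replaced with **config, **options, and so on; what matters is the leading **).
Purpose: packs every keyword argument in the call that does not match a named parameter into a dict, so the function body can look parameters up by key.
In essence: it covers the case where a function must accept any number of keyword arguments (for example, a scraper's request headers, timeout, and proxy settings).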
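As a quick illustration of the points above, the sketch below uses a hypothetical function collect that returns its named parameter together with the leftover keyword arguments. Nothing is sent over the network.

```python
def collect(url, **kwargs):
    """Return the named parameter and the leftover keyword arguments."""
    # `url` is bound by name; every other keyword ends up in the kwargs dict
    return url, kwargs


url, options = collect("https://httpbin.org/get", timeout=10, verify=False)

print(url)                                 # https://httpbin.org/get
print(options)                             # {'timeout': 10, 'verify': False}
print(collect("https://httpbin.org/get")[1])   # {}
print(options.get("proxies", "not set"))   # not set
```

Because kwargs is an ordinary dict, dict.get with a default is a convenient way to read an option that the caller may not have supplied.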

2. Basic usage (in a scraper)

Taking "wrap the scraper's request call" as the example, **kwargs receives arbitrary request options (such as headers, proxies, timeout):

import requests


def spider_request(url, method="GET", **kwargs):
    """Wrap a request; **kwargs collects arbitrary keyword arguments (request options)."""
    try:
        # Print the received options (kwargs is a dict: parameter name -> value)
        print(f"Request options: {kwargs}")
        response = requests.request(
            method=method.upper(),
            url=url,
            **kwargs,  # unpack kwargs and forward it to requests.request (the key step)
        )
        response.raise_for_status()  # raise HTTPError for 4xx/5xx responses
        return response
    except Exception as e:
        print(f"Request failed: {e}")
        return None


# Call with a URL plus keyword arguments (headers, timeout, proxies)
response = spider_request(
    url="https://httpbin.org/get",
    method="GET",
    headers={"User-Agent": "Mozilla/5.0..."},  # keyword argument 1
    timeout=10,  # keyword argument 2
    proxies={"http": "http://127.0.0.1:8888"},  # keyword argument 3
)
if response is not None:
    print("Response body:", response.json())

Inside the function, kwargs has the value:

{"headers": {"User-Agent": "Mozilla/5.0..."}, "timeout": 10, "proxies": {"http": "http://127.0.0.1:8888"}}

(a dict).

requests.request(**kwargs): here ** unpacks the dict. The key-value pairs in kwargs become keyword arguments such as headers=xxx and timeout=xxx, which are passed on to the underlying request function.
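The collect-then-forward pattern can be tried offline. The sketch below is only an illustration: fake_request is a hypothetical stand-in that echoes its inputs instead of making an HTTP call, and request_with_defaults is an assumed wrapper name. It also shows a common refinement, filling in default options before forwarding.

```python
def fake_request(method, url, headers=None, timeout=None):
    """Hypothetical stand-in for an HTTP call; it only echoes its inputs."""
    return {"method": method, "url": url, "headers": headers, "timeout": timeout}


def request_with_defaults(url, method="GET", **kwargs):
    """Fill in default options, then forward everything to fake_request."""
    # setdefault only writes the key when the caller did not supply it
    kwargs.setdefault("timeout", 5)
    kwargs.setdefault("headers", {"User-Agent": "demo-client"})
    return fake_request(method.upper(), url, **kwargs)


print(request_with_defaults("https://httpbin.org/get")["timeout"])              # 5
print(request_with_defaults("https://httpbin.org/get", timeout=10)["timeout"])  # 10

# A keyword the target function does not accept fails at the forwarding call
try:
    request_with_defaults("https://httpbin.org/get", retries=3)
except TypeError as exc:
    print("rejected:", exc)
```

The last call is worth noting: **kwargs accepts anything, so a misspelled or unsupported option is only detected when the inner function rejects it.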

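3. Going further: unpacking at the call site

The ** operator can also unpack a dict at the call site: the dict's key-value pairs become keyword arguments for the function. Scrapers often use this to reuse configuration.

Scraper example (reusing request configuration):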

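# Define common request headers (configuration that a scraper can reuse)
common_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/129.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,*/*",
    "Referer": "https://example.com/",
}

# Group the reusable options in a dict whose keys are parameter names
common_config = {"headers": common_headers, "timeout": 8}

# Unpack common_config with **: same as headers=common_headers, timeout=8
response = spider_request(url="https://httpbin.org/get", **common_config)

# Note: spider_request(url=..., **common_headers) would not work. It would pass
# each header name (Accept, Referer, ...) as a keyword argument, and
# requests.request rejects keywords it does not define.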
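Two details of dict unpacking are easy to trip over when reusing configuration: how overlapping keys are resolved, and what happens when the same keyword arrives twice. The sketch below assumes a hypothetical helper show and two illustrative dicts, defaults and overrides.

```python
def show(**kwargs):
    """Return the keyword arguments exactly as collected."""
    return kwargs


defaults = {"timeout": 5, "verify": True}
overrides = {"timeout": 8}

# Inside a dict display, later keys win, so overrides replaces the default timeout
merged = {**defaults, **overrides}
print(merged)            # {'timeout': 8, 'verify': True}
print(show(**merged))    # {'timeout': 8, 'verify': True}

# In a call, the same keyword arriving twice is an error rather than an override
try:
    show(**defaults, timeout=8)
except TypeError as exc:
    print("duplicate keyword:", exc)
```

Merging the dicts first and unpacking the result once is therefore the safer way to layer per-call options over shared defaults.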

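III. Combining *args and **kwargs

In real projects (frameworks and utility functions in particular), *args and **kwargs are often used together so that a function accepts both arbitrary positional arguments and arbitrary keyword arguments. Mind the parameter order: it must be "regular positional parameters → *args → keyword-only parameters → **kwargs"; placing *args after **kwargs is a syntax error.

Scraper example (a general batch request function)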
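The ordering rule can be seen in a small signature that uses all four kinds of parameter. This is a sketch with made-up names (order_demo, retries); the compile call is only there to show that the reversed order is rejected before any code runs.

```python
def order_demo(method, *args, retries=3, **kwargs):
    """Show which part of the signature each argument lands in."""
    return method, args, retries, kwargs


print(order_demo("GET", "u1", "u2", timeout=5))
# ('GET', ('u1', 'u2'), 3, {'timeout': 5})

print(order_demo("GET", "u1", retries=1, timeout=5))
# ('GET', ('u1',), 1, {'timeout': 5})

# Putting *args after **kwargs is rejected when the source is compiled
try:
    compile("def bad(**kwargs, *args): pass", "<demo>", "exec")
except SyntaxError as exc:
    print("invalid signature:", exc.msg)
```

A parameter declared after *args, such as retries here, can only be passed by keyword; a fourth positional value would simply be collected into args.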

import requests


def batch_request(method, *args, **kwargs):
    """General batch request function.

    - method: regular positional parameter (request method, e.g. GET/POST)
    - *args: arbitrary positional arguments (the URLs to crawl)
    - **kwargs: arbitrary keyword arguments (request options, e.g. headers, timeout)
    """
    results = []
    for url in args:
        try:
            response = requests.request(method=method.upper(), url=url, **kwargs)
            results.append({"url": url, "status": response.status_code})
        except Exception as e:
            results.append({"url": url, "status": "failed", "error": str(e)})
    return results


# Call: method (regular parameter) + 3 URLs (*args) + 2 options (**kwargs)
# common_headers is the dict defined in the earlier unpacking example
result = batch_request(
    "GET",  # method (regular positional parameter)
    "https://httpbin.org/get?page=1",  # args[0]
    "https://httpbin.org/get?page=2",  # args[1]
    "https://httpbin.org/get?page=3",  # args[2]
    headers=common_headers,  # kwargs key 1
    timeout=5,  # kwargs key 2
)
print("Batch request results:", result)
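Framework code most often combines the two in wrappers that must forward any call unchanged. The sketch below is one possible illustration and not part of the article's scraper: log_calls and build_url are hypothetical names, and the decorator records each call's arguments before passing them through.

```python
import functools


def log_calls(func):
    """Wrap func so each call records its arguments before running."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Collect whatever the caller passed, then forward it unchanged
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []
    return wrapper


@log_calls
def build_url(base, page=1):
    """Build a paginated URL from a base address."""
    return f"{base}?page={page}"


print(build_url("https://httpbin.org/get", page=2))  # https://httpbin.org/get?page=2
print(build_url.calls)  # [(('https://httpbin.org/get',), {'page': 2})]
```

Because the wrapper's signature is (*args, **kwargs), the same decorator works for any function regardless of how many parameters it takes.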

IV. Key differences and typical use cases

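Feature	*args	**kwargs
Argument type handled	Positional arguments (e.g. `a, b, c`)	Keyword arguments (e.g. `key1=val1, key2=val2`)
Collected as	Tuple (tuple)	Dict (dict)
What can be unpacked at the call site	Any iterable: lists, tuples, sets, and so on (a set has no guaranteed order)	A dict or other mapping whose keys are strings
Main uses in scrapers	Passing a batch of URLs, a batch of page IDs, etc.	Passing request options (headers, proxies, timeout), parsing rules, etc.
Typical function example	`batch_crawl(*urls)`	`make_request(url, **config)`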