pandas pivot tables
Pivot tables
-
Pivot tables are created with the function pd.pivot_table().
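# Import pandas
import pandas as pd

# Using pivot_table (arguments listed in the order pandas defines them)
pd.pivot_table(data, values, index, columns, aggfunc, fill_value)

Parameter 1: data = the source DataFrame
Parameter 2: values = the column(s) to aggregate
Parameter 3: index = the column(s) to group by; their values become the rows of the pivot table
Parameter 4: columns = the column(s) whose values become the column labels of the pivot table; the aggregation is computed for each index/columns combination
Parameter 5: aggfunc = the aggregation to apply; the default is 'mean' (average)
Parameter 6: fill_value = the value used to replace empty (NaN) cells in the result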
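The examples in this post never produce an empty cell, so fill_value is not exercised. The following is a minimal sketch of its effect, using a small made-up dataset (not from the examples here) in which 二班 has no 数学 row:

```python
import pandas as pd

# Hypothetical data for illustration: 二班 has no 数学 (math) score,
# so the 二班/数学 cell of the pivot table has nothing to aggregate.
df = pd.DataFrame({
    "班级": ["一班", "一班", "二班"],   # class
    "科目": ["语文", "数学", "语文"],   # subject
    "成绩": [89, 78, 88],               # score
})

# Without fill_value, the missing combination appears as NaN.
with_nan = pd.pivot_table(df, values="成绩", index="班级",
                          columns="科目", aggfunc="mean")

# With fill_value=0, the empty cell is replaced by 0.
filled = pd.pivot_table(df, values="成绩", index="班级",
                        columns="科目", aggfunc="mean", fill_value=0)

print(with_nan)
print(filled)
```

Whether 0 is a sensible replacement depends on the data; for scores, a 0 could be mistaken for a real result, so leaving NaN may be the safer choice.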
-
-
For example:
import pandas as pd

# Column names: 班级 = class, 学生 = student, 科目 = subject, 成绩 = score
data = {
    "班级": ["一班", "一班", "二班", "二班", "三班", "三班"],
    "学生": ["小红", "小红", "小强", "小李", "小王", "小赵"],
    "科目": ["语文", "数学", "语文", "数学", "语文", "数学"],
    "成绩": [89, 78, 88, 92, 93, 85]
}
df = pd.DataFrame(data)

# Average score per class: group by class, then take the mean score of each group
df1 = pd.pivot_table(data=df, values='成绩', index='班级', aggfunc='mean')
print(df1)
Result:
      成绩
班级
一班  83.5
三班  89.0
二班  90.0
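With only index and values set, the pivot table is a group-then-aggregate operation. As a rough cross-check (a sketch, not a statement about how pandas implements pivot_table internally), the same averages can be obtained with groupby:

```python
import pandas as pd

df = pd.DataFrame({
    "班级": ["一班", "一班", "二班", "二班", "三班", "三班"],  # class
    "成绩": [89, 78, 88, 92, 93, 85],                          # score
})

# Pivot table: one row per class, mean of the score column.
by_pivot = pd.pivot_table(df, values="成绩", index="班级", aggfunc="mean")

# The same grouping and aggregation written with groupby.
# The double brackets keep the result a DataFrame rather than a Series.
by_group = df.groupby("班级")[["成绩"]].mean()

print(by_pivot)
print(by_group)
```

Both calls sort the class labels the same way, which is why 三班 appears before 二班 in the output: the labels are ordered by character code, not by class number.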
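Using columns
Example:
import pandas as pd

# Column names: 班级 = class, 学生 = student, 科目 = subject, 成绩 = score
data = {
    "班级": ["一班", "一班", "二班", "二班", "三班", "三班"],
    "学生": ["小红", "小红", "小强", "小李", "小王", "小赵"],
    "科目": ["语文", "数学", "语文", "数学", "语文", "数学"],
    "成绩": [89, 78, 88, 92, 93, 85]
}
df = pd.DataFrame(data)

# Average score per subject for each class (aggfunc is omitted, so the default 'mean' is used)
df2 = pd.pivot_table(data=df, values='成绩', index='班级', columns='科目')
print(df2)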
Result:
科目    数学    语文
班级
一班  78.0  89.0
三班  85.0  93.0
二班  92.0  88.0
-
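Example 2:
import pandas as pd

# Column names: 产品 = product, 区域 = region, 销售额 = sales amount, 数量 = quantity
data = {
    "产品": ["A", "B", "A", "B", "A", "B"],
    "区域": ["东区", "东区", "西区", "西区", "东区", "西区"],
    "销售额": [200, 160, 300, 234, 450, 321],
    "数量": [19, 18, 15, 12, 13, 10]
}
df = pd.DataFrame(data)
print("源数据:\n", df)  # 源数据 = source data

# Build a pivot table grouped by region, summing the sales amount and the quantity for each region
df1 = pd.pivot_table(df, values=['销售额', '数量'], index='区域',
                     aggfunc={'销售额': 'sum', '数量': 'sum'})
print(df1)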
Result:
源数据:
   产品  区域  销售额  数量
0  A  东区  200  19
1  B  东区  160  18
2  A  西区  300  15
3  B  西区  234  12
4  A  东区  450  13
5  B  西区  321  10
    数量  销售额
区域
东区  50  810
西区  37  855
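Passing a dict to aggfunc is most useful when the columns need different aggregations. As an illustrative variation on Example 2 (the choice of 'max' for 数量 is an assumption made for the demo, not part of the original example), the sales amount is summed while the quantity column reports the largest single entry per region:

```python
import pandas as pd

df = pd.DataFrame({
    "产品": ["A", "B", "A", "B", "A", "B"],                    # product
    "区域": ["东区", "东区", "西区", "西区", "东区", "西区"],  # region
    "销售额": [200, 160, 300, 234, 450, 321],                  # sales amount
    "数量": [19, 18, 15, 12, 13, 10],                          # quantity
})

# Each key of the aggfunc dict names a values column;
# each value is the aggregation applied to that column only.
mixed = pd.pivot_table(df, values=["销售额", "数量"], index="区域",
                       aggfunc={"销售额": "sum", "数量": "max"})
print(mixed)
```

When every column uses the same aggregation, as in Example 2, a single string such as aggfunc='sum' gives the same result as the dict form.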