DAY 8: A Brief Introduction to Dictionaries
A dictionary is a collection of key-value pairs.
Label encoding
Implementing the mapping
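As a minimal sketch of what "key-value pairs" means in practice, the snippet below builds a small dictionary and reads from it. The names `home_codes`, `rent_code`, and `missing_code`, and the values themselves, are made up for illustration.

```python
# A dictionary maps keys to values (hypothetical example data).
home_codes = {"Rent": 0, "Own Home": 1}

# Look up a value by its key.
rent_code = home_codes["Rent"]

# Add a new key-value pair.
home_codes["Home Mortgage"] = 3

# .get() returns a default instead of raising KeyError for a missing key.
missing_code = home_codes.get("Unknown", -1)

print(rent_code, missing_code, list(home_codes.keys()))
```

Dictionaries preserve insertion order, so the keys print in the order they were added.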
import pandas as pd
data = pd.read_csv('data.csv')
data
data["Home Ownership"].value_counts()
# Define the mapping dictionary
mapping = {"Own Home": 1, "Rent": 0, "Have Mortgage": 2, "Home Mortgage": 3}
data["Home Ownership"] = data["Home Ownership"].map(mapping)
data["Home Ownership"].head()
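One thing worth checking after a `map` call: any value that is not a key in the dictionary becomes `NaN`. The sketch below shows this on a made-up Series (`ownership` and the "Unknown" entry are assumptions for illustration, not values from data.csv).

```python
import pandas as pd

# Hypothetical sample column; "Unknown" is deliberately absent from the mapping.
ownership = pd.Series(["Rent", "Own Home", "Unknown"])
mapping = {"Own Home": 1, "Rent": 0, "Have Mortgage": 2, "Home Mortgage": 3}

# Series.map looks each value up in the dictionary.
encoded = ownership.map(mapping)

# Count values that the mapping did not cover (they are now NaN).
n_unmapped = int(encoded.isna().sum())
print(encoded.tolist(), n_unmapped)
```

A nonzero `n_unmapped` usually points to a typo or stray whitespace in one of the dictionary keys.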
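Both mappings can also be handled with a single function, using one nested dictionary.
import pandas as pd
# Re-read the data
data = pd.read_csv("data/data.csv")
# Nested mapping dictionary: the outer key is the column name, the inner dictionary is that column's mapping
mapping = {
    "Term": {"Short Term": 1, "Long Term": 0},
    "Home Ownership": {"Rent": 0, "Own Home": 1, "Have Mortgage": 2, "Home Mortgage": 3},
}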
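The notes define the nested dictionary but do not show it being applied. One possible way, sketched here on a small made-up DataFrame (`df` is a stand-in for data.csv, not the real data), is to loop over the outer keys and call `map` on each column.

```python
import pandas as pd

# Hypothetical sample rows standing in for data.csv.
df = pd.DataFrame({
    "Term": ["Short Term", "Long Term", "Short Term"],
    "Home Ownership": ["Rent", "Own Home", "Home Mortgage"],
})

mapping = {
    "Term": {"Short Term": 1, "Long Term": 0},
    "Home Ownership": {"Rent": 0, "Own Home": 1, "Have Mortgage": 2, "Home Mortgage": 3},
}

# Apply each inner dictionary to the column named by its outer key.
for column, column_mapping in mapping.items():
    df[column] = df[column].map(column_mapping)

print(df)
```

`DataFrame.replace` also accepts a nested dictionary of this shape. Unlike `map`, it leaves values that have no match unchanged instead of turning them into `NaN`.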
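Handling continuous variables
Normalization and standardization: use the ready-made normalization and standardization classes in sklearn directly.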
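To make the two operations concrete, here is a hand-written sketch of the formulas using NumPy: min-max normalization computes (x - min) / (max - min), and standardization computes (x - mean) / std. The `income` values are invented for illustration.

```python
import numpy as np

# Hypothetical income values, used only for illustration.
income = np.array([20000.0, 50000.0, 80000.0])

# Min-max normalization: rescale to the [0, 1] range.
normalized = (income - income.min()) / (income.max() - income.min())

# Standardization (z-score): zero mean, unit standard deviation.
# np.std defaults to the population standard deviation (ddof=0).
standardized = (income - income.mean()) / income.std()

print(normalized)
print(standardized)
```

sklearn's `StandardScaler` also uses the population standard deviation, so this sketch should agree with it on the same input.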
# Normalize with the help of sklearn
from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Re-read the data
data = pd.read_csv("data/data.csv")
# Normalization
# Instantiate MinMaxScaler. As noted in an earlier lesson, a name imported this way needs no library prefix.
min_max_scaler = MinMaxScaler()
data['Annual Income'] = min_max_scaler.fit_transform(data[['Annual Income']])
data['Annual Income'].head()
# Standardization
# Re-read the data
data = pd.read_csv("data/data.csv")
# Instantiate StandardScaler
scaler = StandardScaler()
data['Annual Income'] = scaler.fit_transform(data[['Annual Income']])
data['Annual Income'].head()
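A quick sanity check on scaler output can catch mistakes early. The sketch below runs both scalers on a made-up one-column DataFrame (`df` and its values are assumptions, not taken from data.csv) and prints the statistics each one is supposed to fix.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical sample data standing in for data.csv.
df = pd.DataFrame({"Annual Income": [20000.0, 50000.0, 80000.0]})

# Double brackets select a one-column DataFrame (2-D), which the scalers expect.
# ravel() flattens the (n, 1) result back to a 1-D array.
normalized = MinMaxScaler().fit_transform(df[["Annual Income"]]).ravel()
standardized = StandardScaler().fit_transform(df[["Annual Income"]]).ravel()

# Normalized values should span 0 to 1; standardized values should have mean ~0 and std ~1.
print(normalized.min(), normalized.max())
print(standardized.mean(), standardized.std())
```

Passing `df["Annual Income"]` with single brackets would hand the scaler a 1-D Series, which it rejects, so the double brackets matter.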
@浙大疏锦行