
AI Trainer Exam Prep: Solution to Question 2.1.5

news 2025/11/15 18:24:09

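As the saying goes, read a book a hundred times and its meaning reveals itself. Exam prep works the same way: run through the question types a few times, write the answers out a few times, and you will be most of the way there. Memorize the common commands on top of that and there should be no major problems. Good luck, fellow test-takers.

Coding questions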

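1. Load the dataset

Question: data = __________

Loading data means calling pd.read_csv (the most basic question type, so make sure you get these points).

Then find the required file name in the question: 健康咨询客户数据集 (the health consultation customer dataset).

So the final answer is: data = pd.read_csv('健康咨询客户数据集.csv')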
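To try pd.read_csv without the exam file at hand, here is a minimal sketch that reads a small in-memory CSV instead. The two rows and the column selection are made up for illustration; only the column names 'Your age' and 'How often do you exercise?' appear in the exam question.

```python
import io

import pandas as pd

# Hypothetical stand-in for 健康咨询客户数据集.csv; the rows are invented.
csv_text = "Your age,How often do you exercise?\n25,Daily\n31,Never\n"

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)
print(list(data.columns))
```

In the exam itself you pass the file name as a string, exactly as in the answer above.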

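2. View basic information about the table structure

Question: print(__________)

For basic table-structure information, think of data.info(), which reports the column names, non-null counts, and dtypes.

The DataFrame loaded above is already named data, so the variable name does not need to change. Note that data.info() prints its summary itself and returns None, so wrapping it in print() also prints a trailing None; that is harmless and matches the blank in the question.

Final answer: print(data.info())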
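A small sketch of what data.info() does, using an invented three-row DataFrame. It shows that the summary is printed by info() itself and that the return value is None, which is why print(data.info()) ends with an extra None line.

```python
import pandas as pd

# Invented sample data; only the column name comes from the exam question.
data = pd.DataFrame({"Your age": [25, 31, None]})

# info() writes the summary to standard output and returns None.
result = data.info()

print(result)
```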

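3. Show the number of missing values in each column

Question: print(__________)

This question has two parts: first find the missing values, then count them.

isnull() finds the missing values. For counting there are two candidates: .sum() and .value_counts().

The difference is that sum() adds values up, and because isnull() returns True/False (treated as 1/0), the sum is the number of missing values in each column. value_counts() instead counts how many times each distinct value occurs.

This is essentially a fixed idiom, so write data.isnull().sum() directly.

Final answer: print(data.isnull().sum())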
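The difference between sum() and value_counts() is easier to see on a tiny example. This is a sketch on invented data; the 'Gender' column is hypothetical and not taken from the exam.

```python
import numpy as np
import pandas as pd

# Invented sample data with a few missing entries.
data = pd.DataFrame({
    "Your age": [25, np.nan, 40],
    "Gender": ["F", None, None],  # hypothetical column
})

# isnull() gives a True/False table; sum() adds the True values per column.
missing_per_column = data.isnull().sum()
print(missing_per_column)

# value_counts() on one column tallies how often True and False occur.
gender_missing_counts = data["Gender"].isnull().value_counts()
print(gender_missing_counts)
```

For the exam blank you want one number per column, which is what isnull().sum() returns.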

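4. Delete rows that contain missing values

Question: data_cleaned = __________

Deletion suggests either drop or dropna. drop removes specific rows or columns by label, for example drop(columns='column name') to remove a column. dropna removes rows that contain missing values.

So dropna() is the one to use here, and it must be called on the DataFrame loaded earlier (data).

Final answer: data_cleaned = data.dropna()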
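A quick sketch contrasting dropna with drop on invented data (the 'Gender' column is hypothetical). Both are DataFrame methods, so they have to be called on a DataFrame, and both return a new object and leave the original untouched.

```python
import numpy as np
import pandas as pd

# Invented sample data; rows 1 and 2 each have one missing value.
data = pd.DataFrame({
    "Your age": [25, np.nan, 40],
    "Gender": ["F", "M", None],  # hypothetical column
})

# dropna() removes every row that has at least one missing value.
data_cleaned = data.dropna()

# drop(columns=...) removes a column by name, regardless of missing values.
without_gender = data.drop(columns="Gender")

print(len(data), len(data_cleaned), list(without_gender.columns))
```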

5. Convert the 'Your age' column to an integer type and handle abnormal values

Question: data_cleaned.loc[:, 'Your age'] = __________(__________, errors='coerce')
data_cleaned = data_cleaned.dropna(subset=['Your age'])
data_cleaned = data_cleaned[data_cleaned['Your age'] >= 0]
data_cleaned.loc[:, 'Your age'] = data_cleaned['Your age'].__________

There are two operations here: 1. convert 'Your age' to a numeric type; 2. convert that numeric type to an integer type.

Converting to a numeric type uses pd.to_numeric(), and the data to convert is data_cleaned['Your age']. With errors='coerce', values that cannot be parsed become NaN, which the following dropna line then removes.
Converting to an integer type uses the astype method.

So the final answer is: data_cleaned.loc[:, 'Your age'] = pd.to_numeric(data_cleaned['Your age'], errors='coerce')
data_cleaned = data_cleaned.dropna(subset=['Your age'])
data_cleaned = data_cleaned[data_cleaned['Your age'] >= 0]
data_cleaned.loc[:, 'Your age'] = data_cleaned['Your age'].astype(int)
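The four steps can be traced on a few invented values. This sketch is not the exam template: it assigns the column with plain df['col'] = ... instead of .loc[:, 'col'], because, as far as I know, recent pandas versions try to write .loc[:, col] assignments into the existing column in place, so the stored dtype may not actually change. Treat that as a caveat to check against your own pandas version, and still write the exam answer in the form the question expects.

```python
import pandas as pd

# Invented raw values: a valid age, unparseable text, a negative age, a valid age.
raw = pd.DataFrame({"Your age": ["25", "abc", "-3", "40"]})

# Step 1: convert to numbers; errors='coerce' turns "abc" into NaN.
ages = pd.to_numeric(raw["Your age"], errors="coerce")
cleaned = raw.assign(**{"Your age": ages})

# Step 2: drop the rows where the conversion produced NaN.
cleaned = cleaned.dropna(subset=["Your age"])

# Step 3: keep only non-negative ages (copy to get an independent frame).
cleaned = cleaned[cleaned["Your age"] >= 0].copy()

# Step 4: with NaN gone, the float column can be cast to integers.
cleaned["Your age"] = cleaned["Your age"].astype(int)

print(cleaned["Your age"].tolist())
```

The order matters: astype(int) raises an error if NaN is still present, which is why dropna comes before it.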

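6. Check for and remove duplicate rows

Question: duplicates_removed = data_cleaned.duplicated().sum()
data_cleaned = __________

The method for removing duplicates is drop_duplicates().

So fill in the DataFrame followed by .drop_duplicates().

From the line above, the DataFrame in question is data_cleaned.

So the final answer is: data_cleaned = data_cleaned.drop_duplicates()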
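A short sketch of the check-then-remove pair on invented data (the 'Gender' column is hypothetical). duplicated() marks every repeat of an earlier row as True, so its sum is the number of rows that drop_duplicates() will remove.

```python
import pandas as pd

# Invented sample data; the second row repeats the first.
data_cleaned = pd.DataFrame({
    "Your age": [25, 25, 40],
    "Gender": ["F", "F", "M"],  # hypothetical column
})

# Count the rows that are exact repeats of an earlier row.
duplicates_removed = data_cleaned.duplicated().sum()

# Keep the first occurrence of each row and drop the repeats.
data_cleaned = data_cleaned.drop_duplicates()

print(duplicates_removed, len(data_cleaned))
```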

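7. Normalize the 'How do you describe your current level of fitness ?' column

Question: label_encoder = LabelEncoder()
data_cleaned[__________] = __________

When the exam says "normalize" next to a LabelEncoder, think fit_transform straight away. (Strictly speaking, LabelEncoder performs label encoding: it maps each category to an integer code rather than rescaling numbers.)
From the question, the column to encode is 'How do you describe your current level of fitness ?'.
The object that performs the encoding is label_encoder, created on the line above.

So fill in: data_cleaned['How do you describe your current level of fitness ?'] = label_encoder.fit_transform(data_cleaned['How do you describe your current level of fitness ?'])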
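To see what fit_transform produces, here is a minimal sketch on invented category values (the exam file's actual answer options are not shown in the question, so 'Good', 'Average' and 'Unfit' are assumptions). It requires scikit-learn.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

column = "How do you describe your current level of fitness ?"

# Invented category values for illustration only.
data_cleaned = pd.DataFrame({column: ["Good", "Average", "Good", "Unfit"]})

label_encoder = LabelEncoder()

# fit_transform learns the distinct labels, then replaces each with its integer code.
data_cleaned[column] = label_encoder.fit_transform(data_cleaned[column])

print(list(label_encoder.classes_))
print(data_cleaned[column].tolist())
```

The codes follow the sorted order of the labels, so they carry no meaning about which fitness level is higher.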

8. Draw a pie chart

Question: plt.figure(figsize=(10, 6))
__________(autopct='%1.1f%%', startangle=90, colors=plt.cm.Paired.colors)
plt.title('Distribution of Exercise Frequency')
plt.ylabel('')
plt.show()

The pie chart method is plot.pie; just memorize it.
The data comes from an earlier line in the question, exercise_frequency_counts = data_cleaned['How often do you exercise?'].value_counts(), so the object to plot is exercise_frequency_counts.

So fill in: exercise_frequency_counts.plot.pie(autopct='%1.1f%%', startangle=90, colors=plt.cm.Paired.colors)
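The whole plotting snippet can be run end to end on invented data. In this sketch the answer values ('Daily', 'Never', 'Weekly') are assumptions, the non-interactive Agg backend is selected so it runs without a display, and plt.show() is left out for the same reason; in the exam environment you keep plt.show() as given.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: no display available, so render off-screen

import matplotlib.pyplot as plt
import pandas as pd

# Invented answers for illustration only.
data_cleaned = pd.DataFrame(
    {"How often do you exercise?": ["Daily", "Never", "Daily", "Weekly"]}
)

# value_counts() gives one count per distinct answer: the pie's slice sizes.
exercise_frequency_counts = data_cleaned["How often do you exercise?"].value_counts()

plt.figure(figsize=(10, 6))
ax = exercise_frequency_counts.plot.pie(
    autopct="%1.1f%%",             # label each slice with its percentage
    startangle=90,                 # start the first slice at the top
    colors=plt.cm.Paired.colors,
)
plt.title("Distribution of Exercise Frequency")
plt.ylabel("")                     # hide the default y-axis label
```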