Machine Learning HW1 Report (Hongyi Lee)
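On Kaggle, a lower score is better for this task, because the score measures the gap between the predictions and the true values. Optimization ideas: select better features, change the neural network architecture, and apply L2 regularization.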
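Requirements
Attempts
1. Select features based on common sense
The indices in range(1, 38) represent the states. In addition, select the infection rates of the previous four days and the mask-wearing data.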
feat_idx = list(range(1,38))+[42,53,58,69,74,85,90,101,106]
Result: best test loss, Train loss: 1.2407, Valid loss: 0.8372, score 1.17068
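The extra indices in the list above are evenly spaced: 42, 58, 74, 90, 106 and 53, 69, 85, 101 both step by 16. A minimal sketch, assuming this spacing means that each day contributes a block of 16 columns starting at column 38, rebuilds the same list programmatically. The constant names and the two offsets are hypothetical; they are inferred from the index list and the description above, not taken from the dataset documentation.

```python
# Assumed layout, inferred from the spacing of the indices (not confirmed).
STATE_COLS = list(range(1, 38))  # state columns, as in the report
FIRST_DAY_COL = 38               # assumed first column of the day-1 block
COLS_PER_DAY = 16                # assumed number of columns per day
MASK_OFFSET = 4                  # assumed position of the mask column in a block
POSITIVE_OFFSET = 15             # assumed position of the infection-rate column
NUM_DAYS = 5


def build_feat_idx():
    """Return state columns plus mask and infection-rate columns per day."""
    idx = list(STATE_COLS)
    for day in range(NUM_DAYS):
        start = FIRST_DAY_COL + day * COLS_PER_DAY
        idx.append(start + MASK_OFFSET)
        # The last day's infection rate is the prediction target, so skip it.
        if day < NUM_DAYS - 1:
            idx.append(start + POSITIVE_OFFSET)
    return idx


feat_idx = build_feat_idx()
print(feat_idx[len(STATE_COLS):])
```

Generating the list this way makes it harder to mistype a single index when more columns are added later.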
2. Select features using correlation coefficients computed with pandas
Feature Selection
import pandas as pd
df = pd.read_csv("/content/covid.test.csv")
print(df.head())  # show the first five rows to check that the file was read correctly
features = df.drop(columns=['tested_positive'])  # all features except the target 'tested_positive'
# Spearman correlation of each feature with the target, sorted in descending order
corr_with_target = features.corrwith(df['tested_positive'], method='spearman').sort_values(ascending=False)
strong_corr = corr_with_target[abs(corr_with_target) > 0.8]  # features strongly correlated with the target
print(strong_corr)
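For reference, method='spearman' ranks each column and then takes the Pearson correlation of the ranks, so it measures monotonic rather than strictly linear association. The sketch below reproduces that idea in plain Python on made-up numbers. The helper names (average_ranks, pearson, spearman) and the toy columns are hypothetical and are not part of the assignment data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (made up): one target and three candidate feature columns.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
columns = {
    "monotonic": [1.0, 4.0, 9.0, 16.0, 25.0],  # nonlinear but monotonic
    "reversed": [5.0, 4.0, 3.0, 2.0, 1.0],     # perfectly decreasing
    "noisy": [3.0, 1.0, 4.0, 1.5, 2.0],        # no clear trend
}
scores = {name: spearman(col, target) for name, col in columns.items()}
# Same filter as in the report: keep features with |correlation| > 0.8.
strong = {name: s for name, s in scores.items() if abs(s) > 0.8}
print(sorted(strong))
```

Because the filter uses the absolute value, strongly negatively correlated columns are kept as well.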
Result
Adjust parameters
feat_idx = list(range(1,38))+[38,39,40,41,53,54,55,56,57, 69,70,71,72,73, 85,86,87,88,89, 101,102,103,104,105]
Result: Epoch [1485/3000]: Train loss: 1.1094, Valid loss: 1.0514, score 0.93286
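Copying column positions by hand into feat_idx is easy to get wrong, so a name-to-position lookup can help. The sketch below is plain Python with a made-up header; names_to_indices and all column names in it are hypothetical. With pandas, df.columns.get_loc(name) performs the same lookup for a single column name.

```python
def names_to_indices(header, selected_names):
    """Map selected column names to their positions in the header, sorted."""
    position = {name: i for i, name in enumerate(header)}
    return sorted(position[name] for name in selected_names)


# Hypothetical header, for illustration only.
header = ["id", "state_a", "state_b", "cli", "ili", "wearing_mask", "tested_positive"]
selected = ["tested_positive", "cli", "ili"]

feat_idx_example = names_to_indices(header, selected)
print(feat_idx_example)  # [3, 4, 6]
```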
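3. Use L2 regularization
With weight_decay set to 0.01, 0.001, 0.002, or 0.0005, the results were actually worse than without L2 regularization. At 0.0001 there was a slight improvement over no regularization, with a score of 0.93275. Lowering it to 0.00005 again gave worse results than no regularization.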
optimizer = torch.optim.SGD(model.parameters(), lr=config['learning_rate'], momentum=0.9,weight_decay=0.01)
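In torch.optim.SGD, weight_decay adds weight_decay * w to each parameter's gradient before the momentum update, which corresponds to an L2 penalty of (weight_decay / 2) * w**2 in the loss. A minimal single-parameter sketch of that update in plain Python, assuming zero dampening and no Nesterov momentum (the defaults). The function sgd_step and the numbers are illustrative only.

```python
def sgd_step(w, grad, buf, lr, momentum=0.9, weight_decay=0.0):
    """One SGD-with-momentum step on a scalar weight, with L2 weight decay."""
    g = grad + weight_decay * w   # the L2 term pulls w toward zero
    buf = momentum * buf + g      # momentum buffer (starts at 0.0)
    return w - lr * buf, buf


# Same weight and gradient, with and without weight decay.
w_plain, _ = sgd_step(2.0, 0.5, 0.0, lr=0.1)
w_decay, _ = sgd_step(2.0, 0.5, 0.0, lr=0.1, weight_decay=0.01)
print(w_plain, w_decay)
```

The decayed weight ends up slightly closer to zero after the same step. A decay that is too strong shrinks the weights more than the data warrants, which would be consistent with the larger values performing worse here.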
4. Use Sigmoid instead of ReLU
Sigmoid converged noticeably more slowly than ReLU and gave very poor results in this setting. This was a failed attempt.
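One commonly cited reason for slower convergence is that the sigmoid's derivative never exceeds 0.25 and shrinks toward zero as |x| grows, whereas ReLU's derivative is 1 for every positive input. Whether this fully explains the result here was not tested; the sketch below only illustrates the gradient magnitudes.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (0.0, 2.0, 5.0):
    print(f"x={x}: sigmoid'={sigmoid_grad(x):.4f}, relu'={relu_grad(x):.1f}")
```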
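5. Increase the number of network layers
A failed attempt. Increasing the number of hidden layers from one to two gave a private score (the final score) of 0.94437, although the public score was much better at 0.87614. Changing weight_decay to 0.001 brought a slight improvement in the private score: private score 0.94332, public score 0.89386.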
self.layers = nn.Sequential(nn.Linear(input_dim, 64),nn.ReLU(),nn.Linear(64, 16),nn.ReLU(),nn.Linear(16, 8),nn.ReLU(),nn.Linear(8, 1))
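A quick way to gauge how much capacity the stack above has is to count its trainable parameters: each nn.Linear(n_in, n_out) holds n_in * n_out weights plus n_out biases. The sketch below does the arithmetic in plain Python, assuming input_dim is 61, the length of the feat_idx list from attempt 2. The helper count_linear_params is hypothetical.

```python
def count_linear_params(layer_sizes):
    """Total weights and biases of a chain of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


input_dim = 61  # assumption: len(feat_idx) from the correlation-based selection
total = count_linear_params([input_dim, 64, 16, 8, 1])
print(total)  # 5153
```

A larger parameter count leaves more room to fit the public split closely, which might be one explanation for the gap between the public and private scores, though this was not verified.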
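Summary
The best result came from using the features whose correlation coefficient exceeds 0.8 together with L2 regularization at weight_decay = 0.0001, giving a score of 0.93275.
Few changes to the network architecture were tried and all of them failed, though in principle a better-suited architecture should be able to achieve better results. In addition, Sigmoid performed very poorly as the activation for this regression task.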