LLM-related code notes
10. The DPO loss always starts at about 0.7, and the KTO loss always starts at 0.5
dpo_loss = -F.logsigmoid(self.args.dpo_beta * (pi_logratios - ref_logratios))
kto_loss = 1 - F.sigmoid(self.args.kto_beta * (chosen_logratios - KL))
- At the start of training ref_model = model, so the argument of each sigmoid is 0, and therefore:
- $\text{dpo\_loss} = -\log\sigma(\beta \cdot 0) = \log 2 \approx 0.6931$
- $\text{kto\_loss} = 1 - \sigma(\beta \cdot 0) = 0.5$
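The two starting values can be checked numerically. The following is a minimal sketch using scalar stand-ins for the tensors in the formulas above; the helper names `sigmoid`, `dpo_loss` and `kto_loss` and the sample log-ratio value are illustrative assumptions, not taken from the original training code.

```python
import math

def sigmoid(x: float) -> float:
    # Logistic function, scalar stand-in for F.sigmoid
    return 1.0 / (1.0 + math.exp(-x))

def dpo_loss(beta: float, pi_logratios: float, ref_logratios: float) -> float:
    # Scalar version of -F.logsigmoid(beta * (pi_logratios - ref_logratios))
    return -math.log(sigmoid(beta * (pi_logratios - ref_logratios)))

def kto_loss(beta: float, chosen_logratios: float, kl: float) -> float:
    # Scalar version of 1 - F.sigmoid(beta * (chosen_logratios - KL))
    return 1.0 - sigmoid(beta * (chosen_logratios - kl))

# With policy == reference, both models produce the same log-ratios,
# so the sigmoid argument is 0 regardless of beta.
same_logratio = -1.3  # arbitrary illustrative value shared by both models
for beta in (0.1, 0.5, 1.0):
    print(beta, dpo_loss(beta, same_logratio, same_logratio), kto_loss(beta, 0.0, 0.0))
```

Because the sigmoid argument is exactly 0 at initialization, the starting loss does not depend on the choice of beta; a first-step loss far from these values may point to a mismatch between the policy and the reference model.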