About the auxiliary loss function in the MoE version. #110
Comments
Yes, for simplicity this part of the loss was not included in training 😊
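Thanks! If it does need to be included in training, should the implementation store each layer's loss and compute gradients on it together with the final CE loss?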
Yes, you only need to take each layer's
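For readers following the thread, here is a minimal sketch of the idea discussed above: compute one auxiliary loss per MoE layer, sum them, and add the scaled sum to the cross-entropy loss. It is written in plain Python rather than the project's framework, and it is not the repository's code. The function names, the `aux_coef` weight, and the Switch-Transformer-style load-balancing formula are all assumptions made for illustration; the project's `aux_loss` may be defined differently.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for a single token: -log softmax(logits)[target]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def load_balancing_loss(router_probs, top1_experts, num_experts):
    """Hypothetical per-layer auxiliary loss (Switch-Transformer style).

    router_probs: one list of routing probabilities per token.
    top1_experts: index of the expert chosen for each token.
    """
    num_tokens = len(router_probs)
    # f[i]: fraction of tokens dispatched to expert i
    f = [top1_experts.count(i) / num_tokens for i in range(num_experts)]
    # p[i]: mean routing probability assigned to expert i
    p = [sum(tok[i] for tok in router_probs) / num_tokens
         for i in range(num_experts)]
    return num_experts * sum(fi * pi for fi, pi in zip(f, p))


def total_loss(ce_loss, per_layer_aux_losses, aux_coef=0.01):
    """Add the summed per-layer auxiliary losses to the CE loss.

    aux_coef is an assumed weighting hyperparameter.
    """
    return ce_loss + aux_coef * sum(per_layer_aux_losses)


# Two MoE layers, four tokens, two experts (made-up routing data).
balanced = load_balancing_loss([[0.5, 0.5]] * 4, [0, 1, 0, 1], 2)
collapsed = load_balancing_loss([[1.0, 0.0]] * 4, [0, 0, 0, 0], 2)
ce = cross_entropy([2.0, 0.5, -1.0], target=0)
loss = total_loss(ce, [balanced, collapsed])
```

In an autograd framework the same structure would apply: each layer exposes its auxiliary loss as a tensor, and calling the backward pass on the combined total propagates gradients through both the CE term and the router terms.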
Hello, the aux_loss here does not appear to be used anywhere? Or does it take part in training in some other way?