-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project ZeroTangent to natural tangent for some number types #574
base: main
Are you sure you want to change the base?
Conversation
49bc8ac
to
81736ce
Compare
The purpose of splitting this out into two commits is to show which tests fail and (along with the corresponding behaviour) would need to change. This comment in particular worries me, but the parent PR #391 is a bit too intimidating to wade through looking for more context. |
Codecov Report
@@ Coverage Diff @@
## main #574 +/- ##
==========================================
- Coverage 93.15% 93.04% -0.11%
==========================================
Files 15 15
Lines 891 892 +1
==========================================
Hits 830 830
- Misses 61 62 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
I'm not sure about this. I think we were doing this in the original draft of ProjectTo and removed it. @willtebbutt @mcabbott thoughts? |
complex_tangent = Tangent{ComplexF64}(; re=1, im=NoTangent()) | ||
@test ProjectTo(1.0f0 + 2im)(complex_tangent) === 1.0f0 + 0.0f0im | ||
@test ProjectTo(1.0)(complex_tangent) === 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unreleated fix?
@@ -212,7 +212,7 @@ struct NoSuperType end | |||
@test ProjectTo(I)(123) === NoTangent() | |||
@test ProjectTo(2 * I)(I * 3im) === 0.0 * I | |||
@test ProjectTo((4 + 5im) * I)(Tangent{typeof(im * I)}(; λ = 6)) === (6.0 + 0.0im) * I | |||
@test ProjectTo(7 * I)(Tangent{typeof(2I)}()) == ZeroTangent() | |||
@test ProjectTo(7 * I)(Tangent{typeof(2I)}()) == 0.0I |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm not sure that this is a good thing.
Does this also happen to other matrix types?
This was certainly the intention. The strong zero should mean no further computation is done.
But why does this issue imply that we should not preserve AbstractZero? I don't see the connection at all, but maybe I miss something. Can you spell it out? There are some cases where not creating an AbstractZero seems like a good idea (e.g. for type-stability) but not preserving them seems odd to me. |
The sequence of events was that fixing FluxML/Zygote.jl#1284 required switching keyword argument indexing to use (a modified) path for NamedTuples instead of Dicts. However, this breaks https://github.com/FluxML/Zygote.jl/blob/v0.6.43/test/features.jl#L528 because Based on this discussion, it seems like the two options are to change the test and call the existing behaviour a bug, or implement ad-hoc projection logic just for that particular adjoint. I am certainly not qualified to engage in the discussion on zero types brought up in this PR :) |
That test was added fairly recently, in FluxML/Zygote.jl#1059 . I'd probably expect a hard Zero there (i.e. |
Currently, structured zero tangents are allowed to pass through during projection for (I believe) all number types. This was found while trying to write a rrule that passes https://github.com/FluxML/Zygote.jl/blob/v0.6.43/test/features.jl#L528 (itself found while working on FluxML/Zygote.jl#1284). This PR tries to take a conservative first step towards coercing those zeroes back to natural tangents by focusing on a tangent type which is unambiguously zero (
ZeroTangent
) and a relatively easy-to-comprehend numeric subspace (theReal
s).