WIP: Async and deferred loaders #173
base: main
Conversation
lib/graphql/batch/executor.rb
Outdated
@@ -53,6 +53,19 @@ def resolve(loader)
      @loading = was_loading
    end

    def defer(_loader)
This isn't used; is it needed at all?
Yes, probably not needed; I added it to make it similar to `resolve`.
lib/graphql/batch/loader.rb
Outdated
@@ -126,6 +136,36 @@ def fulfilled?(key)
      promise.pending? && promise.source != self
    end

    def perform_on_wait(keys)
If a loader opts into using `perform_on_wait`, should `defer` then always be called in `perform`? Or is there a situation where `defer` wouldn't be needed?
Hmm, good question.

I can't think of any such situation. Given that the `perform_on_wait` hook is always called first, all those async loaders will have their opportunity to run first, and then they can always defer and run at the very end. Perhaps we could simplify it so that the loader doesn't have to call `defer`; that's done automatically. Perhaps a new module that a loader can include, like `AsyncAndDeferred`, which adds the `on_any_wait` hook etc.

Not entirely sure that's better.
I like that better, especially because it avoids the potential footgun of forgetting to call `defer` manually. @amomchilov also mentioned expressing this as a separate class, so maybe the module is the solution.
I pushed a commit with a `GraphQL::Batch::Async` module. What do you think, @amomchilov @swalkinshaw?
General idea

I really like this feature idea, and it's definitely addressing a real need that we've repeatedly encountered.

Having the parallelism limited to "within" the batch loader has been a real issue, one that unnecessarily holds back the execution of the promises for already-done work until all work is done.

Documentation considerations

These are really tricky things that I believe would be difficult for most devs to approach without significant prior experience in this area.

E.g. this is a really nuanced distinction that I only understand because I've been in these particular weeds before :D

> Actually, the http_loader example isn't great, since it's async/concurrent but not batching. It could start the futures in `load`, which is in the eager phase of resolution. This PR allows for batch loaders that are both async/concurrent and batching, and also concurrent with each other.

I think adding more method-level docs on the relevant pieces would help, but also more complete "guide"-style docs, like in the README.

Implementation considerations

I wonder if these different behaviours could be better expressed as two separate types? Naming is hard, but perhaps something like `StreamableLoader` vs `BatchLoader`?

Would love to hear your thoughts! Thanks for working on this.
      # At this point, all of the concurrent work has been started.

      # This converges back in, waiting on each concurrent future to finish, and fulfilling each
      # (non-concurrent) Promise.rb promise.
      operations.each_with_index.each do |operation, index|
-       fulfill(operation, futures[index].value) # .value is a blocking call
+       fulfill(operation, futures[index].value!) # .value is a blocking call
Suggested change:
-       fulfill(operation, futures[index].value!) # .value is a blocking call
+       fulfill(operation, futures[index].value!) # .value! is a blocking call
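The fan-out/fan-in shape of the diff above can be sketched with core Ruby threads standing in for the concurrent-ruby futures the example uses (in concurrent-ruby, `value!` re-raises a failed future's error, while `value` returns nil). The operation values and "work" here are invented for illustration:

```ruby
# Minimal sketch of the pattern in the diff above, using core Threads in
# place of concurrent-ruby futures. Thread#value, like a future's value!,
# blocks until the thread is done and re-raises any error it raised.
operations = [1, 2, 3]

# Fan out: start all of the concurrent work before blocking on any of it.
futures = operations.map do |op|
  Thread.new { op * 10 } # stand-in for an async HTTP request
end

# Converge back in: block on each result, in order, and record it.
results = {}
operations.each_with_index do |operation, index|
  results[operation] = futures[index].value # blocking call
end
results # => {1 => 10, 2 => 20, 3 => 30}
```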
examples/http_loader.rb
Outdated
  end

- def perform(operations)
+ def perform_on_wait(operations)
Should this perhaps be:
- def perform_on_wait(operations)
+ def perform_on_any_wait(operations)

To match the naming of `Loader#on_any_wait`.
lib/graphql/batch/loader.rb
Outdated
@@ -126,6 +136,36 @@ def fulfilled?(key)
      promise.pending? && promise.source != self
    end

    def perform_on_wait(keys)
      # FIXME: Better name?
      # Interface to add custom code to e.g. trigger async operations when any loader starts waiting.
Could you move these comments into the method docs above the `def`? That way they show on hover, in generated docs, etc.
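As a hypothetical illustration of that suggestion (the class name and comment wording here are invented for the example), the inline comments would move above the `def` so tooling picks them up:

```ruby
# Hypothetical illustration of the suggestion: explanatory comments sit
# above the def, where editors and documentation generators find them.
class ExampleLoader
  # Hook for custom code, e.g. to trigger async operations when any
  # loader starts waiting. (Name not final; see the review discussion.)
  def perform_on_wait(keys)
    # no-op by default; subclasses override this
  end
end
```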
@@ -42,13 +42,14 @@ def current_executor
        end
      end

-     attr_accessor :loader_key, :executor
+     attr_accessor :loader_key, :executor, :deferred
I'm concerned about the `deferred` terminology being conflated with GraphQL `defer`, to which this isn't directly related. Can this remain consistent with the "async" terminology, or be expressed on some kind of lazy priority scale?
Yes, I agree. I'd like to use `yield` and `yielded`, which is what https://graphql-ruby.org/dataloader/parallelism uses, but that's because it uses fibers. We're conceptually "yielding control" here too.

WDYT @gmac @swalkinshaw @amomchilov?
Since this is an internal implementation detail now, why not just call it `async`? It's also more consistent with the `Async` module naming.

We could even get rid of this entirely and just filter out loaders that don't include `GraphQL::Batch::Async`? (Though maybe that's worse for perf.)
We can, though I think that's a worse API. I think `yield` is a nice fundamental API that can be used beyond Async. Async builds on top of that (and the "perform on any wait" thing).
Can you put a commit up showing what it looks like?
I won't have time to test this out before my last day at Shopify (tomorrow). Leaving it hanging here for someone to pick up or close.
Very WIP.
This PR introduces two new concepts to loaders:

- `perform_on_wait`, which triggers whenever any batch loader promise starts waiting. This allows async work to be triggered early in the lazy processing step (but still after the eager processing step has completed).
- `defer`, which tells the executor to resolve other non-deferred loaders before the current one, to allow non-async / non-concurrent loaders to run in a blocking manner while async loaders wait to block for their results until the last minute.

See the examples/http_loader.rb file for how you'd do this. (I haven't tested that it works. Names are not final.)
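A minimal sketch of how the two hooks could combine in a loader. Everything here is hypothetical: the class name and fake responses are invented, threads stand in for real async HTTP work, and the hook names mirror the PR but are not final:

```ruby
# Hypothetical sketch of a loader using the two hooks described above.
# Threads stand in for real async futures; names are not final.
class AsyncHttpLoader
  def initialize
    @futures = {}
  end

  # Triggered when any batch loader promise starts waiting: kick off the
  # async work early, without blocking on it.
  def perform_on_wait(keys)
    keys.each do |key|
      @futures[key] ||= Thread.new { "response for #{key}" }
    end
  end

  # With defer, this runs after the non-deferred loaders, so blocking on
  # the futures' results happens as late as possible.
  def perform(keys)
    perform_on_wait(keys) # make sure every key has a future started
    keys.to_h { |key| [key, @futures[key].value] } # .value blocks
  end
end
```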
This is inspired by https://graphql-ruby.org/dataloader/parallelism, which is a simpler solution, but based on fibers. Fibers are neat, but come with other issues (e.g. around thread/fiber local state).
Before this PR, batch loaders could do concurrent operations within themselves. With this PR, batch loaders can do concurrent operations across all batch loaders.
(Actually, the http_loader example isn't great, since it's async/concurrent but not batching. It could start the futures in `load`, which is in the eager phase of resolution. This PR allows for batch loaders that are both async/concurrent and batching, and also concurrent with each other.)
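A sketch of that alternative for a non-batching loader (hypothetical class; threads again stand in for futures, and the simplified `load` here returns the key rather than a real promise): starting the future in `load` moves the concurrent work into the eager phase, so `perform` only has to block for results.

```ruby
# Hypothetical sketch: a non-batching concurrent loader can start its
# future as soon as load is called (the eager phase) and only block for
# the result in perform. Simplified: load returns the key, not a promise.
class EagerHttpLoader
  def initialize
    @futures = {}
  end

  def load(key)
    # Start the concurrent work immediately, during the eager phase.
    @futures[key] ||= Thread.new { "response for #{key}" }
    key
  end

  def perform(keys)
    # Block for each result only at resolution time.
    keys.to_h { |key| [key, @futures[key].value] }
  end
end
```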