-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add JobIteration::DestroyAssociationJob #140
base: main
Are you sure you want to change the base?
Conversation
3c79b4a
to
6d790be
Compare
test/test_helper.rb
Outdated
def destroy | ||
update!(deleted: true) | ||
run_callbacks(:destroy) | ||
run_callbacks(:commit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not obvious for me why this was needed at first glance, figured I'd mention for reviewers: since rails/rails#41093 the before_destroy
callback adds the job class and params to the parent model _after_commit_jobs
array, then the after_commit
callback actually enqueues the job.
4a97f97
to
3af3112
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nice, thank you for building it!
One related question, not for this PR so sorry about that: This async destroy machinery still requires the association record IDs to be loaded at owner object deletion time, and included in the job params. If an owner is associated with tens or hundreds of thousands of records, that might not work. In Shop we have a few Shop Users with hundreds of thousands of orders, these can't all be loaded in a single query if they ever were to delete their account. It'd be ideal then to just pass the association_class
+ owner_model_name
+ owner_id
, and load all data in this job.
Sorry for rant, PR LGTM.
end | ||
|
||
def each_iteration(record, _params) | ||
record.destroy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be destroy!
? I believe that's not what's in the upstream Rails job, but otherwise some records might silently not be destroyed, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would get the job stuck if a single record cannot be deleted. It might be better to log the failures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging for sure, perhaps even a AS::Notification topic to subscribe to so it can be hooked up to services like Bugsnag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
destroy!
would halt the deletion of the remaining objects, raise an error (which would be reported to e.g. Bugsnag), and then retry. If we then fix the deletion validation bug, the retries would automatically proceed and no custom retry or maintenance job would be required. If we fix the deletion validation bug too late, after retries are exhausted, then a maintenance task needs to be run.
If we log or AS::Notification, then all services need to configure that to report to their Bugsnag equivalent, then the bug needs to be fixed, then some maintenance task needs to be run to delete what was missed.
Either way, we leave data in the DB that should be deleted. My preference is for destroy!
, leaving less manual toil, but I don't mind too much.
destroy
is what the Rails job does, so perhaps that's what should be done here. Perhaps it'd be even nicer to use the bang version on associations if the owner destroy was called with a bang.
That's not necessary, |
Fair point, worth opening an issue or PR on the Rails repo?
Sorry, I meant deleting in batches using |
@bdewater there's minimal benefit on using |
rails/rails#44617 addresses the pain point raised by @tgwizard and made me realize I still had this PR open collecting dust 😅 will circle back to this soon! |
@bdewater That PR is nice, though it doesn't solve the entire issue. If a Shop is deleted and the Shop has 1 million products, then ids.each_slice(owner.class.destroy_association_async_batch_size || ids.size) do |ids_batch|
enqueue_destroy_association( will still not work. There's no way to query for 1 million records in a timely manner within a single unit of work. We'd have to enqueue the job "delete all products for Shop X" instead of "delete all products in batch (1, 2, 3)" + "delete all products in batch (4, 5, 6)". |
Feels like the cascading deletion problem makes #63 relevant. |
Active Record 6.1 introduced the 'dependent: :destroy_async' option for associations with a configurable job to asynchronously destroy related models. This PR ports this job to use the Iteration API, making it interruptible and resumable.
3af3112
to
ad1e4c2
Compare
Active Record 6.1 introduced the
dependent: :destroy_async
option for associations with a configurable job to asynchronously destroy related models. This PR ports this job to use the Iteration API, making it interruptible and resumable.Open questions:
active_record_on_records
, should this job useactive_record_on_batches
with a batch size of e.g. 100?destroy_association_async_job
automatically?Closes #139