-
-
Notifications
You must be signed in to change notification settings - Fork 800
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document Django's async ORM API #2090
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @bigfootjon, yes, so good idea.
It's not quite as simple as folks just switching to use the async interface I think, because database_sync_to_async
also clears up old database connections (before and after the database interaction) mimicking Django's request life-cycle signal handlers (for request started and request finished).
We can adapt the advice here for Async Consumers.
For short-lived consumers we could say something like, "do that too!".
I need to have a think about longer-lived consumers, where is not 100% clear you want to hold an open DB connection the entire time... 🤔
(Need more coffee to think about that ☕️)
Maybe I should slightly tweak this PR then: introduce an async “clean_up_db_connections” decorator and then document that |
@bigfootjon I'm not 100% sure what we should say here. In theory you have one connection per thread, and in the pure async case you have just one thread. So, why not hold onto the connection? (Likely you'd need a way to recover from connection errors, and such but as a first pass...) Trouble is, we're still relying on the sync core underneath the async ORM API, so we're still going to be wanting to clean up connections. Without building some test projects I can't say exactly when and what is going to be needed there. I think that's what makes it interesting 😜 |
Makes sense. I’ll noodle on this for a bit and see what I can figure out (I only really have time to work on Django in my spare time, and I’ll be a bit limited on spare time for the next few weeks) |
@bigfootjon Slowly, slowly is the Django way. No stress! 😉 |
I've done a bit of tinkering and a bit of thinking and I think I've come up with a solution I'm happy with (though I'm eager to hear competing opinions). TheoryConceptually, the purpose of cleaning up connections is a necessary piece of the machinery of Django due to how connection health works within Django:
Based on my understanding, there are no "timers" within Django to periodically check connection health. Connection health is ensured at the start (and end) of a connection. So for channels, I think we want to clear connections at the beginning of a request ( The other time they would need a database connection would be at As a convoluted example, there might be changes to a server-side "file system" stored in Django Models where each Folder stores a ForeignKey ref to its parent folder. In response to some server-side event, the client would need updates on all folders in a path. As an implementation detail, the server might implement this as (pseudo-python):
In such a configuration closing the connection after every To summarize, CONN_MAX_AGEI think there's a problem with the solution outlined above: connections usually have a bounded lifetime. If a connection runs for forever the database connection might hit its CONN_MAX_AGE. In these events we should close the connection and make a new one. This would be handled automatically in the above life-cycle events except Imagine a scenario where a websocket is set up for client-side notifications of mutations to objects stored in the ORM. The developer has configured a consumer that does nothing on connect or receive, but every once in a while it wakes up (via an MQTT notification or something) and sends down the objects that it was notified about. Let's say there's some database query that must be done before sending down the object (like "is this content visible to the current user" checks) but these MQTT notifications are rare (1 hour or more). In this case, if the CONN_MAX_AGE on the database connection is less than an hour then we've violated the expectations of users by keeping their connections open longer than they specified. The implicit user contract in the above docs is that a connection will not be used after CONN_MAX_AGE. (ok ok I suppose that the max age could be reached during a traditional django request, but since traditional http requests are so fast I'm ignoring that). I don't really know how to solve this. I can see two solutions: A. Close Connections After First SendWhat if we close connections after the first The biggest problem is that if the connection is invalid that wouldn't help users as the first query they make would fail and an exception would be raised, which goes against user expectations ("I thought Django managed this for me"). The next biggest problem is that it has bad performance characteristics. If an user has a consumer that sends 2 I don't love this solution tbh. B. Document ItI think in the We can define a function (in channels or django core) that closes connections from async code and then document the contract that users need to uphold for themselves. The thesis of this documentation can be: call SummaryI'm vastly in favor of (B) and the implicit cleanup in |
Hey @bigfootjon -- thanks for putting your brain to it. Makes sense. I too lean toward B. If I've got a very long lived connection, I'm happy that it's my job to run a periodic task to clean up the connection object. Makes sense to me that way. |
424aada
to
4629311
Compare
Alright I've made a pass at integrating my ideas into Channels and documenting them! |
Hey @bigfootjon. I managed to get a look at this. Seems sound. I want to play with it a bit more but generally positive. Thinking currently about the migration path here. Say I'm currently using the existing wrapper, I guess we'll want to say "migrate, but the extra calls to close old connections won't hurt"? 🤔 |
Yeah I think that’s reasonable. My reading of the In any case, using the async versions of the ORM methods will be slightly faster in certain cases due to the ORM being able to sometimes hit caches before calling in to the sync DB internals, so there’s multiple incentives to use the newer method. |
Hey @bigfootjon -- I'm thinking this is a point release, so just working my way round to that. Not forgotten 😉 |
Hey @bigfootjon — I'm looking at #1091 in relation to this... — problem there is that code wrapped in (Adam has a mocking work around, making it a no-op) Fancy reading over that? Is it likely something exacerbated by what we're proposing here? It seems so... 🤔 I've had it in my notifications for a couple of weeks, but realise I should mention it to you, rather than sitting there not actually getting to look at it. 🤹 |
Yeah I can take a look! (Sometime over the next few days) |
No stress! 😅 I think we may need to tighten up the story there as part of this progress. Have a look, let me know what you think. |
Yeah I think I agree. I think #1091 will need to be resolved before we can move forward with this. My thinking is that since this PR introduces calls to I think the best path forward is https://code.djangoproject.com/ticket/30448 (if I understand the ticket correctly), but the So I see the rough work ahead as (ordered):
And we can close:
Then we could cut a release called something like "4.2: Better Async Django Support" once Django 5.1 is released (which would be needed before releasing #2093 in case anything changes in that API at the last minute before release) How does this plan sound? |
Hey @bigfootjon — yes, that sounds about right! (I'm head down lining up for DjangoCon Europe, but should have a bit more headspace after that. Do press on! 🎁) |
I've put up #2101 to tackle (1). I don't really have the bandwidth to pick up the upstream django ticket right now, but I'd like to see it done eventually. |
Hey @bigfootjon — right, I've come to the conclusion I'm just holding you up here with my lack of bandwidth to push this forwards. 🤹
As such I'm wondering if you'd accept joining as a Channels maintainer, so you can resolve issues, and we can push towards a release? I would be honoured to have you onboard. |
Hey @carltongibson, sorry for the slow reply (my American Independence Day celebrations ran a bit long). I'm definitely willing to become a maintainer, but some more info would be helpful to help me commit. I've emailed you at your GitHub account's public email address so we can discuss more directly and not dirty up this GitHub Issue with our discussion! |
4629311
to
412210c
Compare
@carltongibson: When you get a chance, this PR is ready for review. I'd particularly request a pass on the verbiage used in the documentation to make sure it has the right tone and explains what the user is expected to do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, great. Yes. 👍
Again, I think when we come to the release we need to lead with the changes here for folks updating.
Yep, I'm on the same page with you about messaging during the release |
The async ORM interface was added in 4.1! Since channels supports >=4.2 we can document that new API.
Consider this a first pass, I just updated the spots I noticed that looked like they needed tweaking, happy to take feedback on wordsmithing or different ideas on how to document this.
Closes: #1999