Memcache error #737
I'll try to recalculate all failed analyses when the heavy workload is over. Maybe this is related to limited resources and will go away when we move the workers to a more powerful machine.
For the published surface 944, "error 21" happened:
but also error 15.
This seems to be a memory issue, because if I retrigger single analyses after the heavy workload has ended, everything is fine and there are no more errors. So I'll shift this to be checked later; we need to start the workers of version 0.16.1 on extended resources.
Maybe this is related to this issue: antonagestam/collectfast#103. We could try to replace the cache client. Another idea: in all the examples here, the error happens while handling the progress meter.
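For illustration, a minimal sketch of what swapping the memcached client could look like on the Django side, assuming the project currently uses a pylibmc-based backend. The `PyMemcacheCache` backend is Django's built-in alternative (Django 3.2+); the `LOCATION` value and service name are placeholders, not taken from this project.

```python
# settings.py (sketch) -- backend names are Django's built-in memcached
# backends; the host/port below is a hypothetical service name.
CACHES = {
    "default": {
        # Alternative client (pymemcache) instead of a pylibmc-based one,
        # as one way to "replace the cache client" mentioned above.
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "memcached:11211",
    }
}
```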
I don't understand why thread safety is an issue. Django is not multi-threaded (at least in synchronous mode), if I understand correctly.
Yes. The tasks run under Celery, which may handle things differently. However, I also don't think this is an issue; I guess this is related to the progress meter (#755). I'll have a look now.
Indeed, this is not a memory error. It sometimes happens again on the new machine with significantly larger memory, and also for a small measurement. It does not happen deterministically: if I restart the related analyses, it has worked (so far) in two cases, and in both the jumping progress meter (#755) was involved.
Currently this happens again and again and makes some analyses fail.
Sounds good, please release.
Unfortunately, after installing 0.18, there are still problems causing a memcache error:
Shifting to 0.19.
This is probably related; similar issues appeared in the Zulip project: zulip/zulip#14456. This is maybe a blueprint for a fix:
From here: moby/moby#31208
Also here: vapor/postgres-kit#164 (comment)
Thanks. I'm already using this endpoint mode, at least for rabbitmq and memcached, but not yet for the celeryworker (there were problems); I will try again. I will definitely try the keepalive params, this looks promising!
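For reference, a minimal sketch of the kind of TCP keepalive parameters referred to in the linked issues, expressed as plain Python socket options (Linux). The concrete values are placeholders, and whether the memcached/broker client in use exposes these settings directly is an assumption.

```python
import socket

def open_keepalive_socket(host, port):
    """Open a TCP connection with keepalive probes enabled (Linux only).

    Sketch only: values are placeholders; a real client (memcached, AMQP,
    Redis) would normally set these through its own configuration options.
    """
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Start probing after 60 s idle, probe every 10 s, give up after 5 failures.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    return sock
```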
Although I'm using the keepalive parameters now, this still happens, at least when I try to generate thumbnails for the largest topography (8k x 8k):
Maybe related to zulip/zulip#534.
We should remove Selenium. Do you need help doing that?
Memcache and RabbitMQ have been replaced by Redis.
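For context, a minimal sketch of what the Redis-based setup could look like on the Django/Celery side. The host name, database numbers, and the use of Django's built-in Redis cache backend (Django 4.0+) are assumptions, not taken from this project.

```python
# settings.py (sketch) -- hostname and DB numbers are placeholders.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://redis:6379/1",
    }
}

# Celery broker and result backend pointed at the same Redis instance,
# assuming settings are picked up via the usual CELERY_ namespace.
CELERY_BROKER_URL = "redis://redis:6379/0"
CELERY_RESULT_BACKEND = "redis://redis:6379/0"
```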
Error message in production in some analysis results for the large surface UNCD:
Not sure yet what error 15 means. It happened during high load on the Celery workers.