-
Notifications
You must be signed in to change notification settings - Fork 516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sotw v3 server causing deadlock in LinearCache #530
Comments
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
I have opened #531 to address this. |
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
…roxy#530) Signed-off-by: Rueian <[email protected]>
…roxy#530) Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
no stalebot |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
no stalebot |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
no stalebot |
Signed-off-by: Rueian <[email protected]>
…roxy#530) Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
…roxy#530) Signed-off-by: Rueian <[email protected]>
Signed-off-by: Rueian <[email protected]>
I believe this issue has been solved in recent reworks changing the channel model of the server and cache. Can you confirm if this issue is still present in recent versions? |
The LinearCache will clear the watches under the
name
of changed resources innotifyAll
calls (L147):go-control-plane/pkg/cache/v3/linear.go
Lines 140 to 148 in 996a28b
However, just cleaning the the watches under the
name
is not enough. It needs to clean all the watches to the samechan Response
. Because the Sotw v3 server creates thechan Response
with only 1 buffer (L369):go-control-plane/pkg/server/sotw/v3/server.go
Lines 362 to 371 in 7e211bd
Consider the following sequence:
DiscoveryRequest
with 2 resource names, and calls thecache.CreateWatch
to the LinearCache.chan Response
provided by the sotw server with 2 watch entries corresponding to the requested resources.UpdateResource
is called with the first resource name.UpdateResource
is called with the second resource name and thechan Response
is blocked.DiscoveryRequest
but the LinearCache is still locked therefore they are deadlocked.The LinearCache could just maintain another
chan Response -> resource name
map for fast cleaning innotifyAll
call. But I think the root cause is that the sotw server uses a single goroutine to handle both streams of bi-di grpc.If it handled them in separated goroutines, there would be no such deadlock, and there might be even no need to recreate a new
chan Response
on eachDiscoveryRequest
and unregister watches on eachnotifyAll
call inLinearCache
.The text was updated successfully, but these errors were encountered: