The existing (*Memberlist).Join method can take a long time to complete for large clusters. The problem is exacerbated when some of the addresses to join are non-existent IPs and we end up waiting the TCPTimeout duration on each of them.
For example, we've observed in grafana/mimir that a full join initiated while most of the cluster members are restarting and changing IPs may take as long as 25 minutes. Nodes that are in the middle of a (*Memberlist).Join cannot be gracefully shut down until Join returns.
Proposal
Add a context.Context argument to (*Memberlist).Join and check it between the pushPull exchanges with each node.
Alternatively, if you don't want to break existing clients, we can add a new method, JoinContext, which does the above.
I'm creating this issue to get feedback on the idea. After discussion I am happy to open a PR.