Leadership election stuck in 3 node cluster in Candidate state. #472
Comments
Juju has also experienced this. At some point we'll have to dig into why it occurs. Our workaround is to remove the raft directories from each node in the cluster and seed (bootstrap) one of them with leader state. |
@SimonRichardson This now seems related to #31 (comment). Thanks @komuw |
It looks like checking for quorum is required to force a leader. Very interesting set of articles. |
Does the node that has only one peer (first config dump) ever learn about the other peer? I have a feeling that the entry was lost a long time ago for some other reason. |
@shou1dwe Yeah, it was aware of other peers. |
I read the code and seem to have found a problem:
// Check if we have an existing leader [who's not the candidate] and also
// check the LeadershipTransfer flag is set. Usually votes are rejected if
// there is a known leader. But if the leader initiated a leadership transfer,
// vote!
candidate := r.trans.DecodePeer(req.Candidate)
if leader := r.Leader(); leader != "" && leader != candidate && !req.LeadershipTransfer {
	r.logger.Warn("rejecting vote request since we have a leader",
		"from", candidate,
		"leader", leader)
	return
}
// Ignore an older term
if req.Term < r.getCurrentTerm() {
	return
}
|
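To make the quoted check easier to reason about in isolation, here is a minimal, self-contained sketch of the same decision order. It is not the library's code: `voteRequest`, `nodeState`, and `shouldRejectVote` are hypothetical names introduced for illustration, and the real handler does more than this (for example, comparing logs and persisting the vote).

```go
package main

import "fmt"

// voteRequest is a hypothetical, simplified stand-in for a RequestVote RPC.
type voteRequest struct {
	Term               uint64
	Candidate          string
	LeadershipTransfer bool
}

// nodeState is a hypothetical view of the node receiving the request.
type nodeState struct {
	CurrentTerm uint64
	Leader      string // empty when no leader is known
}

// shouldRejectVote applies the two checks in the same order as the quoted
// snippet and returns a short reason when the request is rejected.
func shouldRejectVote(n nodeState, req voteRequest) (bool, string) {
	// A known leader other than the candidate blocks the vote, unless the
	// request is part of a leadership transfer.
	if n.Leader != "" && n.Leader != req.Candidate && !req.LeadershipTransfer {
		return true, "known leader"
	}
	// Requests from an older term are ignored.
	if req.Term < n.CurrentTerm {
		return true, "older term"
	}
	return false, ""
}

func main() {
	follower := nodeState{CurrentTerm: 5, Leader: "node-a"}
	// Even with a newer term, the candidate is rejected while the follower
	// still believes node-a is the leader.
	rejected, reason := shouldRejectVote(follower, voteRequest{Term: 6, Candidate: "node-b"})
	fmt.Println(rejected, reason)
}
```

Under this simplified model, a follower that never clears its view of the leader keeps rejecting candidates regardless of their term, which may be the behaviour the comment above is pointing at.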
I'll have to page this back in, probably won't be that soon tbh. |
No problem. In Raft.replicateTo, the leader will end the term. |
Hey there, |
Had a similar issue after losing one peer.
Logs on the peer:
|
more debug logs:
|
@SimonRichardson Hi, I've added some context. Are there any new thoughts about these bugs? |
@kmlebedev Sorry for the late reply, is this something you're still dealing with? Could you create a new issue please, as we're not certain it's related to the original one? |
@ankur-anand Sorry to take so long to respond. We see two different raft configurations on the two nodes you listed; in the first instance the configuration is incomplete. In that case it's hard to see how the node could make correct decisions. Previously you replied to someone else asking about this, saying that the node was aware of other peers.
But the configuration it had clearly suggests otherwise: that first node only knew about one other peer. If you're still impacted by this issue, can you clarify why the configurations are different, as we think that's the root of the issue? |
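One way to spot this kind of divergence is to diff the peer lists reported by each node. The sketch below is only illustrative: `missingPeers` is a hypothetical helper, the server IDs are made up (the actual configuration dumps are not reproduced in this thread), and it assumes you have already collected each node's list of server IDs by whatever means your application exposes.

```go
package main

import (
	"fmt"
	"sort"
)

// missingPeers returns, in sorted order, the peers that appear in reference
// but not in other.
func missingPeers(reference, other []string) []string {
	seen := make(map[string]bool, len(other))
	for _, p := range other {
		seen[p] = true
	}
	var missing []string
	for _, p := range reference {
		if !seen[p] {
			missing = append(missing, p)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical server IDs, chosen only for this example.
	nodeA := []string{"server-1", "server-2"}
	nodeB := []string{"server-1", "server-2", "server-3"}

	// Peers that nodeB knows about but nodeA does not.
	fmt.Println(missingPeers(nodeB, nodeA))
}
```

A non-empty result for any pair of nodes means they disagree about cluster membership, which is the condition described above.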
I had a 3-node cluster. One node died suddenly. I expected one of the remaining nodes to become the leader. But according to the logs, the candidate is getting a vote from itself and requesting one from the dead node, but not from the other live server.
The Raft configuration from one of the servers is not listing the other live server.
The other server:
Library version used: github.com/hashicorp/raft v1.3.1
The config is the library's default config.
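The symptom described above can be reproduced with simple majority arithmetic. The following is a minimal sketch, not the library's election code: `quorumSize` and `canWin` are hypothetical helpers, the server IDs are invented, and it optimistically assumes that every reachable voter listed in the candidate's configuration grants its vote.

```go
package main

import "fmt"

// quorumSize is the number of votes needed for a strict majority of voters.
func quorumSize(voters int) int {
	return voters/2 + 1
}

// canWin reports whether a candidate could reach quorum. config is the set
// of voters the candidate knows about (including itself); reachable marks
// the servers that are alive. Servers missing from config are never asked.
func canWin(self string, config []string, reachable map[string]bool) bool {
	votes := 0
	for _, id := range config {
		if id == self || reachable[id] {
			votes++
		}
	}
	return votes >= quorumSize(len(config))
}

func main() {
	// server-3 has died; server-2 is alive.
	reachable := map[string]bool{"server-2": true}

	// Incomplete configuration: the candidate only knows itself and the
	// dead node, so it needs 2 votes but can only ever collect 1.
	fmt.Println(canWin("server-1", []string{"server-1", "server-3"}, reachable))

	// Complete configuration: 2 of 3 votes are available.
	fmt.Println(canWin("server-1", []string{"server-1", "server-2", "server-3"}, reachable))
}
```

With the incomplete configuration the candidate can never reach a majority, which would be consistent with a node staying in the Candidate state indefinitely.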