Some ideas about leader election_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-03-08 11:05 出处：网络

I am trying to perform leader election. These days I am thinking of using a key-value store to realize that but I am not quite sure if the idea is reliable as for scalability and consistency issues. The real deployment will have thousands of nodes and the election should take place without any central authority or service like zookeeper.

Now that, my question is:

Can I use a key-value store(preferably C-A tunable like riak) to perform the leader election? What are the possible pros/cons of utilizing a KV store for leader election?

Thanks!

EDIT: I am not interested in bully algo开发者_运维技巧rithm approach anymore.

A key-value store that does not guarantee consistency (like Riak) is a bad way to do this because you could get two nodes that both think (with reason!) that they are the new leader. A key-value store that guarantees consistency won't guarantee availability in the event of problems, and availability is going to be compromised exactly when you've got problems that could cause the loss of nodes.

The way that I would suggest doing this for thousands of nodes is to go from a straight peer to peer arrangement with thousands of nodes to a hierarchical arrangement. So have a master and several groups. Each incoming node is assigned to a group, which assigns it to a subgroup, which assigns it to a sub-sub group until you find yourself in a sufficiently small peer group. Then master election is only held between the leaders of the groups, and the winner gets promoted from being the leader of that group. If the leader of a group goes away (possibly because of promotion), a master election between the leaders of its subgroups elects the new leader. And so on.

If the peer group gets to be too large, say 26, then its master randomly splits it into 5 smaller groups of 5 peers each, with randomly assigned leaders. Similarly if a peer group gets too small, say 3, then it can petition its leader to be merged with someone else. If the leader notices that it has too few followers, say 3, then it can tell one of them to promote its subgroups to full groups, and to join one of those groups. You can play with those numbers, depending on how much redundancy you need.

This will lead to more elections, but you'll have massively reduced overhead per election. This should be a very significant overall win. For a start randomly confused nodes won't immediately start polling thousands of peers, generating huge spikes in network traffic.