High Availability

Run backup nodes

Your IoTeX node can go down for many reasons: code bugs, insufficient compute resources, host failure, network failure, and so on. You may also have to shut the node down for maintenance or to update the node software itself.

Running multiple IoTeX nodes is the best way to keep your delegate highly available (ideally with zero downtime). For that reason, IoTeX nodes provide a utility feature that lets them run in "backup mode".

Configure the main and backup nodes

Let's assume that you want to run three nodes for your block producer delegate: one is the main node, i.e. the one that actively participates in consensus, and the other two are on "standby" and only listen for blocks.

You can configure all three nodes with the same producerPrivKey private key setting, as long as you also add the following setting to config.yaml on every node:

...
network:
  ...
  masterKey: producer_private_key-replica_id
  ...
...

while you add this setting for the main node only:

...
system:
  ...
  active: true
  ...
...

and this one for the standby nodes only:

...
system:
  ...
  active: false
  ...
...
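
Before rolling out the configuration, it can help to check that the planned layout marks exactly one node as active. The following is a minimal sketch of such a check, not part of the IoTeX tooling: the function names and the node names (node-a, node-b, node-c) are hypothetical, and only the system.active setting shown above is taken from this page.

```python
def pick_active(planned):
    """Return the name of the single node planned as active.

    `planned` maps a (hypothetical) node name to the value intended for
    system.active. Raises ValueError unless exactly one node is active.
    """
    active = [name for name, is_active in planned.items() if is_active]
    if len(active) != 1:
        raise ValueError(
            f"expected exactly one active node, found {len(active)}: {active}"
        )
    return active[0]


def system_fragment(is_active):
    """Render the system section shown above for one node."""
    return "system:\n  active: " + ("true" if is_active else "false") + "\n"


# Hypothetical three-node layout: one main node and two standby nodes.
layout = {"node-a": True, "node-b": False, "node-c": False}
main_node = pick_active(layout)
```

Here pick_active(layout) returns "node-a", and a layout with zero or several active nodes raises ValueError instead of being written to config.yaml.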

Switch the active mode

As soon as your main node goes down, make sure you manually switch its mode to "standby" in the configuration file (this reduces the risk of running two active delegates, see below). Then open the following URL to turn one of your backup nodes from standby into active mode:

http://ip-to-one-node:9009/ha?activate=true

Similarly, you can turn an active node into standby mode by using:

http://ip-to-one-node:9009/ha?activate=false

Finally, if you only want to check the active status of a node, use the URL below:

http://ip-to-one-node:9009/ha
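
The three URLs above can also be called from a script instead of a browser. The sketch below uses only the Python standard library and assumes that a plain HTTP GET is enough, as opening the URL in a browser suggests. The helper names ha_url and call_ha are hypothetical, and because the response format is not described here, the body is returned as raw text without parsing.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

HA_PORT = 9009  # port used in the URLs above


def ha_url(host, activate=None, port=HA_PORT):
    """Build the /ha URL for one node.

    activate=None  -> status check (no query string)
    activate=True  -> switch the node into active mode
    activate=False -> switch the node into standby mode
    """
    base = f"http://{host}:{port}/ha"
    if activate is None:
        return base
    return base + "?" + urlencode({"activate": "true" if activate else "false"})


def call_ha(host, activate=None, timeout=5.0):
    """Send a GET request to the node and return the response body as text.

    The body is returned unparsed because its format is an unknown here.
    Network errors (for example, a node that is down) propagate to the caller.
    """
    with urlopen(ha_url(host, activate), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, call_ha("ip-to-one-node", activate=True) requests the activation URL, while call_ha("ip-to-one-node") requests the status URL.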

Automatic Leader Election

If you run many nodes, and you want to get rid of the tedious manual operation, or just want to try out the fancy setup of a high availability cluster, you can take a look at this automatic leader election solution:
