
2. Start the HA server

Please keep in mind that plain HTTP should only be used for testing and debugging. Plain HTTP traffic can be intercepted and modified in transit to DDoS your cluster, trigger server reboots, or lock up the whole HA system, which then takes a lot of work to bring back to a normal state. Always use some form of encryption and avoid plain HTTP in production.

Debug mode

To start the HA server in debug mode, execute the command below:

hoster api start --ha-mode --ha-debug

Debug mode gives us the ability to test HA within our current deployment and answer questions like:

  • Is the network connection stable enough to support HA? (a quick manual check is sketched after this list)
  • Are the servers behaving in a predictable manner?
  • Would there be enough disk and network bandwidth to support replication and HA at the same time?
  • Etc.
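
One way to sanity-check the network points above before relying on HA is to measure latency and available bandwidth between the hypervisors directly. This is only a sketch: it assumes iperf3 is installed on both hosts (with iperf3 -s running on the peer), and 10.5.199.85 is just a placeholder taken from the log samples further down this page, so substitute one of your own node addresses.

# Placeholder peer address: substitute another node in your cluster
PEER=10.5.199.85

# Latency and packet loss over 60 seconds; sustained loss or large jitter
# is a bad sign for HA heartbeats
ping -c 60 "$PEER"

# Available bandwidth between the two hosts; compare the result against
# the replication traffic you expect to push
iperf3 -c "$PEER" -t 30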

In debug mode we can answer all of these questions, because any action that would have been applied in a production cluster is simply logged instead of being executed immediately:

  • The HA watchdog will gracefully exit, instead of rebooting the system, if the hoster_rest_api service has crashed
  • The HA cluster manager will log the cluster failover actions instead of executing them (failing over the VMs from an offline host to a healthy host)
  • Etc.
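
As a quick sanity check (a sketch, not part of the hoster CLI itself), you can confirm from the system log that both services actually came up in debug mode, and keep an eye on the host's uptime to make sure the watchdog is not rebooting it. The message text matches the sample log below; adjust the pattern if your version logs differently.

# Both hoster_rest_api and ha_watchdog should report a DEBUG start-up
grep 'started in DEBUG mode' /var/log/messages

# The host's uptime should keep growing while you test failure scenarios
uptime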

Check the HA logs:

tail -f /var/log/messages

It will give you an overview of what is currently happening in the cluster:

Sep 26 22:13:39 IM-HYP01 HOSTER_HA_REST[76537]: INFO: node failover time is: 20 seconds
Sep 26 22:13:39 IM-HYP01 HOSTER_HA_REST[79124]: DEBUG: hoster_rest_api started in DEBUG mode
Sep 26 22:13:39 IM-HYP01 HOSTER_HA_REST[80879]: DEBUG: hoster_rest_api service start-up
Sep 26 22:13:39 IM-HYP01 HOSTER_HA_WATCHDOG[82430]: DEBUG: ha_watchdog service start-up
Sep 26 22:13:39 IM-HYP01 HOSTER_HA_REST[83634]: DEBUG: ha_watchdog started in DEBUG mode
Sep 26 22:13:46 IM-HYP01 HOSTER_HA_REST[87767]: INFO: registered a new node: IM-HYP03
Sep 26 22:13:48 IM-HYP01 HOSTER_HA_REST[89926]: INFO: registered a new node: IM-HYP02
Sep 26 22:13:49 IM-HYP01 HOSTER_HA_REST[92037]: INFO: registered a new node: IM-HYP01
Sep 26 22:13:49 IM-HYP01 HOSTER_HA_REST[92641]: SUCCESS: joined the candidate: IM-HYP01
Sep 26 22:13:49 IM-HYP01 HOSTER_HA_REST[94813]: SUCCESS: joined the candidate: IM-HYP02
Sep 26 22:13:49 IM-HYP01 HOSTER_HA_REST[97049]: SUCCESS: joined the candidate: IM-HYP03
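
On a busy hypervisor /var/log/messages also carries plenty of unrelated entries. A simple filter (a sketch based on the HOSTER_HA_REST and HOSTER_HA_WATCHDOG prefixes visible above) keeps only the HA lines in the live stream:

tail -f /var/log/messages | grep -E 'HOSTER_HA_(REST|WATCHDOG)'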

Check the API logs:

hoster api show-log

It will give you an overview of the latest API endpoint connections (usually /api/v1/ha/ping and /api/v1/ha/register when it comes to HA):

2023-09-27_14-12-32 || 10.5.199.81:56921 || 200 || POST || /api/v1/ha/ping || 20.323µs || bytesSent: 18
2023-09-27_14-12-32 || 10.5.199.85:43967 || 200 || POST || /api/v1/ha/ping || 13.159µs || bytesSent: 18
2023-09-27_14-12-32 || 10.5.199.33:35179 || 200 || POST || /api/v1/ha/ping || 15.632µs || bytesSent: 18
2023-09-27_14-12-34 || 10.5.199.81:63889 || 200 || POST || /api/v1/ha/ping || 20.294µs || bytesSent: 18
2023-09-27_14-12-34 || 10.5.199.85:24646 || 200 || POST || /api/v1/ha/ping || 11.942µs || bytesSent: 18
2023-09-27_14-12-34 || 10.5.199.33:25478 || 200 || POST || /api/v1/ha/ping || 8.734µs || bytesSent: 18
2023-09-27_14-12-36 || 10.5.199.81:35923 || 200 || POST || /api/v1/ha/ping || 25.108µs || bytesSent: 18
2023-09-27_14-12-36 || 10.5.199.85:13987 || 200 || POST || /api/v1/ha/ping || 11.294µs || bytesSent: 18
2023-09-27_14-12-36 || 10.5.199.33:55854 || 200 || POST || /api/v1/ha/ping ||  7.71µs || bytesSent: 18
2023-09-27_14-12-38 || 10.5.199.81:31933 || 200 || POST || /api/v1/ha/ping || 22.195µs || bytesSent: 18
2023-09-27_14-12-38 || 10.5.199.85:28218 || 200 || POST || /api/v1/ha/ping || 12.353µs || bytesSent: 18
2023-09-27_14-12-38 || 10.5.199.33:38328 || 200 || POST || /api/v1/ha/ping || 14.97µs || bytesSent: 18
2023-09-27_14-12-40 || 10.5.199.81:37120 || 200 || POST || /api/v1/ha/ping || 24.954µs || bytesSent: 18
2023-09-27_14-12-40 || 10.5.199.33:35276 || 200 || POST || /api/v1/ha/ping || 8.544µs || bytesSent: 18
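
Once the pings are flowing, the interesting lines are the ones that are not HTTP 200. The sketch below assumes hoster api show-log writes the log to standard output and that the field layout matches the sample above (fields separated by " || ", with the status code in the third column):

# Show only requests that did not return HTTP 200
hoster api show-log | awk -F' [|][|] ' '$3 != 200'

# Show only the node registration calls
hoster api show-log | grep '/api/v1/ha/register'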

Production mode

After observing debug mode for some period of time, and provided no issues were found in the process, you can move to production:

hoster api start --ha-mode

You can still view the same logs and observe the changes in the cluster that way, but now, if any issues are spotted, the problematic node will be fenced off from the rest of the cluster (rebooted, then left waiting for your manual intervention to add it back to the cluster), and its VMs/Jails will be automatically migrated to other, healthy members.
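
A cheap ongoing check in production (again just a sketch, built on the log format and endpoints shown earlier) is to confirm that heartbeats from every node keep arriving; the count of distinct source addresses should match the number of nodes in your cluster (three in the examples above):

# Distinct nodes seen among the 50 most recent heartbeat requests
hoster api show-log | grep '/api/v1/ha/ping' | tail -n 50 | awk -F' [|][|] ' '{print $2}' | cut -d: -f1 | sort -u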