UI Slowness, Nodes temporarily showing Offline, or Instances temporarily showing Error

Scenarios

  1. The UI is very slow to load
  2. Node statuses are delayed or alternate between states seemingly at random
  3. Instance states are delayed or alternate between states seemingly at random

To confirm, curl the controller’s api/v1/stats endpoint and check whether average_queue_time is greater than 5.
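
The check above can also be scripted. The following is a minimal sketch, not an official client: it assumes the stats endpoint returns JSON and that average_queue_time appears somewhere in that JSON as a number. Because the exact response layout is not described here, the helper searches the whole response for the key. The function names are hypothetical.

```python
import json
import urllib.request


def find_key(obj, key):
    """Return the first value stored under `key` anywhere in nested JSON data, or None."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = obj.values()
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def queue_is_backed_up(stats, threshold=5.0):
    """True when average_queue_time exceeds the threshold used in this article."""
    value = find_key(stats, "average_queue_time")
    if value is None:
        raise KeyError("average_queue_time not found in stats response")
    # Assumption: the value is numeric (or a numeric string).
    return float(value) > threshold


def fetch_stats(base_url):
    """Fetch and decode the controller's api/v1/stats response (assumed to be JSON)."""
    url = base_url.rstrip("/") + "/api/v1/stats"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, queue_is_backed_up(fetch_stats("http://anka.controller:8090")) would report whether the queue is backed up. The URL is a placeholder; substitute your own controller address.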

Common Causes

  1. A large number of VMs and Nodes without enough load balancing or enough instances of the controller, etcd, etc.
  2. etcd is very sensitive to disk latency, so etcd storage that is not on SSDs can cause these symptoms

Solutions

  1. Increase the etcd space quota from the default of 2GB: [https://etcd.io/docs/current/op-guide/maintenance/#space-quota]
  2. Rejoin your nodes with a higher --heartbeat value (> 20s)
  3. Upgrade the host’s disk to an SSD or faster disk
  4. Set ANKA_NUM_WORKERS to more than the default 2 in the Controller configuration.
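
For the space quota in solution 1, etcd takes the value in bytes through its --quota-backend-bytes flag, so a size in GiB has to be converted first. The sketch below only does that arithmetic and formats the flag. The helper names are hypothetical, the 8 GiB figure is just an illustration, and how the flag is passed to etcd depends on your deployment.

```python
def quota_backend_bytes(gib):
    """Convert a quota in GiB to the byte count etcd expects."""
    return int(gib * 1024 ** 3)


def etcd_quota_flag(gib):
    """Format the etcd command-line flag for a quota of `gib` GiB."""
    return f"--quota-backend-bytes={quota_backend_bytes(gib)}"


if __name__ == "__main__":
    # Illustrative value only; choose a quota that fits your disk and workload.
    print(etcd_quota_flag(8))
```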

State changes can also be caused by network issues between the Node and the Controller. Check the Node’s /var/log/veertu/anka_agent.ERROR log for timeouts or connection errors.
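
One way to scan that log is sketched below. The search patterns are assumptions about how timeouts and connection errors might be worded, not a list of the agent's actual messages, so adjust them to match what you see in your own log. The function name is hypothetical.

```python
import re

# Assumed wording for network-related errors; not taken from the agent's real output.
NETWORK_ERROR = re.compile(
    r"timeout|timed out|connection (refused|reset|error)",
    re.IGNORECASE,
)


def network_error_lines(lines, limit=20):
    """Return up to `limit` of the most recent lines that look like network errors."""
    matches = [line.rstrip("\n") for line in lines if NETWORK_ERROR.search(line)]
    return matches[-limit:]
```

To use it on a Node, open the log and pass the file object in, for example: with open("/var/log/veertu/anka_agent.ERROR", errors="replace") as log: print("\n".join(network_error_lines(log))).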

Still experiencing problems?

Talk to us! We are available via Slack or email.