Persistent Query Controller
The Persistent Query (PQ) Controller, or Controller for short, is the Deephaven service responsible for starting, monitoring, and stopping PQs according to their specified configuration.
The Controller is a critical service. PQs cannot be started without a Controller, and if the Controller service halts due to a critical failure, already-running PQs shut down. For risk management reasons, Deephaven does not allow PQs to execute outside Controller supervision, as doing so would prevent manual or automated intervention to stop them during a production contingency.
Services provided
- Creation, validation, and storage of PQ definitions.
- Runtime control of PQs: automated and on-demand PQ start and stop.
- Management of PQ load balancing and failover via PQ replicas and spares.
- Publication of PQ status and configuration changes to clients that subscribe to them.
Client authentication
Any program requiring Controller services must first authenticate with the authentication server. Once authenticated, the program requests an authentication token from the authentication server and presents it to the Controller to prove its credentials. From the client's perspective the token is an opaque array of bytes; the Controller validates it by making its own call to the authentication service to confirm the credentials. This exchange is referred to as the "three-way handshake" between a client, the authentication server, and another Deephaven service (in this case, the Controller).
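The sketch below illustrates the ordering of the handshake in Go. The `AuthClient` and `ControllerClient` interfaces and every method on them are hypothetical placeholders, not the real Deephaven client APIs; only the sequence of the three steps comes from the description above.

```go
// Package controllerauth sketches the three-way handshake; all types and
// method names below are hypothetical stand-ins for the real Deephaven
// client APIs.
package controllerauth

import (
	"context"
	"fmt"
)

// AuthClient stands in for the authentication server's client API (hypothetical).
type AuthClient interface {
	// Authenticate establishes an authenticated session with the
	// authentication server using the caller's credentials.
	Authenticate(ctx context.Context, user, password string) error
	// CreateToken returns an opaque token scoped to the named target service.
	CreateToken(ctx context.Context, service string) ([]byte, error)
}

// ControllerClient stands in for the Controller's client API (hypothetical).
type ControllerClient interface {
	// AuthenticateByToken presents the token; the Controller completes the
	// handshake by validating it with the authentication server itself.
	AuthenticateByToken(ctx context.Context, token []byte) error
}

// ConnectToController walks through the three steps of the handshake.
func ConnectToController(ctx context.Context, auth AuthClient, ctrl ControllerClient) error {
	// Step 1: the client authenticates with the authentication server.
	if err := auth.Authenticate(ctx, "query-user", "secret"); err != nil {
		return fmt.Errorf("authentication failed: %w", err)
	}
	// Step 2: the client requests a token for the Controller service.
	token, err := auth.CreateToken(ctx, "PersistentQueryController")
	if err != nil {
		return fmt.Errorf("token request failed: %w", err)
	}
	// Step 3: the client presents the token; the Controller validates it with
	// the authentication server before granting access.
	if err := ctrl.AuthenticateByToken(ctx, token); err != nil {
		return fmt.Errorf("controller rejected token: %w", err)
	}
	return nil
}
```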
Implementation of fault tolerance
The Controller service is implemented by a group of servers, one of which is the leader, and the rest, if any, are followers. The number of servers is part of the system configuration and cannot be changed while the system is running; that is, you cannot add an additional controller server while the system is up.
The servers run a leader election with the help of etcd. The elected leader acts as the sole service endpoint and handles all requests, while the remaining servers, if any, go into standby mode as followers, still running but not servicing requests. If the leader fails and there are active followers, one of them is promoted to leader and service continues uninterrupted, apart from the possibility of some client calls taking longer than usual (a new election should resolve in under 30 seconds).
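For readers curious what an etcd-backed election looks like in practice, the following Go sketch uses the standard go.etcd.io/etcd/client/v3/concurrency package. The endpoint, election key prefix, lease TTL, and candidate value are illustrative assumptions, not the Controller's actual configuration or keys.

```go
// A minimal leader-election sketch against etcd; values are illustrative only.
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// The session's lease keeps the leadership claim alive; if this process
	// dies, the lease expires and a follower can win the election.
	session, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	election := concurrency.NewElection(session, "/example/controller-leader")

	// Campaign blocks until this server becomes the leader; followers simply
	// wait here in standby until the current leader's lease is gone.
	if err := election.Campaign(context.Background(), "controller-1"); err != nil {
		log.Fatal(err)
	}
	log.Println("this server is now the leader; begin servicing requests")
}
```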
Note
The controller leader election does not itself execute a consensus algorithm, like the one implemented by the etcd servers themselves; it is performed with support from etcd. As such:
- There is no requirement that the number of controller servers be odd. The Controller service continues to operate as long as at least one controller server is available (which should automatically become the leader). There is no need for a quorum of "half plus one" active servers, as required by algorithms like Paxos or Raft.
- The suggested number of servers to run in production is three, which allows the service to survive one server failure while another server is down for maintenance, for a total of two servers down.