Authentication Service

The Deephaven Enterprise Authentication Service manages the authentication of interactive users, client programs presenting user credentials, and Persistent Queries to other internal services in the Deephaven cluster. Authentication is a critical service; a Deephaven cluster cannot operate without it.

Services provided

  • Interactive users can log in by presenting a user ID and password.
  • Interactive users can log in using a third-party authentication plugin (e.g., LDAP).
  • Client processes and Persistent Queries can authenticate as a given user using public/private key pairs and get a token.
  • Client processes and Persistent Queries can maintain authentication by refreshing the cookie they were issued; if the cookie is not refreshed within the configured period (10 minutes by default), they lose authentication. Refreshing happens programmatically and transparently, without user intervention, as long as network connectivity is not interrupted for longer than the credentials are valid.
  • Client processes and Persistent Queries can prove their identities and credentials to other Deephaven services (e.g., the Persistent Query Controller) to perform operations. For example, a Python client program can ask the Persistent Query Controller to create a Persistent Query on behalf of its authenticated user, for which the client needs to prove its credentials to the controller.

Client authentication protocol

Any program that wants to perform an operation in a Deephaven cluster needs to prove and maintain valid authentication credentials with an authentication server. Here, "program" refers to a web UI like a Deephaven console or a client application like a C++ client.

Authentication means presenting a user ID together with a means of proving ownership of that ID to the authentication server. Methods of proving ownership include:

  • presenting the user's password.
  • performing private key authentication: asking the server for a nonce and then signing the nonce with a key previously registered to that user in the system.
  • using a plugin for a third-party authentication mechanism, e.g., LDAP authentication to an Active Directory server.
  • presenting a valid token shared by another program that is already authenticated (more about tokens later in this document).
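
The nonce-based method above can be sketched as a two-call exchange. This is a minimal illustration, not the Deephaven protocol or API: the class and method names (`ToyAuthServer`, `get_nonce`, `challenge_response`) are hypothetical, and an HMAC over a shared key stands in for the public/private key signature so that the example runs with the Python standard library alone.

```python
import hashlib
import hmac
import secrets


def sign(key, nonce):
    """Stand-in for a private-key signature over the nonce (assumption: HMAC)."""
    return hmac.new(key, nonce, hashlib.sha256).digest()


class ToyAuthServer:
    """Hypothetical server that issues nonces and checks signed responses."""

    def __init__(self):
        self._registered_keys = {}  # user ID -> key registered for that user
        self._pending_nonces = {}   # user ID -> nonce awaiting a response

    def register_key(self, user_id, key):
        self._registered_keys[user_id] = key

    def get_nonce(self, user_id):
        # A fresh random nonce per attempt, remembered until it is answered.
        nonce = secrets.token_bytes(32)
        self._pending_nonces[user_id] = nonce
        return nonce

    def challenge_response(self, user_id, signature):
        # Pop the nonce so it is single-use: a replayed signature fails.
        nonce = self._pending_nonces.pop(user_id, None)
        key = self._registered_keys.get(user_id)
        if nonce is None or key is None:
            return False
        return hmac.compare_digest(sign(key, nonce), signature)


# Example exchange for a hypothetical user "alice".
server = ToyAuthServer()
alice_key = secrets.token_bytes(32)
server.register_key("alice", alice_key)

nonce = server.get_nonce("alice")
signature = sign(alice_key, nonce)
authenticated = server.challenge_response("alice", signature)
```

Because the nonce is random and consumed on first use, a captured signature cannot be replayed in a later attempt.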

On a successful authentication attempt, the server returns to the client an opaque array of bytes called a "cookie", along with an expiration time. Every subsequent call from the client to the server must present the cookie as proof of credentials. The protocol also requires the client to send the server a "refresh cookie" request before the expiration, which extends the cookie's validity for a new period. Periodic refreshes are thus a required part of the protocol, and they are how the server establishes that the authenticated program is still alive. The default expiration period for cookies is 10 minutes.
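
The cookie lifecycle described above can be illustrated with a small server-side table. This is a sketch under stated assumptions, not the service's implementation: `CookieTable` and its methods are hypothetical names, and the clock is injectable only so that expiry can be exercised without waiting.

```python
import secrets
import time

COOKIE_LIFETIME_SECONDS = 10 * 60  # default expiration period named in the text


class CookieTable:
    """Hypothetical server-side table of live cookies."""

    def __init__(self, lifetime=COOKIE_LIFETIME_SECONDS, clock=time.monotonic):
        self._lifetime = lifetime
        self._clock = clock
        self._expiry_by_cookie = {}  # cookie bytes -> expiration time

    def issue(self):
        """Called after a successful authentication attempt."""
        cookie = secrets.token_bytes(32)  # opaque array of bytes
        expiry = self._clock() + self._lifetime
        self._expiry_by_cookie[cookie] = expiry
        return cookie, expiry

    def is_valid(self, cookie):
        expiry = self._expiry_by_cookie.get(cookie)
        return expiry is not None and self._clock() < expiry

    def refresh(self, cookie):
        """Extend a still-valid cookie for a new period; reject anything else."""
        if not self.is_valid(cookie):
            self._expiry_by_cookie.pop(cookie, None)
            raise PermissionError("cookie expired or unknown; authenticate again")
        expiry = self._clock() + self._lifetime
        self._expiry_by_cookie[cookie] = expiry
        return expiry
```

In a design like this, a client would typically schedule its refresh well before the returned expiration (for example, at half the lifetime) to leave room for retries; that margin is a design choice of the sketch, not a documented Deephaven setting.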

Implementation of fault tolerance

The authentication service uses a model of fully symmetric active replicas. In the default configuration, clients calling into the service use a round-robin policy that contacts each available server in turn, which balances load across the replicas. If one server goes down, clients move on to the next server in the configured list that is still alive; as long as at least one server is up, authentication remains available, albeit at reduced capacity.
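
A client-side policy of this kind might look like the sketch below. It is illustrative only: `RoundRobinClient` and `AllServersDown` are hypothetical names, each server is modeled as a plain callable, and an unreachable server is assumed to surface as `ConnectionError`.

```python
class AllServersDown(Exception):
    """Raised when no server in the configured list answers."""


class RoundRobinClient:
    """Hypothetical client that rotates across servers and skips dead ones."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)  # configured list of servers
        self._next = 0                 # index of the next server in rotation

    def call(self, request):
        # Try each server at most once, starting from the next one in turn.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            try:
                return server(request)
            except ConnectionError:
                continue  # this replica is down; move on to the next one
        raise AllServersDown("no authentication server is reachable")
```

Since the rotation index advances on every attempt, healthy servers keep sharing the load while a failed server costs only one skipped attempt per pass.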

The authentication server replicas keep a global authentication state written to etcd, which allows:

  • restarting a server that has crashed without losing context.
  • clients to perform multi-call operations across different server replicas. For example, in private key authentication, getting a nonce and completing the challenge response are two calls that depend on shared context, yet they can be served by different servers.
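
The effect of shared state can be shown with a toy model in which a Python dict stands in for etcd. Everything here is an assumption made for illustration: the `Replica` class, its method names, and the `nonce/<user>` key layout are hypothetical, and signature verification is deliberately left out so that only the context lookup is visible.

```python
import secrets


class Replica:
    """Hypothetical authentication server replica backed by a shared store."""

    def __init__(self, store):
        self._store = store  # stand-in for etcd, shared by all replicas

    def get_nonce(self, user_id):
        # First call: record the pending challenge in the shared store.
        nonce = secrets.token_hex(16)
        self._store["nonce/" + user_id] = nonce
        return nonce

    def complete_challenge(self, user_id, nonce):
        # Second call: any replica can find the context written by the first.
        # Signature verification is omitted; only the lookup is shown.
        pending = self._store.pop("nonce/" + user_id, None)
        return pending is not None and secrets.compare_digest(pending, nonce)


shared_store = {}
replica_a = Replica(shared_store)
replica_b = Replica(shared_store)

# The nonce is issued by one replica and the challenge completed on another.
nonce = replica_a.get_nonce("alice")
completed_on_other_replica = replica_b.complete_challenge("alice", nonce)

# A "restarted" replica is a new object over the same store: context survives.
nonce = replica_a.get_nonce("alice")
replica_a = Replica(shared_store)
completed_after_restart = replica_a.complete_challenge("alice", nonce)
```

A replica constructed over its own private store would reject the same challenge, which is the failure mode that the shared state avoids.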