Using Envoy as a front proxy

The Deephaven Open API and Web interface require connecting to the web_api_service and Deephaven query workers. The Swing Console requires direct connections to multiple Deephaven services in addition to the query workers.

This architecture allows Deephaven to scale to many users and machines without introducing a single process to handle all of the network traffic. However, this can be inconvenient from a network management perspective because API clients must be permitted to connect to many different hosts and ports on the query cluster.

To simplify network management, a front proxy can be exposed to client machines and route the traffic to the Web API Service, query workers, and other services. Deephaven uses Envoy (https://envoyproxy.io) as a front proxy. Envoy is a scalable open-source network proxy originally developed by Lyft which supports dynamic configuration using gRPC calls.

Because all inbound and outbound network traffic passes through Envoy, the proxy may affect performance, particularly on high-throughput systems.

To configure Envoy for use with Deephaven, select a host for installation. Appropriate installation directions for various environments can be found here.

Configuration

In order for Envoy to act as a proxy for Deephaven services it must be configured to listen for incoming traffic on a specific address and port. It then must be configured to use the Deephaven Configuration Service as a Discovery Service (called an xDS) for routes to individual Persistent Queries, Query Workers, and other services. Specifically, Deephaven exports a Cluster Discovery Service (CDS) and Route Discovery Service (RDS) to Envoy.

The Deephaven Services and each Deephaven worker are defined as clusters in the dynamic Envoy configuration. Deephaven’s RDS creates routing rules that map various paths to specific workers and other services. Any path that does not match a worker or service prefix is directed to the Web API service.

There are two ways to configure Deephaven and Envoy to work together: manually or automatically. Manual configuration is described in the sections below. Automatic configuration is covered in the Installation and upgrade guide, specifically in the DH_CONFIGURE_ENVOY section.

When configuring Envoy automatically, the installer will set needed properties in the iris-endpoints.prop file, and will also configure Envoy settings in getdown.global to direct the Client Update Service to use Envoy as well. The installer will also generate an Envoy configuration YAML file (envoy3.yaml). By default the installer will attempt to copy this file to the Deephaven node where Envoy will be running; if Envoy will be running on a separate system, the file copy step can be disabled by setting DH_ENVOY_COPY_YAML to false in the cluster.cnf file. Additionally, an alternate location for Envoy can be specified by setting the DH_ENVOY_FQDN property in cluster.cnf.

Example Envoy Configuration File

An example Envoy YAML configuration file suitable for use with Deephaven follows. Later sections explain where this file should be located and how to make it available to Envoy.

Note

For installations that do not use the installer-generated Envoy configuration YAML, this example file can be used as a basis for the configuration file.

Most of the configuration in the example above can be left alone. However, there are a few places that might require more specific configuration.

Static resources

Clusters

In the static_resources section, configure the target address and port on which the Configuration Service will be listening for xDS requests.

If the Configuration Service is running on a different node than Envoy, or if Envoy is running inside a Docker container (the recommended configuration), change this to the appropriate address. For example, on a host with address 10.128.0.123 running Envoy inside a Docker container, the address should be set to 10.128.0.123. It may be more appropriate, or simply easier, to use a DNS name. In that case, change the cluster type from STATIC to LOGICAL_DNS and replace the address with a fully qualified domain name. The port must match the Deephaven system property envoy.xds.port, which is set to 8124 by default.

Listeners

The listeners section tells Envoy which addresses and ports it should listen on for forwarding. Deephaven uses only a single address and port for these forwarding requests. In the example configuration, the address 0.0.0.0 and port 8000 indicate that Envoy should listen for connections from any interface on port 8000. Users will be able to connect to Deephaven using the URL:

https://<<myFQDN>>:8000/iriside

This port can be changed to any available port, and it is also possible to set the address to something more specific to constrain where connections are allowed from.

TLS certificates

When using TLS, Envoy needs to know which TLS certificate chain and private key file to use to complete TLS handshakes. This is configured by the common_tls_context stanza. The example configuration uses Deephaven’s default SSL truststore and private key for lighttpd, which are installed at /etc/ssl/private. Administrators should consider installing their own truststore and key for Envoy.

Administration

Administrators may want to set up the Envoy administration port to inspect the currently running Envoy configuration, which includes the currently configured routes and clusters. Be careful when enabling this port, because it allows access to internal Envoy configuration. Deephaven recommends disabling it in production configurations.

Installation via Docker

While it is possible to install Envoy by compiling from source, or using a package manager such as yum or apt, Deephaven recommends deploying Envoy as a Docker container. This makes configuration and updates simple since the Envoy project publishes pre-built Docker images.

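First, ensure that Docker is installed and running on the host. Next, download the Envoy proxy image. The container example on this page uses v1.20.1, the version certified with current versions of Deephaven: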
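A minimal sketch of the download step, assuming the image tag `envoyproxy/envoy:v1.20.1` that the container example on this page uses. The variable and function names are illustrative only and are not part of Deephaven or Envoy.

```shell
# Image tag taken from the container example on this page; adjust it to the
# Envoy version certified for your Deephaven release.
ENVOY_IMAGE="envoyproxy/envoy:v1.20.1"

# Download the Envoy image (hypothetical helper name).
pull_envoy_image() {
  docker pull "$ENVOY_IMAGE"
}
```

Call `pull_envoy_image` once on the host that will run Envoy. Depending on how Docker is installed, the command may need to be run with elevated privileges.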

Create an Envoy YAML file. The example Envoy configuration file provided earlier is valid for a typical Deephaven cluster. This example assumes that the YAML file will be placed in /etc/sysconfig/illumon.d/resources, and is called envoy3.yaml to indicate it is an Envoy version 3 file.

As described in the Clusters section above, since Envoy is being run in a Docker container, the address in the endpoint section of the xds_service cluster must be changed. The IP address will be used instead, since it is not expected to change.

Next, create the Docker container. The container is given a name so that it can be reused each time Envoy needs to start.

  • The YAML file created above will be used.
  • The Client Update Service's pemfile will be used for the certificate (it is recommended to create a different one for a production installation).

This command configures a few things for the container:

  • -d -> Run the container in detached mode (as a background process).
  • -p 8000:8000 -> Map internal port 8000 to host port 8000.
  • -p 8001:8001 -> Map internal port 8001 to host port 8001 (omit this if the admin port will not be used).
  • -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml -> Mount the host’s /etc/sysconfig/illumon.d/resources/envoy3.yaml file as /config.yaml within the container so that the Envoy instance can see it.
  • -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem -> Mount the TLS key file into the container.
  • -u 9002 -> Run as irisadmin. Note that the irisadmin user ID may be different. Run id -u irisadmin to double-check the user ID. If the system is running the core infrastructure processes such as the Persistent Query Controller as a different user, that user's ID should be used instead. The key consideration is that the Envoy process must have rights to read the configuration YAML and certificate files.
  • --name deephaven_envoy -> Name the container.
  • envoyproxy/envoy:v1.20.1 -c /config.yaml -> Launch the Envoy 1.20.1 container using the mounted config.yaml file as the config source.
  • --add-host=hostname.company.com:1.2.3.4 -> Add an entry to the Docker container’s /etc/hosts file. This is useful when DNS resolution is not configured inside the Docker image, but a host name is used in the configuration YAML. This example does not do this.
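The options listed above can be assembled into a single command. The following is a sketch rather than the definitive command: the paths, ports, user ID, container name, and image tag come from the list, while the variable names and the `create_envoy_container` wrapper are assumptions made for illustration.

```shell
# Values taken from the option list above; names are illustrative assumptions.
ENVOY_CONFIG="/etc/sysconfig/illumon.d/resources/envoy3.yaml"
ENVOY_PEM="/etc/sysconfig/illumon.d/client_update_service/lighttpd.pem"
ENVOY_UID="9002"   # confirm with: id -u irisadmin

# Create and start the named Envoy container in the background.
create_envoy_container() {
  docker run -d \
    -p 8000:8000 \
    -p 8001:8001 \
    -v "$ENVOY_CONFIG":/config.yaml \
    -v "$ENVOY_PEM":/lighttpd.pem \
    -u "$ENVOY_UID" \
    --name deephaven_envoy \
    envoyproxy/envoy:v1.20.1 -c /config.yaml
}
```

Remove the `-p 8001:8001` line if the admin port is not used. The `--add-host` option is left out here, matching the note in the list.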

Running Docker containers, including the Envoy container, can be displayed with:

If the container does not start, run it interactively to see the output. Replace -d with -it.

To delete the container (for example, in order to recreate it), use the following command:

Logs for the container can be displayed with the following command. Either the container ID or the name can be used. The second example shows the logs on or after the specified date.

The following command stops the container.
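The container management steps above (list, delete, view logs, stop) can be collected into small helper functions. This is a sketch that assumes the container name `deephaven_envoy` from the run options; the function names are hypothetical, and the date passed to `--since` is only an example value.

```shell
ENVOY_CONTAINER="deephaven_envoy"   # container name from the run options

# List running containers, including the Envoy container.
envoy_ps() { docker ps; }

# Show all logs, or only logs on or after a given date (e.g. 2021-11-01).
envoy_logs() {
  if [ -n "${1:-}" ]; then
    docker logs --since "$1" "$ENVOY_CONTAINER"
  else
    docker logs "$ENVOY_CONTAINER"
  fi
}

# Stop the running container.
envoy_stop() { docker stop "$ENVOY_CONTAINER"; }

# Delete the (stopped) container so that it can be recreated.
envoy_rm() { docker rm "$ENVOY_CONTAINER"; }
```

For example, `envoy_logs 2021-11-01` prints only log lines from that date onward, and `envoy_stop` followed by `envoy_rm` clears the way for recreating the container.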

Configuring Envoy as a system service

To run using systemd, create the file /etc/systemd/system/envoy.service.

The following file contents assume that Envoy is running in a Docker container, as illustrated above. The command is the /usr/bin/docker run command, but it must not include sudo.
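One possible shape for such a unit file is sketched below as a helper that prints the file contents. Several details are assumptions rather than Deephaven's published file: the unit description, the dependency on docker.service, the restart policy, running without `-d` so that systemd supervises the foreground process, and `--rm` so that the container name is free on the next start. The `envoy_unit_file` name is hypothetical.

```shell
# Print a candidate unit file to stdout (hypothetical helper name).
envoy_unit_file() {
  cat <<'EOF'
[Unit]
Description=Envoy front proxy for Deephaven (Docker container)
Requires=docker.service
After=docker.service

[Service]
Restart=on-failure
# Assumption: foreground run (no -d) with --rm, so systemd supervises the
# process and the container name can be reused on the next start.
ExecStart=/usr/bin/docker run --rm -p 8000:8000 -p 8001:8001 \
  -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml \
  -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem \
  -u 9002 --name deephaven_envoy envoyproxy/envoy:v1.20.1 -c /config.yaml
ExecStop=/usr/bin/docker stop deephaven_envoy

[Install]
WantedBy=multi-user.target
EOF
}
```

To install it, the output can be piped into place, for example with `envoy_unit_file | sudo tee /etc/systemd/system/envoy.service`.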

After editing the file, reload the daemon process:

To start Envoy automatically at system startup, enable the service:

Start Envoy with:

View the systemd logs with:

Stop Envoy with:
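The systemd steps above (reload, enable, start, view logs, stop) can be sketched as the following helpers. They assume the unit is named `envoy`, matching /etc/systemd/system/envoy.service, and that the commands are run through sudo; the function names are illustrative.

```shell
ENVOY_UNIT="envoy"   # matches /etc/systemd/system/envoy.service

# Re-read unit files after creating or editing envoy.service.
envoy_reload_units() { sudo systemctl daemon-reload; }

# Start Envoy automatically at system startup.
envoy_enable() { sudo systemctl enable "$ENVOY_UNIT"; }

# Start the service now.
envoy_start() { sudo systemctl start "$ENVOY_UNIT"; }

# View the systemd logs for the service.
envoy_journal() { sudo journalctl -u "$ENVOY_UNIT"; }

# Stop the service.
envoy_stop_service() { sudo systemctl stop "$ENVOY_UNIT"; }
```

A typical first-time sequence is `envoy_reload_units`, `envoy_enable`, then `envoy_start`.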

Deephaven configuration

This section provides manual instructions in case automated (DH_CONFIGURE_ENVOY=true) installation is not being used. It describes how to configure Deephaven services to interact with Envoy.

This requires a number of additions to Deephaven properties. Changes are needed to both the iris-environment and iris-endpoints properties files, as well as the getdown.global configuration file.

Note

There is no longer an option to enable Envoy only for the Deephaven Web UI. Once Envoy is enabled for Deephaven, all services must be configured to use Envoy, or the Configuration Server service will fail to start.

Update iris-environment.prop

The changes below are needed for the file iris-environment.prop. These are the commands to export, edit, and reimport this file:

To enable Envoy, set the following properties at the global scope (add them to the property file). Replace the URL with the fully qualified URL on which Envoy is listening.

  • The global.websocket.server.enabled property is used by a number of services to decide whether to start a websocket listener that Envoy will use. When set at the global scope, all services that are capable will be fronted by Envoy.
  • The envoy.front.proxy.url property indicates the host and port on which Envoy is listening for connections.
  • By default, the configuration service listens for Envoy xDS requests on port 8124. Change this port by setting the envoy.xds.port property. Deephaven also assumes Envoy has been configured to listen for HTTP requests with SSL enabled (and strongly recommends this configuration, as the connection is used for authentication). If SSL must be disabled, set the envoy.terminate.ssl property to false.
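The first two properties in the list above can be sketched as a small generator that prints the lines to add at the global scope. The property names come from the list; the `host:port` value format for envoy.front.proxy.url, the example host name in the usage note, and the `envoy_global_props` helper name are assumptions.

```shell
# Print the global-scope properties that enable Envoy (hypothetical helper).
# $1 = fully qualified Envoy host name, $2 = Envoy listener port (default 8000).
envoy_global_props() {
  host="$1"
  port="${2:-8000}"
  cat <<EOF
global.websocket.server.enabled=true
envoy.front.proxy.url=${host}:${port}
EOF
}
```

For example, `envoy_global_props deephaven-host.company.com` prints the two lines for an Envoy listener on port 8000. The envoy.xds.port and envoy.terminate.ssl properties only need to be added when the defaults described above are not suitable.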

By default, Envoy closes websockets after 5 minutes of inactivity, and other proxies in the network path can also close idle connections. To prevent this, Deephaven sends a heartbeat to keep each connection alive. Changing the heartbeat frequency is not normally necessary, but it can be done with the following property.

The following properties need to be available to the Configuration Server, because it runs the Envoy discovery service and needs to know how to route traffic to all eligible services.

  • Webapi.server.port is already set and does not need to be changed.
  • Webapi.server.host is probably not set and needs to be added; add it to the configuration file currently being edited. Unless Envoy is running on a different server than the web server, this will be the same URL as was added to envoy.front.proxy.url, but without the port. This can be a global property.

Tell the Swing Console (and other client programs) to use Envoy. These properties should be added to the [service.name=iris_console|interactive_console] stanza in the property file.

Assign a websocket port to services that will be fronted by Envoy. These ports need to be unique per host. For example, the same port may be used for the Auth Server on each host where the service runs, but different ports must be assigned to each Query Dispatcher on a given host. The ports suggested here will work with most default installations.

Many of these properties must be visible to the Configuration Server in order to serve the Envoy Discovery Service, as well as to the target services, so the stanza definitions below include both the configuration service and the target service. Add each of the properties to its stanza. If the stanzas do not exist then add the stanza and the property.

Make sure that the client_update_service.host and configuration.server.host properties are updated to the fully qualified host name for the server where those processes run.

Each Query Dispatcher that will be available via Envoy will need a websocket port. Each process on a given host needs a unique port. These services appear in stanzas like the following. Each place where RemoteQueryDispatcherParameters.queryPort is set, also set RemoteQueryDispatcherParameters.websocket.port. Add the following properties (with stanzas) to the property file.

  • When running multiple dispatchers on a single node, each dispatcher will need its own port. This is typically the case for single-node installations, since both the query server and merge server run on the same node.
  • If there are multiple query servers but each one is on a different server, each query server can use the same port. This is typically the case for a large-scale installation.
  • The example shown here uses different ports for the merge server and the query servers, and should work for both the above cases.

Update iris-endpoints.prop

Services that need to know about all the dispatchers at once (specifically the Persistent Query Controller) use a different mechanism than the above stanzas; all the information is in that process's stanza (see Persistent Query Controller for further details). The Configuration Service now needs the same information, and the websocket port needs to be added.

Note

The iris-endpoints.prop file is regenerated each time the Deephaven installer is run (for example, when adding nodes to the cluster or upgrading to a newer Deephaven version). Any changes to this file must be reapplied manually after such a reconfiguration or upgrade. Alternatively, follow the instructions further down in this section to apply changes to the iris-environment.prop file, which is not regenerated.

Option One - Update iris-endpoints.prop

There will be a section like the following in iris-endpoints.prop which defines all the remote query dispatchers available to the controller. The RemoteQueryDispatcherParameters.websocket.port property active in a given scope will be the default value for all iris.db.n.websocket.port settings. Only those that are different need to be specified. This usually means that the merge servers will need this property, but the query servers will not.

The changes below are needed for the file iris-endpoints.prop. These are the commands to export, edit, and reimport this file:

In iris-endpoints.prop, add configuration_server to the stanza that includes iris_controller and controller_tool. For merge servers add their websocket port to their iris.db properties. The following example shows a typical configuration for a three-node cluster:

Option Two - Update iris-environment.prop

To make the websocket property value(s) available to the Configuration Server through iris-environment.prop, add a stanza like this:

Note that this stanza contains only the one property that was added to the larger block in the iris-endpoints.prop option. If the system has multiple merge servers, each needs a similar entry in this stanza. If the number of servers later changes, update this stanza in iris-environment.prop manually so that it still has a correct websocket port entry for each merge server, keyed by server number.

For example, a stanza for a system with two merge servers and three query servers might look like this:

Set up the client update service to tell swing clients to use Envoy

  • Clients cannot yet get properties files via Envoy. Add the following line to /etc/sysconfig/illumon.d/client_update_service/getdown.global so that the Swing client will read properties files from disk. This ensures that all users who log in to this instance will be able to correctly read property files without making additional changes.
    jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional
  • Update the existing appbase line to have the appropriate port and address, typically ending in :8000/iris/
  • Update the existing ui.install_error line to have the appropriate port and address, typically ending in :8000/cus/error.html

Edit getdown.global. Note that this file is not accessed by the configuration service so it does not need to be exported and imported.

For example:

Edit getdown.port and change it to port 8000.

Restart all services on all nodes.

Possible additional settings

There are several places where message sizes can become larger than the default configured maximum for a websocket. If this happens, a message will be logged in the process's log file that looks like the following. The message indicates the current maximum and the actual larger size.

A workaround is to increase the limit using the property websocket.client.max.message.size. This applies to all client Envoy connections and becomes the new default for client connections. It applies to all web servers expecting Envoy connections and to any other clients, such as Swing consoles. It is usually best to apply this at the global level rather than within a stanza. If a serviceId is appended to the property name, the new limit applies only to that service.

The server message size defaults to the client size if not set, but can be changed individually if needed. Like websocket.client.max.message.size, a serviceId can be appended to limit the value to that service.

Extra Envoy xDS routes

Additional routes can also be added to the Envoy configuration to expose other services through the proxy.

For each route to be added, create a set of properties that define the extra route. These properties must be visible to the configuration server (either global or in a stanza visible to the configuration server). After making changes, the configuration_server must be restarted to make the changes visible.

This example route is named dis and will redirect requests for https://envoy-server.company-name.com:<envoy_port>/dis/anything to http://infra-1.company-name.com:8086/anything:

The properties defining an Envoy route are:

  • envoy.xds.extra.routes.<name>.host
    • Required. The name or address the Envoy service needs to reach the destination service. This could be different from the host that is used from outside the network.
  • envoy.xds.extra.routes.<name>.port
    • Optional. The port of the destination service. Defaults to 443 when TLS (Transport Layer Security) is enabled and 80 otherwise.
  • envoy.xds.extra.routes.<name>.prefix
    • Required. Paths beginning with this prefix will be routed to the service at host:port. Do not use "/", "/worker", "/cus", "/comm", "/iris", or other prefixes that create conflicts or ambiguity with Deephaven mappings.
  • envoy.xds.extra.routes.<name>.prefixRewrite
    • Optional. Defaults to /. This value replaces the matched prefix of the requested URL before the request is forwarded to the destination service. In the example above, a user would reach the DIS's internal server by requesting a URL such as https://envoy-server.company-name.com:8000/dis/config. That server does not know about the /dis part of the path, so Envoy strips the prefix and sends http://infra-1.company-name.com:8086/config to the destination (note the changed path and port).
  • envoy.xds.extra.routes.<name>.tls
    • Optional. Indicates whether the destination service uses TLS/HTTPS. This is independent of whether Envoy is using TLS. Defaults to true.
  • envoy.xds.extra.routes.<name>.exactPrefix
    • Optional. If true, then the prefix and prefixRewrite values will be used exactly as specified in the Envoy route configuration. If false (the default), extra routes will be added both with and without a terminating /. This produces the desired results with rare exceptions.
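As an illustration of how the properties above fit together for the dis route, the following sketch prints one property line per setting. The property names, host, and port come from this section; the prefix value `/dis`, the choice of `tls=false` (inferred from the plain http destination URL), and the `extra_route_props` helper name are assumptions.

```shell
# Print the properties for one extra Envoy route (hypothetical helper).
# $1 = route name, $2 = destination host, $3 = destination port,
# $4 = path prefix, $5 = whether the destination uses TLS (true/false).
extra_route_props() {
  base="envoy.xds.extra.routes.$1"
  cat <<EOF
${base}.host=$2
${base}.port=$3
${base}.prefix=$4
${base}.tls=$5
EOF
}

# The "dis" example: /dis/... on Envoy is forwarded to plain HTTP on port 8086.
extra_route_props dis infra-1.company-name.com 8086 /dis false
```

The prefixRewrite and exactPrefix properties are omitted so that their defaults (/ and false) apply. The printed lines would go in the global scope or in a stanza visible to the configuration server.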

Appendix

Tested versions of Envoy

Deephaven has tested compatibility with Envoy versions 1.19.0 and 1.20.1. Only 1.20.1 is certified with current versions of Deephaven.