Using Envoy as a front proxy

The Deephaven Open API and Web interface require connecting to the web_api_service and Deephaven query workers. The Swing Console requires direct connections to multiple Deephaven services in addition to the query workers.

This architecture allows Deephaven to scale to many users and machines without introducing a single process to handle all of the network traffic. However, this can be inconvenient from a network management perspective because API clients must be permitted to connect to many different hosts and ports on the query cluster.

To simplify network management, a front proxy can be exposed to client machines and route the traffic to the Web API Service, query workers, and other services. Deephaven uses Envoy (http://envoyproxy.io) as a front proxy. Envoy is a scalable open-source network proxy originally developed by Lyft which supports dynamic configuration using gRPC calls.

Because all inbound and outbound network traffic is routed through Envoy, the proxy may affect performance, particularly on high-throughput systems.

To configure Envoy for use with Deephaven, select a host for installation. Appropriate installation directions for various environments can be found at https://www.getenvoy.io/.

Configuration

In order for Envoy to act as a proxy for Deephaven services it must be configured to listen for incoming traffic on a specific address and port. It then must be configured to use the Deephaven Configuration Service as a Discovery Service (called an xDS) for routes to individual Persistent Queries, Query Workers, and other services. Specifically, Deephaven exports a Cluster Discovery Service (CDS) and Route Discovery Service (RDS) to Envoy.

The Deephaven Services and each Deephaven worker are defined as clusters in the dynamic Envoy configuration. Deephaven’s RDS creates routing rules that map various paths to specific workers and other services. Any path that does not match a worker or service prefix is directed to the Web API service.

There are two ways Deephaven and Envoy can be configured to work together: manually or automatically. Manual configuration is detailed in the sections below. Automatic configuration is largely covered in the Installation and upgrade guide, specifically in the paragraph on DH_CONFIGURE_ENVOY.

When configuring Envoy automatically, the installer will set needed properties in the iris-endpoints.prop file, and will also configure Envoy settings in getdown.global to direct the Client Update Service to use Envoy as well. The installer will also generate an Envoy configuration YAML file (envoy3.yaml). By default the installer will attempt to copy this file to the Deephaven node where Envoy will be running; if Envoy will be running on a separate system, the file copy step can be disabled by setting DH_ENVOY_COPY_YAML to false in the cluster.cnf file. Additionally, an alternate location for Envoy can be specified by setting the DH_ENVOY_FQDN property in cluster.cnf.

Example Envoy Configuration File

An example Envoy YAML configuration file suitable for use with Deephaven follows. Later sections explain where this file should be located and how to make it available to Envoy.

node: { id: 'envoynode', cluster: 'envoycluster' }

# This section tells Envoy that there is a dynamic cluster discovery service 'xds_service'
# that is communicating via GRPC using the V3 API and data structures.
dynamic_resources:
  cds_config:
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        envoy_grpc: { cluster_name: xds_service }

# This section tells envoy what servers are available to load balance to.  In the case of deephaven
# there is a single controller listening on port 8124 (or whatever it is configured to)

# This configuration assumes the web_api_service is running on the same host as Envoy.
# If envoy is instead running within a docker container, or another host the address should
# be updated.
static_resources:
  clusters:
    - name: xds_service
      connect_timeout: 0.25s
      type: static
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          '@type': type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: 'xds_service'
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      # Update these values to match configuration server host and xds port
                      address: <enter-ip-here>
                      port_value: 8124

  # This section tells envoy to listen for incoming connections on port 8000 from anywhere
  # Upgrades them to websocket style connections and discover routes via a V3 GRPC interface
  listeners:
    - address:
        socket_address:
          # This address and port is the port to which clients will connect.
          # 0.0.0.0 indicates that Envoy should listen for -all- connections on port 8000
          # from any interface
          address: 0.0.0.0
          port_value: 8000
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              typed_config:
                '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: AUTO
                stat_prefix: egress_http
                upgrade_configs:
                  - upgrade_type: websocket
                rds:
                  route_config_name: rds_config
                  config_source:
                    resource_api_version: V3
                    api_config_source:
                      api_type: GRPC
                      transport_api_version: V3
                      grpc_services:
                        envoy_grpc:
                          cluster_name: 'xds_service'
                access_log:
                  - name: envoy.file_access_log
                    typed_config:
                      '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                      path: '/tmp/envoy-rds.log'
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: '/lighttpd.pem'
                    private_key:
                      filename: '/lighttpd.pem'
admin:
  access_log_path: '/tmp/envoy.log'
  # Setting these will allow a sysadmin to dump Envoy configurations from localhost of the envoy
  # host.  It is recommended to disable this in production, as the admin interface can be used to
  # modify the running configuration
  # address:
  #   socket_address:
  #     address: 127.0.0.1
  #     port_value: 8001

Note

For installations that do not use the installer-generated Envoy configuration YAML, this example file can be used as a basis for the configuration file.

Most of the configuration in the example above can be left alone. However, there are a few places that might require more specific configuration.

Static resources

Clusters

In the static_resources section, configure the target address and port on which the Configuration Service will be listening for xDS requests.

If the Configuration Service is running on a different node than Envoy, or if Envoy is running inside a Docker container (as per the recommended configuration), change this to the appropriate address. For example, on a host with address 10.128.0.123 running Envoy inside a Docker container, the address should be set to 10.128.0.123. It may be more appropriate, or simply easier, to use a DNS name. In that case, change the cluster type from STATIC to LOGICAL_DNS and replace the address with a fully qualified domain name. The port must match the Deephaven system property envoy.xds.port, which is 8124 by default.

Listeners

The listeners section tells Envoy which addresses and ports to listen on for traffic to forward. Deephaven uses only a single address and port for this traffic. In the example configuration, the address 0.0.0.0 and port 8000 indicate that Envoy should listen for connections on every interface on port 8000. Users will be able to connect to Deephaven using the URL:

https://<<myFQDN>>:8000/iriside

This port can be changed to any available port. The address can also be set to a specific interface address to restrict which interface accepts connections.

TLS certificates

When using TLS, Envoy needs to be aware of the TLS Certificate chain and private key file to be used to complete the TLS handshakes. This is configured by the common_tls_context stanza. The example configuration uses Deephaven’s default SSL truststore and private key for lighttpd which is installed at /etc/ssl/private. Administrators should consider installing their own trust store and key for Envoy.

Administration

Administrators may like to set up the Envoy administration port to be able to inspect the currently running Envoy configuration, which includes the currently configured routes and clusters. Be careful when this is enabled as it allows access to internal Envoy configuration. Deephaven recommends this be disabled in production configurations.

Installation via Docker

While it is possible to install Envoy by compiling from source, or using a package manager such as yum or apt, Deephaven recommends deploying Envoy as a Docker container. This makes configuration and updates simple since the Envoy project publishes pre-built Docker images.

First, ensure that Docker is installed and running on the host. Next, download the Envoy proxy image (version 1.20.1 in this example):

sudo docker pull envoyproxy/envoy:v1.20.1

Create an Envoy YAML file. The example Envoy configuration file provided earlier is valid for a typical Deephaven cluster. This example assumes that the YAML file is placed in /etc/sysconfig/illumon.d/resources and is named envoy3.yaml to indicate that it uses version 3 of the Envoy API.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/resources/envoy3.yaml

As described in the Clusters section above, since Envoy is being run in a Docker container, the address in the endpoint section of the xds_service cluster must be changed. The IP address will be used instead, since it is not expected to change.

Next, create the Docker container. This container will be named and reused each time Envoy needs to start.

  • The YAML file created above will be used.
  • The Client Update Service's pemfile will be used for the certificate (it is recommended to create a different one for a production installation).

sudo /usr/bin/docker run -d \
   -p 8000:8000 \
   -p 8001:8001 \
   -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml \
   -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem \
   -u 9002 \
   --name deephaven_envoy \
   envoyproxy/envoy:v1.20.1 -c /config.yaml

This command configures a few things for the container:

  • -d -> Run the container in detached mode (as a background process).
  • -p 8000:8000 -> map internal port 8000 to host port 8000.
  • -p 8001:8001 -> map internal port 8001 to host port 8001 (this is omitted if the admin port will not be used).
  • -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml -> Mount the host’s /etc/sysconfig/illumon.d/resources/envoy3.yaml file as /config.yaml within the container so that the Envoy instance can see it.
  • -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem -> Mount the TLS key file into the container.
  • -u 9002 -> Run as irisadmin. Note that the irisadmin user ID may be different. Run id -u irisadmin to double-check the user ID. If the system is running the core infrastructure processes such as the Persistent Query Controller as a different user, that user's ID should be used instead. The key consideration is that the Envoy process must have rights to read the configuration YAML and certificate files.
  • --name deephaven_envoy -> names the container.
  • envoyproxy/envoy:v1.20.1 -c /config.yaml -> Launch the Envoy 1.20.1 container using the mounted config.yaml file as the config source.
  • --add-host=hostname.company.com:1.2.3.4 -> adds an entry to the docker container’s /etc/hosts file. This is useful when DNS resolution is not configured inside the docker image, but a host name is used in the configuration YAML. This example does not do this.

Running Docker containers, including the Envoy container, can be displayed with:

sudo docker container ls

If the container does not start, run it interactively to see the output. Replace -d with -it.

To delete the container (for example, in order to recreate it), use the following command:

sudo docker rm deephaven_envoy

Logs for the container can be displayed with the following command. Either the container ID or the name can be used. The second example shows the logs on or after the specified date.

sudo docker logs deephaven_envoy
sudo docker logs deephaven_envoy --since 2021-11-18

The following command stops the container.

sudo docker container stop deephaven_envoy

Configuring Envoy as a system service

To run using systemd, create the file /etc/systemd/system/envoy.service.

sudo vi /etc/systemd/system/envoy.service

The following file contents assume that Envoy is run in a Docker container, as illustrated above, and that the deephaven_envoy container has already been created. The service restarts that existing container with /usr/bin/docker restart; the commands must not include sudo.

# /etc/systemd/system/envoy.service
[Unit]
Description=Envoy Proxy
Documentation=https://www.envoyproxy.io/
After=network-online.target
[Service]
User=root
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/docker restart deephaven_envoy
ExecStop=/usr/bin/docker container stop deephaven_envoy
[Install]
WantedBy=multi-user.target

After editing the file, reload the daemon process:

sudo systemctl daemon-reload

Further configure Envoy to automatically start up at system startup:

sudo systemctl enable envoy

Start Envoy with:

sudo systemctl start envoy

View the service status and its most recent log lines with:

sudo systemctl status envoy.service

Stop Envoy with:

sudo systemctl stop envoy

Deephaven configuration

This section provides manual instructions in case automated (DH_CONFIGURE_ENVOY=true) installation is not being used. It describes how to configure Deephaven services to interact with Envoy.

This requires a number of additions to Deephaven properties. Changes are needed to both the iris-environment and iris-endpoints properties files, as well as the getdown.global configuration file.

Note

There is no longer an option to enable Envoy only for the Deephaven Web UI. Once Envoy is enabled for Deephaven, all services must be configured to use Envoy, or the Configuration Server service will fail to start.

Update iris-environment.prop

The changes below are needed for the file iris-environment.prop. These are the commands to export, edit, and reimport this file:

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-environment.prop -d /tmp/

sudo -u irisadmin vi /tmp/iris-environment.prop

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-environment.prop -d /tmp/

To enable Envoy, add the following properties to the property file at the global scope. Replace the example value of envoy.front.proxy.url with the fully qualified host name and port on which Envoy listens.

global.websocket.server.enabled=true
envoy.front.proxy.url=user-facing-host.company.com:8000
envoy.terminate.ssl=true

  • The global.websocket.server.enabled property is used by a number of services to decide whether to start a websocket listener that Envoy will use. When set at the global scope, all services that are capable will be fronted by Envoy.
  • The envoy.front.proxy.url property indicates the host and port on which Envoy is listening for connections.
  • By default, the configuration service listens for Envoy xDS requests on port 8124. Change this port by setting the envoy.xds.port property. Deephaven also assumes Envoy has been configured to listen for HTTP requests with SSL enabled (and strongly recommends this configuration, as the connection is used for authentication). If SSL must be disabled, set the envoy.terminate.ssl property to false.

Envoy will close websockets after 5 minutes of inactivity (by default). Other proxies in the network path can also close the connections. To prevent this, Deephaven sends a heartbeat to keep each connection alive. Changing the heartbeat frequency is not normally needed, but it can be set with the following property if necessary.

envoy.front.proxy.keepalive.ping.ms=60000
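
The heartbeat only helps if it fires more often than the shortest idle timeout on the path. The following is a minimal sketch of that check, assuming the property value is in milliseconds (suggested by the .ms suffix) and using the 5 minute Envoy default mentioned above. The function name is hypothetical and not part of Deephaven.

```python
def keepalive_is_effective(ping_ms: int, idle_timeout_ms: int = 5 * 60 * 1000) -> bool:
    """Return True if a ping sent every ping_ms arrives before the idle timeout expires.

    idle_timeout_ms defaults to Envoy's 5 minute websocket idle timeout; other
    proxies on the path may use shorter values (assumption: caller supplies them).
    """
    return 0 < ping_ms < idle_timeout_ms


print(keepalive_is_effective(60000))    # value shown above, against the 5 minute default
print(keepalive_is_effective(600000))   # a ping every 10 minutes would be too slow
```

If another proxy in the path has a shorter idle timeout, pass that value as idle_timeout_ms instead.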

The following properties need to be available to the Configuration Server, because it runs the Envoy discovery service and needs to know how to route traffic to all eligible services.

  • Webapi.server.port is already set and does not need to be changed.
  • Webapi.server.host is probably not set and needs to be added to the configuration file currently being edited. Unless Envoy is running on a different server than the web server, this is the same host name that was used in envoy.front.proxy.url, without the port. This can be a global property.

Webapi.server.host=web-api-host.company.com

Tell the Swing Console (and other client programs) to use Envoy. These properties should be added to the [service.name=iris_console|interactive_console] stanza in the property file.

    global.websocket.client.enabled=true
    WAuthenticationClientManager.defaultClientFactoryClass=com.illumon.iris.auth.WebsocketAuthenticationClient

Assign a websocket port to services that will be fronted by Envoy. These ports need to be unique per host. For example, the same port may be used for the Auth Server on each host where the service runs, but different ports must be assigned to each Query Dispatcher on a given host. The ports suggested here will work with most default installations.

Many of these properties must be visible to the Configuration Server in order to serve the Envoy Discovery Service, as well as to the target services, so the stanza definitions below include both the configuration service and the target service. Add each of the properties to its stanza. If the stanzas do not exist then add the stanza and the property.

Make sure that the client_update_service.host and configuration.server.host properties are updated to the fully qualified host name for the server where those processes run.

# auth server
[service.name=authentication_server|configuration_server] {
    AuthenticationServer.websocket.port=22050
}

# controller
[service.name=iris_controller|configuration_server] {
    PersistentQueryController.websocket.port=22051
}

# dbacl writer
[service.name=db_acl_write_server|iris_db_user_mod|configuration_server] {
    dbaclwriter.websocket.port=22053
}

# client update service and configuration server
[service.name=configuration_server] {
    client_update_service.host=cus-node.company.com
    client_update_service.port=8443
    # Note: the configuration server itself does not yet support websocket communication
    configuration.server.host=config-server-node.company.com
    configuration.server.websocket.port=22054
}

Each Query Dispatcher that will be available via Envoy will need a websocket port. Each process on a given host needs a unique port. These services appear in stanzas like the following. Each place where RemoteQueryDispatcherParameters.queryPort is set, also set RemoteQueryDispatcherParameters.websocket.port. Add the following properties (with stanzas) to the property file.

  • When running multiple dispatchers on a single node, each dispatcher will need its own port. This is typically the case for single-node installations, since both the query server and merge server run on the same node.
  • If there are multiple query servers but each one is on a different server, each query server can use the same port. This is typically the case for a large-scale installation.
  • The example shown here uses different ports for the merge server and the query servers, and should work in both of the above cases.

# set the default websocket port for query servers
RemoteQueryDispatcherParameters.websocket.port=22052

# set a different default websocket port for all merge servers
[service.name=dbmerge|db_dis_merge|tailer1_merge] {
    RemoteQueryDispatcherParameters.websocket.port=22060
}
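
The rule that websocket ports must be unique per host can be checked before the properties are imported. The sketch below is illustrative only: the host name and service labels are hypothetical, and the port numbers are the suggested values from this page. It groups planned assignments by host and port and reports any pair claimed by more than one service.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_port_conflicts(
    assignments: List[Tuple[str, str, int]]
) -> Dict[Tuple[str, int], List[str]]:
    """Return {(host, port): [services]} for every host/port used more than once."""
    by_host_port = defaultdict(list)
    for host, service, port in assignments:
        by_host_port[(host, port)].append(service)
    return {key: services for key, services in by_host_port.items() if len(services) > 1}


# Hypothetical single-node layout using the ports suggested on this page.
assignments = [
    ("node-1.company.com", "authentication_server", 22050),
    ("node-1.company.com", "iris_controller", 22051),
    ("node-1.company.com", "query dispatcher", 22052),
    ("node-1.company.com", "merge dispatcher", 22060),
]
print(find_port_conflicts(assignments))

# A second dispatcher reusing 22052 on the same host is reported as a conflict.
clashing = assignments + [("node-1.company.com", "second query dispatcher", 22052)]
print(find_port_conflicts(clashing))
```

The same port on two different hosts is not reported, which matches the multi-node case described above.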

Update iris-endpoints.prop

Services that need to know about all the dispatchers at once (specifically the Persistent Query Controller) use a different mechanism than the above stanzas; all the information is in that process's stanza (see Persistent Query Controller for further details). The Configuration Service now needs the same information, and the websocket port needs to be added.

Note

The iris-endpoints.prop file is regenerated each time the Deephaven installer is run - e.g. when adding nodes to the cluster, or upgrading to a newer Deephaven version. Any changes to this file must be reapplied manually after such a reconfiguration or upgrade. Alternatively, follow the instructions further down in this section to apply changes to the iris-environment.prop file, which is not regenerated.

Option One - Update iris-endpoints.prop

There will be a section like the following in iris-endpoints.prop which defines all the remote query dispatchers available to the controller. The RemoteQueryDispatcherParameters.websocket.port property active in a given scope will be the default value for all iris.db.n.websocket.port settings. Only those that are different need to be specified. This usually means that the merge servers will need this property, but the query servers will not.

The changes below are needed for the file iris-endpoints.prop. These are the commands to export, edit, and reimport this file:

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-endpoints.prop -d /tmp/

sudo -u irisadmin vi /tmp/iris-endpoints.prop

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-endpoints.prop -d /tmp/

In iris-endpoints.prop, add configuration_server to the stanza that includes iris_controller and controller_tool. For merge servers add their websocket port to their iris.db properties. The following example shows a typical configuration for a three-node cluster:

[service.name=iris_controller|controller_tool|configuration_server] {
    iris.db.nservers=3

    iris.db.1.host=query-1.company.com
    iris.db.1.classPushList=
    iris.db.1.class=Query

    iris.db.2.host=query-2.company.com
    iris.db.2.classPushList=
    iris.db.2.class=Query

    iris.db.3.host=infra-1.company.com
    iris.db.3.classPushList=
    iris.db.3.port=30002
    iris.db.3.class=Merge
    iris.db.3.websocket.port=22060
}
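
The defaulting rule described above (the RemoteQueryDispatcherParameters.websocket.port value applies unless an iris.db.n.websocket.port override is present) can be sketched as follows. This is an illustration of the rule as stated on this page, not Deephaven's implementation; the function name and the use of a plain dictionary for properties are assumptions.

```python
from typing import Dict


def effective_websocket_ports(props: Dict[str, str], default_port: int) -> Dict[int, int]:
    """Map each server number to its websocket port, applying the default where unset."""
    count = int(props["iris.db.nservers"])
    ports = {}
    for n in range(1, count + 1):
        override = props.get(f"iris.db.{n}.websocket.port")
        ports[n] = int(override) if override is not None else default_port
    return ports


# Values taken from the three-node example: only the merge server has an override.
props = {
    "iris.db.nservers": "3",
    "iris.db.3.websocket.port": "22060",
}
print(effective_websocket_ports(props, default_port=22052))
```

With the example values, servers 1 and 2 fall back to the default query port and server 3 uses the merge port.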

Option Two - Update iris-environment.prop

To make the websocket property value(s) available to the Configuration Server through iris-environment.prop, add a stanza like this:

[service.name=configuration_server] {
    iris.db.3.websocket.port=22060
}

Note that this presents just the one property that was added to the larger block in the iris-endpoints.prop option. If there are multiple merge servers in the system, each needs a similar property entry in this stanza. Further, if the number of servers in the system later changes, this stanza in iris-environment.prop must be updated manually so that it still contains the correct websocket port entry for each merge server, keyed by server number.

For example, a stanza for a system with two merge servers and three query servers might look like this:

[service.name=configuration_server] {
    iris.db.4.websocket.port=22060
    iris.db.5.websocket.port=22060
}
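
Because these entries are keyed by server number, they are easy to get wrong after the cluster changes. The following sketch generates the lines for the stanza from an ordered list of server classes. It assumes the list is in the same order as the iris.db.N.class entries in iris-endpoints.prop; the function name is hypothetical.

```python
from typing import List


def merge_websocket_properties(server_classes: List[str], merge_port: int = 22060) -> List[str]:
    """Return one iris.db.N.websocket.port line for each server whose class is Merge."""
    lines = []
    for index, server_class in enumerate(server_classes, start=1):
        if server_class == "Merge":
            lines.append(f"iris.db.{index}.websocket.port={merge_port}")
    return lines


# Three query servers followed by two merge servers, as in the example above.
classes = ["Query", "Query", "Query", "Merge", "Merge"]
for line in merge_websocket_properties(classes):
    print(line)
```

Re-running this against the current server list after adding or removing nodes gives the entries to place in the configuration_server stanza.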

Set up the client update service to tell swing clients to use Envoy

  • Clients cannot yet get properties files via Envoy. Add the following line to /etc/sysconfig/illumon.d/client_update_service/getdown.global so that the Swing client will read properties files from disk. This ensures that all users who log in to this instance will be able to correctly read property files without making additional changes:
    jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional
  • Update the existing appbase line to have the appropriate port and address, typically ending in :8000/iris/
  • Update the existing ui.install_error line to have the appropriate port and address, typically ending in :8000/cus/error.html

Edit getdown.global. Note that this file is not accessed by the configuration service so it does not need to be exported and imported.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.global

For example:

jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional

appbase = https://<server name>:8000/iris/

ui.install_error = https://<server name>:8000/cus/error.html

Edit getdown.port and change it to port 8000.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.port

Restart all services on all nodes.

sudo -u irisadmin monit stop all
sudo -u irisadmin monit start all

Possible additional settings

There are several places where message sizes can become larger than the default configured maximum for a websocket. If this happens, a message will be logged in the process's log file that looks like the following. The message indicates the current maximum and the actual larger size.

CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]

A workaround is to increase the limit using the property websocket.client.max.message.size. This applies to all client Envoy connections and becomes the new default for client connections. It applies to all web servers expecting Envoy connections and to any other clients such as Swing consoles. It is usually best to apply this at the global level rather than within a stanza. If a serviceId is appended to the property name, then the new limit applies only to that service.

websocket.client.max.message.size=100000000

The server message size defaults to the client size if not set, but can be changed individually if needed. Like websocket.client.max.message.size, a serviceId can be appended to limit the value to that service.

websocket.server.max.message.size=100000000
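
When choosing a new limit, it helps to know how far over the maximum the failing messages were. The sketch below extracts the two numbers from a log line in the format shown above. The regular expression is written against that one sample line, so treat it as an assumption about the log format rather than a specification; the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the sample log line on this page.
CLOSE_REASON = re.compile(
    r"CloseReason\[1009,Binary message size \[(\d+)\] exceeds maximum size \[(\d+)\]\]"
)


def parse_size_overflow(log_line: str) -> Optional[Tuple[int, int]]:
    """Return (actual_size, configured_maximum), or None if the line does not match."""
    match = CLOSE_REASON.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]"
actual, limit = parse_size_overflow(line)
print(f"message size {actual} exceeded the configured maximum {limit}")
```

Scanning a process log with this function and taking the largest actual size gives a lower bound for the new max.message.size value.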

Extra Envoy xDS routes

Additional routes can also be added to the Envoy configuration to expose other services through the proxy.

For each route to be added, create a set of properties that define the extra route. These properties must be visible to the configuration server (either global or in a stanza visible to the configuration server). After making changes, the configuration_server must be restarted to make the changes visible.

This example route is named dis and will redirect requests for https://envoy-server.company-name.com:<envoy_port>/dis/anything to http://infra-1.company-name.com:8086/anything:

    envoy.xds.extra.routes.dis.host=infra-1.company-name.com
    envoy.xds.extra.routes.dis.port=8086
    envoy.xds.extra.routes.dis.prefix=/dis/
    envoy.xds.extra.routes.dis.prefixRewrite=/
    envoy.xds.extra.routes.dis.tls=false
    envoy.xds.extra.routes.dis.exactPrefix=false

The properties defining an Envoy route are:

  • envoy.xds.extra.routes.<name>.host
    • Required. The name or address the Envoy service needs to reach the destination service. This could be different from the host that is used from outside the network.
  • envoy.xds.extra.routes.<name>.port
    • Optional. The port of the destination service. Defaults to 443 when TLS (Transport Layer Security) is enabled and 80 otherwise.
  • envoy.xds.extra.routes.<name>.prefix
    • Required. Paths beginning with this prefix will be routed to the service at host:port. Do not use "/", "/worker", "/cus", "/comm", "/iris", or other prefixes that create conflicts or ambiguity with Deephaven mappings.
  • envoy.xds.extra.routes.<name>.prefixRewrite
    • Optional. Defaults to /. This value replaces the matched prefix in the requested URL before the request is forwarded to the destination service. In the example above, a user would reach the DIS's internal server by requesting a URL such as https://envoy-server.company-name.com:8000/dis/config. That server does not know about the /dis part of the path, so Envoy strips the prefix and sends http://infra-1.company-name.com:8086/config to the destination (note the changed path and port).
  • envoy.xds.extra.routes.<name>.tls
    • Optional. Indicates whether the destination service uses TLS/HTTPS. This is independent of whether envoy is using TLS. Defaults to true.
  • envoy.xds.extra.routes.<name>.exactPrefix
    • Optional. If true, then the prefix and prefixRewrite values will be used exactly as specified in the envoy route configuration. If false (the default), extra routes will be added both with and without a terminating /. This produces the desired results with rare exceptions.
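
The interaction of prefix, prefixRewrite, tls, and the default port can be illustrated with a small model. This is a sketch of the behavior described in the list above, not the code Deephaven or Envoy uses; the ExtraRoute class and its method names are hypothetical, and the exactPrefix handling of trailing slashes is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtraRoute:
    """Hypothetical holder for one envoy.xds.extra.routes.<name>.* property set."""
    host: str
    prefix: str
    port: Optional[int] = None   # documented default: 443 with TLS, 80 otherwise
    prefix_rewrite: str = "/"    # documented default
    tls: bool = True             # documented default

    def effective_port(self) -> int:
        if self.port is not None:
            return self.port
        return 443 if self.tls else 80

    def destination(self, request_path: str) -> Optional[str]:
        """Return the upstream URL for request_path, or None if the prefix does not match."""
        if not request_path.startswith(self.prefix):
            return None
        scheme = "https" if self.tls else "http"
        rewritten = self.prefix_rewrite + request_path[len(self.prefix):]
        return f"{scheme}://{self.host}:{self.effective_port()}{rewritten}"


# The dis route from the example properties above.
dis = ExtraRoute(host="infra-1.company-name.com", port=8086,
                 prefix="/dis/", prefix_rewrite="/", tls=False)
print(dis.destination("/dis/config"))
```

A path that does not start with /dis/ returns None here, which corresponds to the request being handled by the other Deephaven routes instead.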

Appendix

Tested versions of Envoy

Deephaven has tested compatibility with Envoy versions 1.19.0 and 1.20.1. Only 1.20.1 is certified with current versions of Deephaven.