Using Envoy as a front proxy

The Deephaven Open API and Web interface require connecting to the web_api_service and Deephaven query workers. The Swing Console requires direct connections to multiple Deephaven services in addition to the query workers.

This architecture allows Deephaven to scale to many users and machines without introducing a single process to handle all of the network traffic. However, this can be inconvenient from a network management perspective because API clients must be permitted to connect to many different hosts and ports on the query cluster.

To simplify network management, a front proxy can be exposed to client machines and route the traffic to the Web API Service, query workers, and other services. Deephaven uses Envoy (http://envoyproxy.io) as a front proxy. Envoy is a scalable open-source network proxy originally developed by Lyft which supports dynamic configuration using gRPC calls.

Because all inbound and outbound network traffic passes through Envoy, using a front proxy may impact performance, particularly on high-throughput systems.

To configure Envoy for use with Deephaven, you must select a host for installation. Appropriate installation directions for your environment can be found here.

Configuration

In order for Envoy to act as a proxy for Deephaven services it must be configured to listen for incoming traffic on a specific address and port. It then must be configured to use the Deephaven Configuration Service as a Discovery Service (called an xDS) for routes to individual Persistent Queries, Query Workers, and other services. Specifically, Deephaven exports a Cluster Discovery Service (CDS) and Route Discovery Service (RDS) to Envoy.

The Deephaven Services and each Deephaven worker are defined as clusters in your dynamic Envoy configuration. Deephaven’s RDS creates routing rules that map various paths to specific workers and other services. Any path that does not match a worker or service prefix is directed to the Web API service.

Example Envoy Configuration File

An example Envoy YAML configuration file suitable for use with Deephaven follows. Later sections explain where this file should be located and how to make it available to Envoy.

node: { id: 'envoynode', cluster: 'envoycluster' }

# This section tells Envoy that there is a dynamic cluster discovery service 'xds_service'
# that is communicating via GRPC using the V3 API and data structures.
dynamic_resources:
  cds_config:
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        envoy_grpc: { cluster_name: xds_service }

# This section tells envoy what servers are available to load balance to.  In the case of deephaven
# there is a single controller listening on port 8124 (or whatever you configure it to)

# This configuration assumes the web_api_service is running on the same host as Envoy.
# If Envoy is instead running within a Docker container, or on another host, the address should
# be updated.
static_resources:
  clusters:
    - name: xds_service
      connect_timeout: 0.25s
      type: STATIC
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          '@type': type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: 'xds_service'
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      # Update these values to match your configuration server and xds port
                      address: <enter-ip-here>
                      port_value: 8124

  # This section tells envoy to listen for incoming connections on port 8000 from anywhere
  # Upgrades them to websocket style connections and discover routes via a V3 GRPC interface
  listeners:
    - address:
        socket_address:
          # This address and port is the port to which clients will connect.
          # 0.0.0.0 indicates that Envoy should listen for -all- connections on port 8000
          # from any interface
          address: 0.0.0.0
          port_value: 8000
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              typed_config:
                '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: AUTO
                stat_prefix: egress_http
                upgrade_configs:
                  - upgrade_type: websocket
                rds:
                  route_config_name: rds_config
                  config_source:
                    resource_api_version: V3
                    api_config_source:
                      api_type: GRPC
                      transport_api_version: V3
                      grpc_services:
                        envoy_grpc:
                          cluster_name: 'xds_service'
                access_log:
                  - name: envoy.file_access_log
                    typed_config:
                      '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                      path: '/tmp/envoy-rds.log'
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: '/lighttpd.pem'
                    private_key:
                      filename: '/lighttpd.pem'
admin:
  access_log_path: '/tmp/envoy.log'
  # Setting these will allow a sysadmin to dump Envoy configurations from localhost of the envoy
  # host.  We suggest you disable this in production, as the admin interface can be used to
  # modify the running configuration
  # address:
  #   socket_address:
  #     address: 127.0.0.1
  #     port_value: 8001

Most of the configuration in the example above can be left alone. However, there are a few places where you might need more specific configuration.

Static resources

Clusters

In the static_resources section, you must configure the target address and port on which the Configuration Service will be listening for xDS requests.

If the Configuration Service is running on a different node than Envoy, or if Envoy is running inside a Docker container (the recommended configuration), you must change this to the appropriate address. For example, on a host with address 10.128.0.123 running Envoy inside a Docker container, set the address to 10.128.0.123. It may be more appropriate, or simply easier, to use a DNS name; in that case, change the cluster type from STATIC to LOGICAL_DNS and replace the address with a fully qualified domain name. The port must match the Deephaven system property envoy.xds.port, which is set to 8124 by default.
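
The choice between STATIC and LOGICAL_DNS described above depends only on whether the configured address is an IP literal or a host name. The following Python sketch illustrates that decision; the function name envoy_cluster_type is hypothetical and is not part of Envoy or Deephaven.

```python
import ipaddress


def envoy_cluster_type(address: str) -> str:
    """Suggest the Envoy cluster type for an xds_service endpoint address.

    Hypothetical helper: returns 'STATIC' when the address parses as an
    IPv4/IPv6 literal and 'LOGICAL_DNS' when it must be resolved by name.
    """
    try:
        ipaddress.ip_address(address)
    except ValueError:
        # Not an IP literal, so Envoy has to resolve it through DNS.
        return "LOGICAL_DNS"
    return "STATIC"


if __name__ == "__main__":
    for candidate in ("10.128.0.123", "config-server-node.company.com"):
        print(candidate, "->", envoy_cluster_type(candidate))
```

A check like this can be run against the address before editing the cluster definition, so that the type and the address stay consistent.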

Listeners

The listeners section tells Envoy which addresses and ports it should listen on for forwarding. Deephaven uses only a single address and port for these forwarding requests. In the example configuration, the address 0.0.0.0 and port 8000 indicate that Envoy should listen for connections from any interface on port 8000. Users will be able to connect to Deephaven using the URL:

https://<<myFQDN>>:8000/iriside

You may wish to change this port to something else, or set the address to something more specific to constrain where connections are allowed from.

TLS certificates

When using TLS, Envoy needs to know which TLS certificate chain and private key file to use to complete TLS handshakes. This is configured by the common_tls_context stanza. The example configuration uses Deephaven’s default SSL trust store and private key for lighttpd which is installed at /etc/ssl/private. Administrators should consider installing their own trust store and key for Envoy.

Administration

Administrators may want to set up the Envoy administration port to inspect the currently running Envoy configuration, which includes the currently configured routes and clusters. Be careful when this is enabled, as it allows access to internal Envoy configuration. Deephaven recommends disabling it in production configurations.

Installation via Docker

While you may install Envoy by compiling from source, or by using a package manager such as yum or apt, Deephaven recommends deploying Envoy as a Docker container. This makes configuration and updates simple, since the Envoy project publishes pre-built Docker images.

First, ensure that Docker is installed and running on your host. Next, download the Envoy proxy image. This example pulls v1.20.1, the version certified with current versions of Deephaven (see the Appendix):

sudo docker pull envoyproxy/envoy:v1.20.1

Create an Envoy YAML file. The example Envoy configuration file provided earlier is valid for a typical Deephaven cluster. This example assumes that the YAML file is placed in /etc/sysconfig/illumon.d/resources and is named envoy3.yaml to indicate that it is an Envoy version 3 (V3 API) file.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/resources/envoy3.yaml

As described in the Clusters section above, since we're running Envoy in a Docker container, we need to change the address in the endpoint section of the xds_service cluster. We will use the IP address since we don't expect it to change.

Next, create the Docker container. This container will be named and reused each time Envoy needs to start.

  • We'll use the yaml file we created above
  • We'll use the client update service's PEM file (creating a different one is recommended for a production installation).

sudo /usr/bin/docker run -d \
   -p 8000:8000 \
   -p 8001:8001 \
   -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml \
   -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem \
   -u 9002 \
   --name deephaven_envoy \
   envoyproxy/envoy:v1.20.1 -c /config.yaml

This command configures a few things for the container:

  • -d -> Run the container in detached mode (as a background process).
  • -p 8000:8000 -> map internal port 8000 to host port 8000.
  • -p 8001:8001 -> map internal port 8001 to host port 8001 (you may remove this if you are not exposing the admin port).
  • -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml -> Mount the host’s /etc/sysconfig/illumon.d/resources/envoy3.yaml file as /config.yaml within the container so that the Envoy instance can see it.
  • -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem -> Mount the TLS key file into the container as /lighttpd.pem.
  • -u 9002 -> Run as irisadmin. Note that the irisadmin user ID will likely be different on your system. You can use id -u irisadmin to double-check the user ID. If your system runs the core infrastructure processes such as the Persistent Query Controller as a different user, that should be used instead.
  • --name deephaven_envoy -> names the container.
  • envoyproxy/envoy:v1.20.1 -c /config.yaml -> Launch the envoy 1.20.1 container using the mounted config.yaml file as the config source.
  • --add-host=hostname.company.com:1.2.3.4 -> (Optional; not used in the example above.) Adds an entry to the Docker container’s /etc/hosts file. This is useful when DNS resolution isn’t configured inside the Docker image and you are using a host name.
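
Because the user ID, ports, and file paths vary between installations, it can help to assemble the docker run arguments programmatically. The Python sketch below rebuilds the command shown above from its parts; the function build_envoy_run_command and its parameter names are hypothetical, and the admin port mapping is included only when requested.

```python
import shlex


def build_envoy_run_command(config_path, pem_path, uid,
                            image="envoyproxy/envoy:v1.20.1",
                            listen_port=8000, admin_port=None,
                            name="deephaven_envoy"):
    """Return the docker run argument list for the Envoy container.

    Hypothetical helper mirroring the flags explained in the list above.
    """
    args = ["/usr/bin/docker", "run", "-d",
            "-p", f"{listen_port}:{listen_port}"]
    if admin_port is not None:
        # Publish the admin port only when it is deliberately exposed.
        args += ["-p", f"{admin_port}:{admin_port}"]
    args += [
        "-v", f"{config_path}:/config.yaml",
        "-v", f"{pem_path}:/lighttpd.pem",
        "-u", str(uid),
        "--name", name,
        image, "-c", "/config.yaml",
    ]
    return args


if __name__ == "__main__":
    # 9002 is the example user ID from the text; on a real host it could be
    # looked up with pwd.getpwnam("irisadmin").pw_uid instead.
    command = build_envoy_run_command(
        "/etc/sysconfig/illumon.d/resources/envoy3.yaml",
        "/etc/sysconfig/illumon.d/client_update_service/lighttpd.pem",
        uid=9002,
    )
    print(shlex.join(command))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed directly to subprocess.run without shell quoting concerns.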

You can see running docker containers, including the Envoy container, with:

sudo docker container ls

If you have issues with the container not starting, run it interactively to see the output. Replace -d with -it.

You can delete the container (for example, if you need to recreate it), with the following command:

sudo docker rm deephaven_envoy

You can see the logs for the container with the following command. You can use the container ID or the name. The second example shows the logs on or after the specified date.

sudo docker logs deephaven_envoy
sudo docker logs deephaven_envoy --since 2021-11-18

The following command stops the container.

sudo docker container stop deephaven_envoy

Configuring Envoy as a system service

To run using systemd, create the file /etc/systemd/system/envoy.service.

sudo vi /etc/systemd/system/envoy.service

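The following file contents assume that Envoy is being run in a Docker container, as illustrated above. Because the deephaven_envoy container already exists, the unit restarts it with /usr/bin/docker restart rather than creating it again with docker run. Commands in the unit file must not include sudo.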

# /etc/systemd/system/envoy.service
[Unit]
Description=Envoy Proxy
Documentation=https://www.envoyproxy.io/
After=network-online.target
[Service]
User=root
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/docker restart deephaven_envoy
ExecStop=/usr/bin/docker container stop deephaven_envoy
[Install]
WantedBy=multi-user.target

After editing the file, reload the systemd configuration:

sudo systemctl daemon-reload

To start Envoy automatically at system startup, enable the service:

sudo systemctl enable envoy

You can start Envoy with:

sudo systemctl start envoy

You can check the service status and its most recent systemd log entries with:

sudo systemctl status envoy.service

You can stop Envoy with:

sudo systemctl stop envoy

Deephaven configuration

Next, you must configure Deephaven services to interact with Envoy.

This requires a number of additions to Deephaven properties. Changes are needed to both the iris-environment and iris-endpoints properties files, as well as the getdown.global configuration file.

Note

There is no longer an option to enable Envoy only for the Deephaven Web UI. Once Envoy is enabled for Deephaven, all services must be configured to use Envoy, or the Configuration Server service will fail to start.

Update iris-environment.prop

The changes below are needed for the file iris-environment.prop. These are the commands to export, edit, and reimport this file:

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-environment.prop -d /tmp/

sudo -u irisadmin vi /tmp/iris-environment.prop

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-environment.prop -d /tmp/

To enable Envoy, set the following properties at the global scope (add them to the property file). Replace the URL with your fully qualified host name and port.

global.websocket.server.enabled=true
envoy.front.proxy.url=your-user-facing-host.company.com:8000
envoy.terminate.ssl=true

  • The global.websocket.server.enabled property is used by a number of services to decide whether to start a websocket listener that Envoy will use. When set at the global scope, all services that are capable will be fronted by Envoy.
  • The envoy.front.proxy.url property indicates the host and port on which Envoy is listening for connections.
  • By default, the configuration service listens for Envoy xDS requests on port 8124. You can change this port by setting the envoy.xds.port property. Deephaven also assumes you have configured Envoy to listen for HTTP requests with SSL enabled (and strongly recommends this configuration, as the connection is used for authentication). If you must disable SSL, you can set the envoy.terminate.ssl property to false.

By default, Envoy closes websockets after 5 minutes of inactivity. Other proxies in the network path can also close the connections. To prevent this, Deephaven sends a heartbeat to keep each connection alive. Changing the heartbeat interval is not normally needed, but if necessary you can set it (in milliseconds) with the following property.

envoy.front.proxy.keepalive.ping.ms=60000

The following properties need to be available to the Configuration Server, because it runs the Envoy discovery service and needs to know how to route traffic to all eligible services.

  • Webapi.server.port is already set and does not need to be changed.
  • Webapi.server.host is probably not set and needs to be added; add it to the configuration file we're already editing. Unless Envoy is running on a different server than the web server, this will be the same host name that was used in envoy.front.proxy.url, but without the port. This can be a global property.

Webapi.server.host=your-web-api-host.company.com

Tell the Swing Console (and other client programs) to use Envoy. These properties should be added to the [service.name=iris_console|interactive_console] stanza in the property file.

    global.websocket.client.enabled=true
    WAuthenticationClientManager.defaultClientFactoryClass=com.illumon.iris.auth.WebsocketAuthenticationClient

Assign a websocket port to services that will be fronted by Envoy. These ports need to be unique per host. For example, you may use the same port for the Auth Server on each host where the service runs, but you must assign different ports to each Query Dispatcher on a given host. The ports suggested here will work with most default installations.

Many of these properties must be visible to the Configuration Server in order to serve the Envoy Discovery Service, as well as to the target services, so the stanza definitions below include both the configuration service and the target service. Add each of the properties to its stanza. If the stanzas don't exist then add the stanza and the property.

Make sure that the client_update_service.host and configuration.server.host properties are updated to the fully qualified host name for the server where those processes run.

# auth server
[service.name=authentication_server|configuration_server] {
    AuthenticationServer.websocket.port=22050
}

# controller
[service.name=iris_controller|configuration_server] {
    PersistentQueryController.websocket.port=22051
}

# dbacl writer
[service.name=db_acl_write_server|iris_db_user_mod|configuration_server] {
    dbaclwriter.websocket.port=22053
}

# client update service and configuration server
[service.name=configuration_server] {
    client_update_service.host=cus-node.company.com
    client_update_service.port=8443
    # Note: the configuration server itself does not yet support websocket communication
    configuration.server.host=config-server-node.company.com
    configuration.server.websocket.port=22054
}

Each Query Dispatcher that will be available via Envoy needs a websocket port, and each process on a given host needs a unique port. These services appear in stanzas like the following. Wherever RemoteQueryDispatcherParameters.queryPort is set, you will also need to set RemoteQueryDispatcherParameters.websocket.port. Add the following properties (with stanzas) to the property file.

  • If you are running multiple dispatchers on a single node, each dispatcher will need its own port. This is typically the case for single-node installations, since both the query server and merge server run on the same node.
  • If you are running multiple query servers but each one is on a different server, each query server can use the same port. This is typically the case for a large-scale installation.
  • The example shown here uses different ports for the merge server and the query servers, and should work for both the above cases.

# set the default websocket port for query servers
RemoteQueryDispatcherParameters.websocket.port=22052

# set a different default websocket port for all merge servers
[service.name=dbmerge|db_dis_merge|tailer1_merge] {
    RemoteQueryDispatcherParameters.websocket.port=22060
}
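
The rule that websocket ports must be unique per host can be checked mechanically before restarting services. The Python sketch below groups port assignments by host and reports any clashes. The function find_port_conflicts and the host name dh-1.company.com are hypothetical, the service labels are illustrative only, and the port values are the ones suggested in this section.

```python
from collections import defaultdict


def find_port_conflicts(assignments):
    """Return {(host, port): [services]} for ports claimed more than once.

    assignments is an iterable of (host, service, port) tuples.
    """
    by_host_port = defaultdict(list)
    for host, service, port in assignments:
        by_host_port[(host, port)].append(service)
    return {key: services
            for key, services in by_host_port.items()
            if len(services) > 1}


# Illustrative single-node layout using the ports suggested above.
single_node = [
    ("dh-1.company.com", "authentication_server", 22050),
    ("dh-1.company.com", "iris_controller", 22051),
    ("dh-1.company.com", "db_query", 22052),
    ("dh-1.company.com", "db_acl_write_server", 22053),
    ("dh-1.company.com", "db_merge", 22060),
]

if __name__ == "__main__":
    print(find_port_conflicts(single_node))
```

The same port may appear on different hosts without being reported, which matches the multi-node case where every query server can share one port.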

Update iris-endpoints.prop

Services that need to know about all the dispatchers at once (specifically the Persistent Query Controller) use a different mechanism than the above stanzas; all the information is in that process's stanza (see https://deephaven.io/enterprise/docs/sys-admin/pq-controller/pq-controller/ for further details). The Configuration Service now needs the same information, and the websocket port needs to be added.

There will be a section like the following in iris-endpoints.prop which defines all the remote query dispatchers available to the controller. The RemoteQueryDispatcherParameters.websocket.port property active in a given scope will be the default value for all iris.db.n.websocket.port settings. Only those that are different need to be specified. This usually means that the merge servers will need this property, but the query servers will not.

The changes below are needed for the file iris-endpoints.prop. These are the commands to export, edit, and reimport this file:

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-endpoints.prop -d /tmp/

sudo -u irisadmin vi /tmp/iris-endpoints.prop

sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-endpoints.prop -d /tmp/

In iris-endpoints.prop, add configuration_server to the stanza that includes iris_controller and controller_tool. For merge servers, add the websocket port to their iris.db properties. The following example shows a typical configuration for a three-node cluster:

[service.name=iris_controller|controller_tool|configuration_server] {
    iris.db.nservers=3

    iris.db.1.host=query-1.company.com
    iris.db.1.classPushList=
    iris.db.1.class=Query

    iris.db.2.host=query-2.company.com
    iris.db.2.classPushList=
    iris.db.2.class=Query

    iris.db.3.host=infra-1.company.com
    iris.db.3.classPushList=
    iris.db.3.port=30002
    iris.db.3.class=Merge
    iris.db.3.websocket.port=22060
}
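
The defaulting rule described above (an iris.db.n.websocket.port entry overrides the RemoteQueryDispatcherParameters.websocket.port value in scope) can be modeled in a few lines of Python. This is only a sketch of the rule as stated in this section, not Deephaven's implementation; the function effective_websocket_ports is hypothetical, and 22052 is assumed as the default in scope.

```python
def effective_websocket_ports(props, default_port):
    """Return {n: (host, websocket_port)} for each iris.db.n entry.

    props maps property names to string values; default_port stands in for
    the RemoteQueryDispatcherParameters.websocket.port value in scope.
    """
    count = int(props["iris.db.nservers"])
    ports = {}
    for n in range(1, count + 1):
        host = props[f"iris.db.{n}.host"]
        override = props.get(f"iris.db.{n}.websocket.port")
        port = int(override) if override is not None else default_port
        ports[n] = (host, port)
    return ports


# The relevant properties from the three-node example above.
example_props = {
    "iris.db.nservers": "3",
    "iris.db.1.host": "query-1.company.com",
    "iris.db.2.host": "query-2.company.com",
    "iris.db.3.host": "infra-1.company.com",
    "iris.db.3.websocket.port": "22060",
}

if __name__ == "__main__":
    for n, (host, port) in effective_websocket_ports(example_props, 22052).items():
        print(n, host, port)
```

With the example values, only the merge server on infra-1.company.com resolves to 22060; both query servers fall back to the default.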

Note

The iris-endpoints.prop file is regenerated when the Deephaven installer is run (for example, when adding nodes to the cluster or upgrading to a newer Deephaven version). The changes above must be reapplied manually after such a reconfiguration or upgrade.

Set up the client update service to tell swing clients to use Envoy

  • Clients cannot yet get properties files via Envoy. Add the jvmarg line shown in the example below to /etc/sysconfig/illumon.d/client_update_service/getdown.global so that the Swing client will read properties files from disk. This ensures that all users who log in to this instance can read property files correctly without making additional changes.
  • Update the existing appbase line to have the appropriate port and address, typically ending in :8000/iris/
  • Update the existing ui.install_error line to have the appropriate port and address, typically ending in :8000/cus/error.html

Edit getdown.global. Note that this file is not accessed by the configuration service so it does not need to be exported and imported.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.global

For example:

jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional

appbase = https://<server name>:8000/iris/

ui.install_error = https://<server name>:8000/cus/error.html

Edit getdown.port and change the port to 8000.

sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.port

Restart all services on all nodes.

sudo -u irisadmin monit stop all
sudo -u irisadmin monit start all

Possible additional settings

There are several places where message sizes can become larger than the default configured maximum for a websocket. If this happens, you will get a message in the process's log file that looks like the following. The message indicates the current maximum and the actual larger size.

CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]

You can work around this by increasing the limit using the property websocket.client.max.message.size. This setting applies to all client Envoy connections and becomes the new default for them. It also applies to all web servers expecting Envoy connections and to any other clients, such as Swing consoles. It is usually best to set this at the global level rather than within a stanza. If you append a serviceId, the new limit applies only to that service.

websocket.client.max.message.size=100000000

The server message size defaults to the client size if not set, but can be changed individually if needed. Like websocket.client.max.message.size, a serviceId can be appended to limit the value to that service.

websocket.server.max.message.size=100000000
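
When choosing a new limit, it helps to extract the sizes reported in the log message shown earlier in this section. The following Python sketch parses a CloseReason line and returns the actual and maximum sizes. The function name parse_size_overflow is hypothetical, and the regular expression assumes the log line has exactly the format of the example above.

```python
import re

# Matches messages such as:
# CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]
_OVERFLOW = re.compile(
    r"CloseReason\[1009,.*?message size \[(\d+)\] exceeds maximum size \[(\d+)\]\]"
)


def parse_size_overflow(line):
    """Return (actual_size, configured_maximum) or None if the line differs."""
    match = _OVERFLOW.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = "CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]"
    print(parse_size_overflow(sample))
```

Scanning a process log with this function shows the largest message size actually seen, which gives a lower bound for the new websocket.client.max.message.size value.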

Extra Envoy xDS routes

You may also add additional routes to the Envoy configuration to expose other services through the proxy.

For each route you would like to add, you must create a set of properties that define the extra route. These properties must be visible to the configuration server (either global or in a stanza visible to the configuration server). After making changes, configuration_server must be restarted to make the changes visible.

This example route is named dis and will redirect requests for https://envoy-server.your.company.com:<envoy_port>/dis/anything to http://infra-1.your.company.com:8086/anything:

    envoy.xds.extra.routes.dis.host=infra-1.your.company.com
    envoy.xds.extra.routes.dis.port=8086
    envoy.xds.extra.routes.dis.prefix=/dis/
    envoy.xds.extra.routes.dis.prefixRewrite=/
    envoy.xds.extra.routes.dis.tls=false
    envoy.xds.extra.routes.dis.exactPrefix=false

The properties defining an Envoy route are:

  • envoy.xds.extra.routes.<name>.host
    • Required. The name or address the Envoy service needs to reach the destination service. This could be different from the host you would use from outside the network.
  • envoy.xds.extra.routes.<name>.port
    • Optional. The port of the destination service. Defaults to 443 when TLS (Transport Layer Security) is enabled and 80 otherwise.
  • envoy.xds.extra.routes.<name>.prefix
    • Required. Paths beginning with this prefix will be routed to the service at host:port. Do not use "/", "/worker", "/cus", "/comm", "/iris", or other prefixes that create conflicts or ambiguity with Deephaven mappings.
  • envoy.xds.extra.routes.<name>.prefixRewrite
    • Optional. Defaults to /. This value replaces the prefix in the requested URL before the request is forwarded to the destination service. In the example above, a user would reach the DIS's internal server by requesting a URL such as https://envoy-server.your.company.com:8000/dis/config. The destination server doesn't know about the /dis part of the path, so Envoy strips the prefix and sends http://infra-1.your.company.com:8086/config to the destination (note the changed path and port).
  • envoy.xds.extra.routes.<name>.tls
    • Optional. Indicates whether the destination service uses TLS/HTTPS. This is independent of whether Envoy is using TLS. Defaults to true.
  • envoy.xds.extra.routes.<name>.exactPrefix
    • Optional. If true, then the prefix and prefixRewrite values will be used exactly as specified in the envoy route configuration. If false (the default), extra routes will be added both with and without a terminating /. We have found that this produces the desired results with rare exceptions.
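
The effect of these properties on a request can be illustrated with a small Python model. This is a sketch of the behavior described in the list above, not Envoy's routing code: the function names are hypothetical, the reserved-prefix list contains only the prefixes named above (others may also conflict), and the extra routes generated when exactPrefix is false are not modeled.

```python
RESERVED_PREFIXES = ("/", "/worker", "/cus", "/comm", "/iris")


def is_reserved_prefix(prefix):
    """Return True if the prefix collides with a Deephaven mapping named above."""
    normalized = prefix.rstrip("/") or "/"
    return normalized in RESERVED_PREFIXES


def rewrite_extra_route(path, host, prefix, port=None,
                        prefix_rewrite="/", tls=True):
    """Return the upstream URL for a request path, or None if it does not match."""
    if not path.startswith(prefix):
        return None
    if port is None:
        # Documented defaults: 443 when TLS is enabled, 80 otherwise.
        port = 443 if tls else 80
    scheme = "https" if tls else "http"
    remainder = path[len(prefix):]
    return f"{scheme}://{host}:{port}{prefix_rewrite}{remainder}"


if __name__ == "__main__":
    # The 'dis' route from the example properties above.
    print(rewrite_extra_route("/dis/config", "infra-1.your.company.com",
                              prefix="/dis/", port=8086,
                              prefix_rewrite="/", tls=False))
```

For the dis example, a request for /dis/config maps to http://infra-1.your.company.com:8086/config, matching the description of prefixRewrite.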

Appendix

Tested versions of Envoy

Deephaven has tested compatibility with Envoy versions 1.19.0 and 1.20.1. Only 1.20.1 is certified with current versions of Deephaven.