Using Envoy as a front proxy
The Deephaven Open API and Web interface require connecting to the web_api_service and Deephaven query workers. The Swing Console requires direct connections to multiple Deephaven services in addition to the query workers.
This architecture allows Deephaven to scale to many users and machines without introducing a single process to handle all of the network traffic. However, this can be inconvenient from a network management perspective because API clients must be permitted to connect to many different hosts and ports on the query cluster.
To simplify network management, a front proxy can be exposed to client machines and route the traffic to the Web API Service, query workers, and other services. Deephaven uses Envoy (http://envoyproxy.io) as a front proxy. Envoy is a scalable open-source network proxy originally developed by Lyft which supports dynamic configuration using gRPC calls.
Because all inbound and outbound network traffic passes through Envoy, the proxy may impact performance, particularly on high-throughput systems.
To configure Envoy for use with Deephaven, you must select a host for installation. Appropriate installation directions for your environment can be found here.
Configuration
In order for Envoy to act as a proxy for Deephaven services it must be configured to listen for incoming traffic on a specific address and port. It then must be configured to use the Deephaven Configuration Service as a Discovery Service (called an xDS) for routes to individual Persistent Queries, Query Workers, and other services. Specifically, Deephaven exports a Cluster Discovery Service (CDS) and Route Discovery Service (RDS) to Envoy.
The Deephaven Services and each Deephaven worker are defined as clusters in your dynamic Envoy configuration. Deephaven’s RDS creates routing rules that map various paths to specific workers and other services. Any path that does not match a worker or service prefix is directed to the Web API service.
Example Envoy Configuration File
An example Envoy YAML configuration file suitable for use with Deephaven follows. Later sections describe where this file should be located and how to make it available to Envoy.
node: { id: 'envoynode', cluster: 'envoycluster' }
# This section tells Envoy that there is a dynamic cluster discovery service 'xds_service'
# that is communicating via GRPC using the V3 API and data structures.
dynamic_resources:
  cds_config:
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
        envoy_grpc: { cluster_name: xds_service }
# This section tells envoy what servers are available to load balance to. In the case of deephaven
# there is a single controller listening on port 8124 (or whatever you configure it to)
# This configuration assumes the web_api_service is running on the same host as Envoy.
# If envoy is instead running within a docker container, or another host the address should
# be updated.
static_resources:
  clusters:
    - name: xds_service
      connect_timeout: 0.25s
      type: static
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          '@type': type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      load_assignment:
        cluster_name: 'xds_service'
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      # Update these values to match your configuration server and xds port
                      address: <enter-ip-here>
                      port_value: 8124
  # This section tells envoy to listen for incoming connections on port 8000 from anywhere
  # Upgrades them to websocket style connections and discover routes via a V3 GRPC interface
  listeners:
    - address:
        socket_address:
          # This address and port is the port to which clients will connect.
          # 0.0.0.0 indicates that Envoy should listen for -all- connections on port 8000
          # from any interface
          address: 0.0.0.0
          port_value: 8000
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              typed_config:
                '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                codec_type: AUTO
                stat_prefix: egress_http
                upgrade_configs:
                  - upgrade_type: websocket
                rds:
                  route_config_name: rds_config
                  config_source:
                    resource_api_version: V3
                    api_config_source:
                      api_type: GRPC
                      transport_api_version: V3
                      grpc_services:
                        envoy_grpc:
                          cluster_name: 'xds_service'
                access_log:
                  - name: envoy.file_access_log
                    typed_config:
                      '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
                      path: '/tmp/envoy-rds.log'
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config: {}
          transport_socket:
            name: envoy.transport_sockets.tls
            typed_config:
              '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: '/lighttpd.pem'
                    private_key:
                      filename: '/lighttpd.pem'
admin:
  access_log_path: '/tmp/envoy.log'
  # Setting these will allow a sysadmin to dump Envoy configurations from localhost of the envoy
  # host. We suggest you disable this in production, as the admin interface can be used to
  # modify the running configuration
  # address:
  #   socket_address:
  #     address: 127.0.0.1
  #     port_value: 8001
Most of the configuration in the example above can be left alone. However, there are a few places where you might need more specific configuration.
Static resources
Clusters
In the static_resources section, you must configure the target address and port on which the Configuration Service will be listening for xDS requests.
If the Configuration Service is running on a different node than Envoy, or if Envoy is running inside a Docker container (as per the recommended configuration), you will need to change this to the appropriate address. For example, on a host with address 10.128.0.123 running Envoy inside a Docker container, the address should be set to 10.128.0.123. It might be more appropriate, or just easier, to use a DNS name. In that case, change the cluster type from STATIC to LOGICAL_DNS and replace the address with a fully qualified domain name. The port must match the Deephaven system property envoy.xds.port, which is set to 8124 by default.
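The choice between the two cluster types described above can be expressed as a small check. The following Python sketch is illustrative only: the function name xds_cluster_type is hypothetical and is not part of Deephaven or Envoy. It assumes that an IP literal keeps the STATIC type and that anything else is treated as a DNS name.

```python
import ipaddress


def xds_cluster_type(address: str) -> str:
    """Suggest the Envoy cluster type for the xds_service endpoint address.

    Hypothetical helper: an IP literal keeps the STATIC type used in the
    example file, while any other value is treated as a host name and
    needs LOGICAL_DNS, as described in the text above.
    """
    try:
        ipaddress.ip_address(address)
    except ValueError:
        # Not an IPv4/IPv6 literal, so assume it is a DNS name.
        return "LOGICAL_DNS"
    return "STATIC"


if __name__ == "__main__":
    for candidate in ("10.128.0.123", "config-server-node.company.com"):
        print(candidate, "->", xds_cluster_type(candidate))
```

The value returned is what would go into the type field of the xds_service cluster in the YAML file.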
Listeners
The listeners section tells Envoy what addresses and ports it should listen on for forwarding. Deephaven will only use a single address and port for these forwarding requests. In the example configuration, the address 0.0.0.0 and port 8000 indicate that Envoy should listen for connections from ANY interface on port 8000. Users will be able to connect to Deephaven using the URL:
https://<<myFQDN>>:8000/iriside
You may wish to change this port to something else, or set the address to something more specific to constrain where connections are allowed from.
TLS certificates
When using TLS, Envoy needs to be aware of the TLS certificate chain and private key file to be used to complete the TLS handshakes. This is configured by the common_tls_context stanza. The example configuration uses Deephaven’s default SSL truststore and private key for lighttpd, which is installed at /etc/ssl/private. Administrators should consider installing their own trust store and key for Envoy.
Administration
Administrators may like to set up the Envoy administration port to be able to inspect the currently running Envoy configuration, which includes the currently configured routes and clusters. Be careful when this is enabled as it allows access to internal Envoy configuration. Deephaven recommends this be disabled in production configurations.
Installation via Docker
While you may install Envoy by compiling from source, or using a package manager such as yum or apt, Deephaven recommends deploying Envoy as a Docker container. This makes configuration and updates simple since the Envoy project publishes pre-built Docker images.
First, ensure that Docker is installed and running on your host. Next, download the Envoy proxy image (this example uses version 1.20.1):
sudo docker pull envoyproxy/envoy:v1.20.1
Create an Envoy YAML file. The example Envoy configuration file provided earlier is valid for a typical Deephaven cluster. This example assumes that the YAML file will be placed in /etc/sysconfig/illumon.d/resources, and is called envoy3.yaml to indicate that it uses version 3 (V3) of the Envoy API.
sudo -u irisadmin vi /etc/sysconfig/illumon.d/resources/envoy3.yaml
As described in the Clusters section above, since we're running envoy in a Docker container, we need to change the address in the endpoint section of the xds_service cluster. We will use the IP since we don't expect it to change.
Next, create the docker container. This container will be named and reused each time envoy needs to start.
- We'll use the yaml file we created above
- We'll use the client update service's pemfile (it is recommended to create a different one for a production installation).
sudo /usr/bin/docker run -d \
    -p 8000:8000 \
    -p 8001:8001 \
    -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml \
    -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem \
    -u 9002 \
    --name deephaven_envoy \
    envoyproxy/envoy:v1.20.1 -c /config.yaml
This command configures a few things for the container:
- -d -> Run the container in detached mode (as a background process).
- -p 8000:8000 -> Map internal port 8000 to host port 8000.
- -p 8001:8001 -> Map internal port 8001 to host port 8001 (you may remove this if you are not exposing the admin port).
- -v /etc/sysconfig/illumon.d/resources/envoy3.yaml:/config.yaml -> Mount the host’s /etc/sysconfig/illumon.d/resources/envoy3.yaml file as /config.yaml within the container so that the Envoy instance can see it.
- -v /etc/sysconfig/illumon.d/client_update_service/lighttpd.pem:/lighttpd.pem -> Mount the TLS key file into the container.
- -u 9002 -> Run as irisadmin. Note that the irisadmin user ID will likely be different on your system. You can use id -u irisadmin to double-check the user ID. If your system runs the core infrastructure processes such as the Persistent Query Controller as a different user, that should be used instead.
- --name deephaven_envoy -> Name the container.
- envoyproxy/envoy:v1.20.1 -c /config.yaml -> Launch the Envoy 1.20.1 container using the mounted config.yaml file as the config source.
- --add-host=hostname.company.com:1.2.3.4 -> Add an entry to the Docker container’s /etc/hosts file. This is useful when DNS resolution isn’t configured inside the Docker image and you are using a host name. This example does not do this.
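For sites that script their deployments, the flags listed above can be assembled programmatically. The following Python sketch is one possible way to do so and only builds the argument list; it does not run Docker. The function name envoy_docker_run_args and its parameters are hypothetical, and the defaults simply mirror the example command.

```python
from typing import List


def envoy_docker_run_args(
    config_path: str,
    pem_path: str,
    uid: int,
    expose_admin: bool = False,
    image: str = "envoyproxy/envoy:v1.20.1",
    name: str = "deephaven_envoy",
) -> List[str]:
    """Build the argument list for the docker run command shown above.

    Hypothetical helper; pass the result to subprocess.run() (with sudo
    or as a user allowed to run Docker) to create the container.
    """
    args = ["/usr/bin/docker", "run", "-d", "-p", "8000:8000"]
    if expose_admin:
        # Only publish the admin port when the admin stanza is enabled.
        args += ["-p", "8001:8001"]
    args += [
        "-v", f"{config_path}:/config.yaml",
        "-v", f"{pem_path}:/lighttpd.pem",
        "-u", str(uid),  # numeric ID from `id -u irisadmin`
        "--name", name,
        image, "-c", "/config.yaml",
    ]
    return args


if __name__ == "__main__":
    print(" ".join(envoy_docker_run_args(
        "/etc/sysconfig/illumon.d/resources/envoy3.yaml",
        "/etc/sysconfig/illumon.d/client_update_service/lighttpd.pem",
        uid=9002,
    )))
```

Leaving expose_admin at its default omits the 8001 port mapping, which matches the recommendation to keep the admin interface disabled in production.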
You can see running docker containers, including the Envoy container, with:
sudo docker container ls
If you have issues with the container not starting, run it interactively to see the output. Replace -d with -it.
You can delete the container (for example, if you need to recreate it), with the following command:
sudo docker rm deephaven_envoy
You can see the logs for the container with the following command. You can use the container ID or the name. The second example shows the logs on or after the specified date.
sudo docker logs deephaven_envoy
sudo docker logs deephaven_envoy --since 2021-11-18
The following command stops the container.
sudo docker container stop deephaven_envoy
Configuring Envoy as a system service
To run using systemd, create the file /etc/systemd/system/envoy.service.
sudo vi /etc/systemd/system/envoy.service
The following file contents assume that Envoy is being run in a Docker container, as illustrated above. The service starts the existing deephaven_envoy container with /usr/bin/docker restart rather than creating a new one with docker run, and the command must not include sudo.
# /etc/systemd/system/envoy.service
[Unit]
Description=Envoy Proxy
Documentation=https://www.envoyproxy.io/
After=network-online.target
[Service]
User=root
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/docker restart deephaven_envoy
ExecStop=/usr/bin/docker container stop deephaven_envoy
[Install]
WantedBy=multi-user.target
After editing the file, reload the daemon process:
sudo systemctl daemon-reload
You should further configure Envoy to automatically start up at system startup:
sudo systemctl enable envoy
You can start Envoy with:
sudo systemctl start envoy
You can see the systemd logs with:
sudo systemctl status envoy.service
You can stop Envoy with:
sudo systemctl stop envoy
Deephaven configuration
Next, you must configure Deephaven services to interact with Envoy.
This requires a number of additions to Deephaven properties. Changes are needed to both the iris-environment and iris-endpoints properties files, as well as the getdown.global configuration file.
Note
There is no longer an option to enable Envoy only for the Deephaven Web UI. Once Envoy is enabled for Deephaven, all services must be configured to use Envoy, or the Configuration Server service will fail to start.
Update iris-environment.prop
The changes below are needed for the file iris-environment.prop. These are the commands to export, edit, and reimport this file:
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-environment.prop -d /tmp/
sudo -u irisadmin vi /tmp/iris-environment.prop
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-environment.prop -d /tmp/
To enable Envoy, set the following properties at the global scope (add them to the property file). Replace the url with your fully-qualified url.
global.websocket.server.enabled=true
envoy.front.proxy.url=your-user-facing-host.company.com:8000
envoy.terminate.ssl=true
- The global.websocket.server.enabled property is used by a number of services to decide whether to start a websocket listener that Envoy will use. When set at the global scope, all services that are capable will be fronted by Envoy.
- The envoy.front.proxy.url property indicates the host and port on which Envoy is listening for connections.
- By default, the configuration service listens for Envoy xDS requests on port 8124. You can change this port by setting the envoy.xds.port property. Deephaven also assumes you have configured Envoy to listen for HTTP requests with SSL enabled (and strongly recommends this configuration, as the connection is used for authentication). If you must disable SSL, you can set the envoy.terminate.ssl property to false.
Envoy will close websockets after 5 minutes of inactivity (by default). Other proxies in the network path can also close the connections. To prevent this, Deephaven sends a heartbeat to keep each connection alive. Changing the heartbeat frequency is not normally needed, but if necessary you can set it (in milliseconds) with the following property:
envoy.front.proxy.keepalive.ping.ms=60000
The following properties need to be available to the Configuration Server, because it runs the Envoy discovery service and needs to know how to route traffic to all eligible services.
- Webapi.server.port is already set and does not need to be changed.
- Webapi.server.host is probably not set and needs to be added; add it to the configuration file we're already editing. Unless the Envoy server is running on a different server than the web server, this will be the same host name that was added to envoy.front.proxy.url but without the port. This can be a global property.
Webapi.server.host=your-web-api-host.company.com
Tell the Swing Console (and other client programs) to use Envoy. These properties should be added to the [service.name=iris_console|interactive_console] stanza in the property file.
global.websocket.client.enabled=true
WAuthenticationClientManager.defaultClientFactoryClass=com.illumon.iris.auth.WebsocketAuthenticationClient
Assign a websocket port to services that will be fronted by Envoy. These ports need to be unique per host. For example, you may use the same port for the Auth Server on each host where the service runs, but you must assign different ports to each Query Dispatcher on a given host. The ports suggested here will work with most default installations.
Many of these properties must be visible to the Configuration Server in order to serve the Envoy Discovery Service, as well as to the target services, so the stanza definitions below include both the configuration service and the target service. Add each of the properties to its stanza. If the stanzas don't exist then add the stanza and the property.
Make sure that the client_update_service.host and configuration.server.host properties are updated to the fully qualified host name for the server where those processes run.
# auth server
[service.name=authentication_server|configuration_server] {
AuthenticationServer.websocket.port=22050
}
# controller
[service.name=iris_controller|configuration_server] {
PersistentQueryController.websocket.port=22051
}
#dbacl writer
[service.name=db_acl_write_server|iris_db_user_mod|configuration_server] {
dbaclwriter.websocket.port=22053
}
#client update service and configuration server
[service.name=configuration_server] {
client_update_service.host=cus-node.company.com
client_update_service.port=8443
# Note: the configuration server itself does not yet support websocket communication
configuration.server.host=config-server-node.company.com
configuration.server.websocket.port=22054
}
Each Query Dispatcher that will be available via Envoy will need a websocket port. Each process on a given host needs a unique port. These services appear in stanzas like the following. Each place where RemoteQueryDispatcherParameters.queryPort is set, you will need to set RemoteQueryDispatcherParameters.websocket.port. Add the following properties (with stanzas) to the property file.
- If you are running multiple dispatchers on a single node, each dispatcher will need its own port. This is typically the case for single-node installations, since both the query server and merge server run on the same node.
- If you are running multiple query servers but each one is on a different server, each query server can use the same port. This is typically the case for a large-scale installation.
- The example shown here uses different ports for the merge server and the query servers, and should work for both the above cases.
# set the default websocket port for query servers
RemoteQueryDispatcherParameters.websocket.port=22052
# set a different default websocket for all merge servers
[service.name=dbmerge|db_dis_merge|tailer1_merge] {
RemoteQueryDispatcherParameters.websocket.port=22060
}
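The rule that websocket ports must be unique per host can be checked mechanically before the properties are imported. The following Python sketch assumes you have already collected the planned assignments as (service, host, port) tuples; the function name find_port_conflicts and the service labels in the usage example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Assignment = Tuple[str, str, int]  # (service label, host, websocket port)


def find_port_conflicts(
    assignments: Iterable[Assignment],
) -> Dict[Tuple[str, int], List[str]]:
    """Return every (host, port) pair claimed by more than one service."""
    by_endpoint: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for service, host, port in assignments:
        by_endpoint[(host, port)].append(service)
    # Keep only the endpoints with two or more claimants.
    return {key: names for key, names in by_endpoint.items() if len(names) > 1}


if __name__ == "__main__":
    planned = [
        ("authentication_server", "infra-1.company.com", 22050),
        ("iris_controller", "infra-1.company.com", 22051),
        ("query_dispatcher", "infra-1.company.com", 22052),
        ("merge_dispatcher", "infra-1.company.com", 22060),
        ("query_dispatcher", "query-1.company.com", 22052),
    ]
    print(find_port_conflicts(planned) or "no conflicts")
```

Reusing port 22052 on query-1.company.com is not reported, because the rule applies per host rather than across the cluster.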
Update iris-endpoints.prop
Services that need to know about all the dispatchers at once (specifically the Persistent Query Controller) use a different mechanism than the above stanzas; all the information is in that process's stanza (see https://deephaven.io/enterprise/docs/sys-admin/pq-controller/pq-controller/ for further details). The Configuration Service now needs the same information, and the websocket port needs to be added.
There will be a section like the following in iris-endpoints.prop which defines all the remote query dispatchers available to the controller. The RemoteQueryDispatcherParameters.websocket.port property active in a given scope will be the default value for all iris.db.n.websocket.port settings. Only those that are different need to be specified. This usually means that the merge servers will need this property, but the query servers will not.
The changes below are needed for the file iris-endpoints.prop. These are the commands to export, edit, and reimport this file:
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties export --etcd -f iris-endpoints.prop -d /tmp/
sudo -u irisadmin vi /tmp/iris-endpoints.prop
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig properties import --etcd -f iris-endpoints.prop -d /tmp/
In iris-endpoints.prop, add configuration_server to the stanza that includes iris_controller and controller_tool. For merge servers, add their websocket port to their iris.db properties. The following example shows a typical configuration for a three-node cluster:
[service.name=iris_controller|controller_tool|configuration_server] {
iris.db.nservers=3
iris.db.1.host=query-1.company.com
iris.db.1.classPushList=
iris.db.1.class=Query
iris.db.2.host=query-2.company.com
iris.db.2.classPushList=
iris.db.2.class=Query
iris.db.3.host=infra-1.company.com
iris.db.3.classPushList=
iris.db.3.port=30002
iris.db.3.class=Merge
iris.db.3.websocket.port=22060
}
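The defaulting rule for iris.db.n.websocket.port can be illustrated with a short Python sketch. It is a simplified model, not Deephaven's property resolution code: the function name effective_websocket_ports is hypothetical, the properties are assumed to be already loaded into a dictionary, and default_port stands for the RemoteQueryDispatcherParameters.websocket.port value active in the stanza's scope.

```python
from typing import Dict, Tuple


def effective_websocket_ports(
    props: Dict[str, str], default_port: int
) -> Dict[int, Tuple[str, int]]:
    """Map each dispatcher index to its (host, websocket port).

    A dispatcher without an explicit iris.db.<n>.websocket.port entry
    falls back to default_port, as described in the text above.
    """
    count = int(props["iris.db.nservers"])
    resolved: Dict[int, Tuple[str, int]] = {}
    for n in range(1, count + 1):
        host = props[f"iris.db.{n}.host"]
        port = int(props.get(f"iris.db.{n}.websocket.port", default_port))
        resolved[n] = (host, port)
    return resolved


if __name__ == "__main__":
    example = {
        "iris.db.nservers": "3",
        "iris.db.1.host": "query-1.company.com",
        "iris.db.2.host": "query-2.company.com",
        "iris.db.3.host": "infra-1.company.com",
        "iris.db.3.websocket.port": "22060",
    }
    for index, (host, port) in effective_websocket_ports(example, 22052).items():
        print(index, host, port)
```

With the three-node example, the two query servers resolve to the default 22052 and only the merge server carries an explicit 22060.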
Note
The iris-endpoints.prop file is regenerated when the Deephaven installer is run - e.g. when adding nodes to the cluster, or upgrading to a newer Deephaven version. The changes above to this file must be reapplied manually after such a reconfiguration or upgrade.
Set up the client update service to tell swing clients to use Envoy
- Clients cannot yet get properties files via Envoy. Add the following line to /etc/sysconfig/illumon.d/client_update_service/getdown.global so that the Swing client will read properties files from disk. This ensures that all users who log in to this instance will be able to correctly read property files without making additional changes.
  jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional
- Update the existing appbase line to have the appropriate port and address, typically ending in :8000/iris/
- Update the existing ui.install_error line to have the appropriate port and address, typically ending in :8000/cus/error.html
Edit getdown.global. Note that this file is not accessed by the configuration service so it does not need to be exported and imported.
sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.global
For example:
jvmarg = -Dcom.fishlib.configuration.PropertyInputStreamLoader.override=com.fishlib.configuration.PropertyInputStreamLoaderTraditional
appbase = https://<server name>:8000/iris/
ui.install_error = https://<server name>:8000/cus/error.html
Edit getdown.port and change it to port 8000.
sudo -u irisadmin vi /etc/sysconfig/illumon.d/client_update_service/getdown.port
Restart all services on all nodes.
sudo -u irisadmin monit stop all
sudo -u irisadmin monit start all
Possible additional settings
There are several places where message sizes can become larger than the default configured maximum for a websocket. If this happens, you will get a message in the process's log file that looks like the following. The message indicates the current maximum and the actual larger size.
CloseReason[1009,Binary message size [338135] exceeds maximum size [65536]]
You can work around this by increasing the limit using the property websocket.client.max.message.size. This applies to all client Envoy connections and will become the new default for client connections. It applies to all web servers expecting Envoy connections and to any other clients such as Swing consoles. It's usually best to apply this at the global level rather than within a stanza. If you append a serviceId, then the new limit applies only to that service.
websocket.client.max.message.size=100000000
The server message size defaults to the client size if not set, but can be changed individually if needed. Like websocket.client.max.message.size, a serviceId can be appended to limit the value to that service.
websocket.server.max.message.size=100000000
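When deciding how far to raise the limit, it can help to pull the actual and maximum sizes out of the log message. The following Python sketch parses only the message format shown above; the function name parse_size_exceeded is hypothetical, and other log formats are not handled.

```python
import re
from typing import Optional, Tuple

# Matches the size portion of the CloseReason message shown above.
_SIZE_PATTERN = re.compile(
    r"Binary message size \[(\d+)\] exceeds maximum size \[(\d+)\]"
)


def parse_size_exceeded(log_line: str) -> Optional[Tuple[int, int]]:
    """Return (actual_size, configured_maximum) or None if the line differs."""
    match = _SIZE_PATTERN.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = (
        "CloseReason[1009,Binary message size [338135] "
        "exceeds maximum size [65536]]"
    )
    print(parse_size_exceeded(line))
```

Any new value for websocket.client.max.message.size needs to be at least as large as the first number returned.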
Extra Envoy XDS routes
You may also add additional routes to the Envoy configuration to expose other services through the proxy.
For each route you would like to add, you must create a set of properties that define the extra route. These properties must be visible to the configuration server (either global or in a stanza visible to the configuration server). After making changes, configuration_server must be restarted to make the changes visible.
This example route is named dis and will redirect requests for https://envoy-server.your.company.com:<envoy_port>/dis/anything to http://infra-1.your.company.com:8086/anything:
envoy.xds.extra.routes.dis.host=infra-1.your.company.com
envoy.xds.extra.routes.dis.port=8086
envoy.xds.extra.routes.dis.prefix=/dis/
envoy.xds.extra.routes.dis.prefixRewrite=/
envoy.xds.extra.routes.dis.tls=false
envoy.xds.extra.routes.dis.exactPrefix=false
The properties defining an Envoy route are:
envoy.xds.extra.routes.<name>.host
- Required. The name or address the Envoy service needs to reach the destination service. This could be different from the host you would use from outside the network.
envoy.xds.extra.routes.<name>.port
- Optional. The port of the destination service. Defaults to 443 when TLS (Transport Layer Security) is enabled and 80 otherwise.
envoy.xds.extra.routes.<name>.prefix
- Required. Paths beginning with this prefix will be routed to the service at host:port. Do not use "/", "/worker", "/cus", "/comm", "/iris", or other prefixes that create conflicts or ambiguity with Deephaven mappings.
envoy.xds.extra.routes.<name>.prefixRewrite
- Optional. Defaults to /. This value is used to change the prefix from the requested URL before forwarding to the destination service. In the example above, a user would route to the DIS's internal server by requesting a URL such as https://envoy-server.your.company.com:8000/dis/config. This server doesn't know about the /dis part of the path, so Envoy will strip the prefix and send http://infra-1.your.company.com:8086/config to the destination (note the changed path and port).
envoy.xds.extra.routes.<name>.tls
- Optional. Indicates whether the destination service uses TLS/HTTPS. This is independent of whether Envoy is using TLS. Defaults to true.
envoy.xds.extra.routes.<name>.exactPrefix
- Optional. If true, then the prefix and prefixRewrite values will be used exactly as specified in the Envoy route configuration. If false (the default), extra routes will be added both with and without a terminating /. We have found that this produces the desired results with rare exceptions.
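The prefix rewrite and port defaults described above can be modelled in a few lines of Python. This is a simplified illustration of the documented behaviour, not Envoy's or Deephaven's implementation: the ExtraRoute class and conflicts_with_reserved function are hypothetical, exactPrefix handling is omitted, and the reserved list contains only the prefixes named in the text, so it is not exhaustive.

```python
from dataclasses import dataclass
from typing import Optional

# Prefixes the text above says to avoid (not an exhaustive list).
RESERVED_PREFIXES = {"/", "/worker", "/cus", "/comm", "/iris"}


def conflicts_with_reserved(prefix: str) -> bool:
    """True if the prefix, ignoring a trailing slash, is a reserved one."""
    normalized = prefix.rstrip("/") or "/"
    return normalized in RESERVED_PREFIXES


@dataclass
class ExtraRoute:
    host: str
    prefix: str
    port: Optional[int] = None   # defaults to 443 with TLS, 80 without
    prefix_rewrite: str = "/"
    tls: bool = True

    def upstream_url(self, request_path: str) -> Optional[str]:
        """Return the destination URL, or None if the path does not match."""
        if not request_path.startswith(self.prefix):
            return None
        scheme = "https" if self.tls else "http"
        port = self.port if self.port is not None else (443 if self.tls else 80)
        # Replace the matched prefix with the rewrite value.
        remainder = request_path[len(self.prefix):]
        return f"{scheme}://{self.host}:{port}{self.prefix_rewrite}{remainder}"


if __name__ == "__main__":
    dis = ExtraRoute(
        host="infra-1.your.company.com",
        port=8086,
        prefix="/dis/",
        prefix_rewrite="/",
        tls=False,
    )
    print(dis.upstream_url("/dis/config"))
    print(conflicts_with_reserved("/iris/"))
```

Running the check on a candidate prefix before adding the properties is a cheap way to catch a clash with the Deephaven mappings listed above.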
Appendix
Tested versions of Envoy
Deephaven has tested compatibility with Envoy versions 1.19.0 and 1.20.1. Only 1.20.1 is certified with current versions of Deephaven.