Scaling to multiple servers

Deephaven is designed with horizontal scaling in mind. Adding capacity typically means adding compute resources in the form of servers or virtual machines.

Before adding resources, determine which types of capacity the deployment needs.

Scaling capacities

  1. Read capacity
  • Queries requiring access to very large data sets might require more IO bandwidth or substantial SAN caching.
  • Add more Query Dispatcher processes with added IO bandwidth.
  2. Write capacity
  • Handling multiple near real-time data streams quickly (through the Deephaven binary log/Tailer/DIS architecture).
  • Expanding IO for batch data imports or merges.
  • Large batch imports of historical data.
  3. Query capacity (memory and CPU)
  • Queries may require large amounts of memory to handle large amounts of data, significant CPU to handle intensive manipulation of that data, or both.
  • Support for more end-user queries.
  • Support for analysis of large datasets that would exhaust a smaller server's resources.

Server types

  1. Query servers
  • Increase read and analysis capacity of the database for end-users.
  • Servers with large CPU and memory resources.
  • Run the Query Dispatcher and the spawned worker processes.
  2. Real-time import servers
  • Increase write capacity for importing near real-time data.
  • Increase write capacity for merging near real-time data to historical partitions.
  • Run the DIS and (optional) Local Table Data Service (LTDS) processes.
  3. Batch import servers
  • Increase write capacity for batch data imports to intraday partitions.
  • Increase write capacity for merging batch data to historical partitions.
  • Separate the batch data import function from the near real-time data imports.
  4. Administrative servers
  • Run administrative Deephaven processes, such as the Deephaven controller, the authentication server, and the ACL write server.
  • May have additional requirements, such as access to MySQL.

A single host may run multiple server types. For example, the administrative processes do not require many resources, and consequently may be co-located with other services.

Single server architecture

The Deephaven installation guide illustrates the provisioning of a server with a network interface and the loopback interface (network layer), local disk storage (storage layer), and the Deephaven software (application layer) to create a single-server Deephaven deployment. The guide then uses this single server to perform basic operations by connecting to Deephaven from a remote client.

With that in mind, this single-server installation looks like the following:

A diagram of the data pipeline in a single-server Deephaven installation

This fully functional installation of Deephaven is limited to the compute resources of the underlying server. As the data import or query capacity requirements of the system grow beyond the server's capabilities, it is necessary to distribute the Deephaven processes across several servers, converting to a multiple server architecture.

Multiple server architecture

The multiple server architecture is built by moving the network and storage layers outside the server, and by adding hardware to run additional data loading or querying processes. The network layer moves to subnets or VLANs. The storage layer becomes a shared filesystem that is mounted across all servers. The application layer becomes a set of servers with the Deephaven software installed and configured to perform specific tasks.

A diagram of the data pipeline in a multi-server Deephaven installation

Deploying Deephaven on multiple servers has the following requirements:

  1. Modification of each layer in the three-layer model.
  2. Modification and management of the Deephaven configuration.
  3. Modification and management of the Deephaven dh_monit configuration.

The network layer

The network and necessary services (DNS, NTP, etc.) are supplied by the customer. Therefore, this document does not cover configuration or implementation of the network, except to specify the network services on which Deephaven depends.

Deephaven requires a subnet or VLAN for the servers to communicate. As with all big data deployments, fast network access to the data benefits Deephaven's query and analysis speed. If FQDNs are used in the Deephaven configuration, a robust DNS service is also recommended.

It is recommended that the network layer provide the fastest possible access to the storage layer.
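
If FQDNs are used, every server must be able to resolve its peers' names before installation. The following is a minimal pre-installation sanity check; the hostnames are placeholders, not names from this guide, and should be replaced with the cluster's actual FQDNs:

```python
# Minimal sketch: verify that every cluster host's FQDN resolves before
# installing Deephaven. The hostnames below are placeholders; substitute
# the cluster's actual names.
import socket

CLUSTER_HOSTS = [
    "dh-admin.example.com",    # administrative server (hypothetical)
    "dh-dis.example.com",      # real-time data import server (hypothetical)
    "dh-query-1.example.com",  # query server (hypothetical)
]

for host in CLUSTER_HOSTS:
    try:
        print(f"{host} -> {socket.gethostbyname(host)}")
    except socket.gaierror as err:
        print(f"{host} FAILED to resolve: {err}")
```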

The storage layer

The storage layer changes the most in a multiple server installation. Deephaven requires a shared filesystem for historical and intraday data that can be mounted by every server in the Deephaven cluster.

Typically, these disk mounts are provided via the NFS protocol, exported from a highly available storage system. Other distributed or clustered filesystems such as GlusterFS or HDFS should work but have not been extensively tested.

As mentioned, Deephaven relies on a shared filesystem architecture for access to its data. Currently, the larger installations use ZFS filers to manage the storage mounts, which are exported to the query and data import servers via NFS over 10GbE network interfaces.
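
Because every server must see the same storage, a quick per-server check of the mounts is useful after provisioning. A minimal sketch, assuming the /db layout described in this section; the "MarketData" namespace and the partition count are illustrative:

```python
# Minimal sketch: confirm that the shared Deephaven storage is mounted on
# this server, using the /db layout described in this section. The
# "MarketData" namespace and the partition count are illustrative.
import os

REQUIRED_MOUNTS = ["/db/Intraday"] + [
    f"/db/Systems/MarketData/Partitions/{i}" for i in range(2)
]

for path in REQUIRED_MOUNTS:
    status = "mounted" if os.path.ismount(path) else "NOT a mount point"
    print(f"{path}: {status}")
```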

Deephaven divides data into two categories:

  • Intraday data is any data that hasn't been merged to historical storage partitions, including near real-time as well as batch-imported data. Depending on the data size and rate, the underlying storage should typically be high-throughput (SSD) or contain sufficient caching layers to allow fast writes. Intraday data is written to the database via the Data Import Server or other import processes onto disks mounted on /db/Intraday/<namespace>/.
  • Historical data is populated by merging intraday data to the historical filesystems. As storage needs grow, more storage can be added in the form of writable partitions without reorganizing existing directory or file structures; Deephaven queries automatically search additional historical partitions as they are added. Intraday data is typically merged to historical data on the Data Import Servers during off hours by scheduled processes. The historical disk volumes are mounted in two locations on the servers (illustrated in the sketch after this list):
    1. For writing: /db/Systems/<databaseNamespace>/WritablePartitions/[0..N]
    2. For reading: /db/Systems/<databaseNamespace>/Partitions/[0..N]
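
A minimal sketch of this directory convention, assuming a hypothetical "MarketData" namespace. Deephaven's merge processes manage writable-partition selection internally, so the key-hash selection below is purely illustrative:

```python
# Minimal sketch of the historical path convention described above, assuming
# a hypothetical "MarketData" namespace. Deephaven's merge processes manage
# writable-partition selection internally; the key-hash selection here is
# purely illustrative.
import glob
import zlib

NAMESPACE = "MarketData"  # hypothetical namespace

def read_partitions(namespace: str) -> list[str]:
    """All read locations for a namespace's historical data."""
    return sorted(glob.glob(f"/db/Systems/{namespace}/Partitions/*"))

def writable_partition(namespace: str, partition_key: str) -> str:
    """Pick one writable partition for merge output, by key hash."""
    targets = sorted(glob.glob(f"/db/Systems/{namespace}/WritablePartitions/*"))
    if not targets:
        raise RuntimeError(f"no writable partitions mounted for {namespace}")
    return targets[zlib.crc32(partition_key.encode()) % len(targets)]

print(read_partitions(NAMESPACE))
print(writable_partition(NAMESPACE, "2023-06-01"))
```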

The Deephaven application layer

The software installation procedure for each server is documented in the Deephaven installation guide. Once the servers are deployed, mount the network storage on each of them. Then manage the main Deephaven configuration and modify the dh_monit configuration so that each server runs only the processes for its server type.
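
As an illustration, the per-server-type process assignments can be captured in a simple map, using the process names listed later in this document. The "db_dis" entry name is an assumption, and the actual mechanism for enabling or disabling dh_monit entries depends on the installation:

```python
# Minimal sketch: the Deephaven processes each server type runs, using the
# process names listed later in this document. The "db_dis" entry name is
# an assumption, and the mechanism for enabling or disabling dh_monit
# entries depends on the installation.
SERVER_PROCESSES = {
    "administrative": [
        "authentication_server",
        "db_acl_write_server",
        "iris_controller",
    ],
    "realtime_import": ["db_dis", "tailer1"],
    "query": ["db_query_server"],
    "merge": ["db_merge_server"],
}

def processes_for(server_types: list[str]) -> list[str]:
    """Union of processes for a host that combines several server types."""
    procs: list[str] = []
    for server_type in server_types:
        procs.extend(SERVER_PROCESSES[server_type])
    return procs

# Example: an administrative server co-located with a real-time import server.
print(processes_for(["administrative", "realtime_import"]))
```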

Users and groups

Deephaven requires three Unix users and five Unix groups, which are created at install time by the installer. See Appendix B: Users and Groups in the installation guide for details.

Deephaven server types

The Server types section above describes four server types: real-time data import server, batch data import server, query server, and administrative server. Each server type is defined by the Deephaven processes it runs and the hardware resources those processes require. Given sufficient hardware resources, multiple server types can be combined on a single server to provide several sets of functionality. For instance, it is common to combine the administrative server with a Data Import Server.

Example server types and associated resources are outlined below.

Administrative server

Server Hardware and Operating System

  • 6 cores (2-core minimum) @ 3.0 GHz Intel
  • 10GB RAM
  • Linux - Red Hat 7 or derivative
  • Network: 1GbE
  • Storage:
    • Root - sufficient local disk space

Processes

  • MySQL - ACL database
  • authentication_server - Securely authenticates users on login
  • db_acl_write_server - Serves as a single writer for ACL information stored within the ACL database
  • iris_controller - Specialized client that manages persistent query lifecycles and provides discovery services to other clients for persistent queries.

Real-time data import server

Server Hardware and Operating System

  • 24 cores (6-core minimum) @ 3.0 GHz Intel
  • 256GB RAM (32GB minimum)
  • Linux - Red Hat 7 or derivative
  • Network: 10GbE, MTU 8192
  • Storage:
    • Root - sufficient local disk space
    • Database mount - 250GB SSD mirror local
    • Intraday data - SSD, either local or iSCSI
    • Historical data - NFS mounts for all historical data

Processes

  • DIS - receives binary table data from user processes and writes it to user namespaces, while simultaneously serving read requests for that data
  • tailer1 - reads data from binary log files written by Deephaven processes and sends it to a DIS process for ingestion. Tailer configuration is covered in Deephaven Data Tailer.

Batch import server

Server Hardware and Operating System

  • 24 cores (6-core minimum) @ 3.0 GHz Intel
  • 256GB RAM (32GB minimum)
  • Linux - Red Hat 7 or derivative
  • Network: 10GbE, MTU 8192
  • Storage:
    • Root - sufficient local disk space
    • Database mount - 250GB SSD mirror local
    • Intraday data - SSD, either local or iSCSI
    • Historical data - NFS mounts for all historical data

Processes

  • Batch import processes run via cron or manually to load intraday data from external data sources such as CSV files.
  • Validation and merge processes run via cron or manually to validate intraday data and add it to the historical database.
  • Scripts to delete intraday data after it has been successfully merged; the ordering of these steps is sketched below.
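
A minimal sketch of the ordering the scheduled jobs above must preserve: import, validate, merge, and only then delete intraday data. The command lines are placeholders; substitute the site's actual import, validation, and merge invocations:

```python
# Minimal sketch of the ordering the scheduled jobs above must preserve:
# import, validate, merge, and only then delete intraday data. The command
# lines are placeholders; substitute the site's actual invocations.
import subprocess
import sys

STEPS = [
    ["import_batch.sh", "--namespace", "MarketData"],   # hypothetical import
    ["validate_data.sh", "--namespace", "MarketData"],  # hypothetical validation
    ["merge_data.sh", "--namespace", "MarketData"],     # hypothetical merge
]
CLEANUP = ["delete_intraday.sh", "--namespace", "MarketData"]  # hypothetical

for step in STEPS:
    if subprocess.run(step).returncode != 0:
        # Leave the intraday data in place so the failed step can be rerun.
        sys.exit(f"step failed, intraday data retained: {step[0]}")

# Delete intraday data only after a successful validation and merge.
subprocess.run(CLEANUP, check=True)
```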

Query server

Server Hardware and Operating System

  • 24 cores (6-core minimum) @ 3.0 GHz Intel
  • 512GB RAM (32GB minimum)
  • Linux - Red Hat 7 or derivative
  • Network: 10GbE, MTU 8192
  • Storage:
    • Root - sufficient local disk space
    • Database mount - 250GB SSD mirror local
    • Historical data - 16 NFS mounts @ 35TB each

Processes

  • db_query_server or db_merge_server - manages query requests from clients and forks query worker processes to do the actual work of queries.