monit runbook
monit is a third-party process supervision tool used to manage Deephaven services on traditional installations and Podman deployments. It monitors processes, automatically restarts failed services, and provides a control interface for starting, stopping, and restarting Deephaven processes. monit is not used in Kubernetes deployments, where pod management is handled by Kubernetes itself.
Impact of monit failure
| Level | Impact |
|---|---|
| Sev 2 - Moderate | Process supervision and automatic restart capabilities are lost. Services continue running but do not recover automatically from failures. Manual intervention is required to start or restart processes. |
Note
monit failure does not stop running Deephaven processes. They continue functioning until manually stopped or until they fail, at which point they will not automatically restart.
monit dependencies
monit has no dependencies on other services. It is one of the first processes to start on a Deephaven host.
monit manages:
- All Deephaven Java processes (Configuration Server, Authentication Server, etc.).
- etcd (if installed as part of Deephaven).
- Other custom services defined in monit configuration.
Checking monit status
Check that the monit service is running:
The output should show active (running).
View summary of all monitored processes:
View detailed status:
Check specific process:
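The checks above could be run as in the following sketch. It assumes the systemd unit is named monit (as listed under Service control), that the monit binary is on the PATH, and that the monit process name matches the configuration file name (configuration_server). The function names are illustrative and are not part of monit or Deephaven; if your installation wraps monit in its own launcher script, substitute that for the direct monit call.

```shell
# Illustrative helpers; control commands run as irisadmin, as this runbook advises.
monit_cmd() { sudo -u irisadmin monit "$@"; }

# Service-level check: look for "active (running)" in the output.
monit_service_status() { systemctl --no-pager status monit; }

# One summary line per monitored process.
monit_summary() { monit_cmd summary; }

# Detailed status for every monitored process.
monit_status_all() { monit_cmd status; }

# Detailed status for a single process, named as in the monit configuration.
monit_status_one() { monit_cmd status "$1"; }
```

For example, `monit_status_one configuration_server` shows the detailed status of one process.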
Viewing monit logs
View monit log:
Tail the log to follow in real-time:
View systemd journal for monit:
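A minimal sketch of these three views, using the log path listed under Configuration files and locations and assuming the systemd unit is named monit. The function names are illustrative, and reading the log may require elevated privileges depending on file permissions.

```shell
MONIT_LOG=/var/log/deephaven/monit/monit.log   # log path given in this runbook

# Page through the whole log.
monit_log_view() { less "$MONIT_LOG"; }

# Show the last 50 lines, then follow new entries (Ctrl-C to stop).
monit_log_follow() { tail -n 50 -f "$MONIT_LOG"; }

# Last 100 journal entries recorded for the monit unit.
monit_journal() { journalctl -u monit --no-pager -n 100; }
```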
Restart procedure
Restart monit:
Caution
Restarting monit does NOT restart monitored processes. They continue running but are temporarily unmonitored during the restart.
Verify the restart was successful:
Check all monitored processes are still tracked:
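The restart and both verification steps could be combined as in this sketch. It assumes the systemd unit is named monit and that control commands run as irisadmin. The function name and the two-second pause are assumptions; the pause only gives the daemon time to accept control commands again.

```shell
restart_monit() {
    # Restart the supervisor only; monitored processes keep running.
    sudo systemctl restart monit || return 1

    # Brief pause before querying the daemon (duration is a guess).
    sleep 2

    # is-active exits non-zero unless the unit is running.
    if ! systemctl is-active --quiet monit; then
        echo "monit did not come back up" >&2
        return 1
    fi

    # Every monitored process should still be listed.
    sudo -u irisadmin monit summary
}
```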
Reload monit configuration
After modifying monit configuration files:
This reloads configuration without restarting monit or any monitored processes.
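A reload could look like the sketch below, which uses monit's reload action and then lists the monitored processes to confirm the new configuration was picked up. The function name and the short pause are assumptions.

```shell
reload_monit() {
    # Ask the running daemon to re-read its control files.
    sudo -u irisadmin monit reload || return 1

    # Give the daemon a moment to reinitialize (duration is a guess).
    sleep 2

    # Confirm the expected set of processes is being tracked.
    sudo -u irisadmin monit summary
}
```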
Managing processes with monit
Start a process
Stop a process
Restart a process
Start all processes
Stop monitoring a process (without stopping it)
Resume monitoring
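The operations above correspond to monit's start, stop, restart, unmonitor, and monitor actions. The dispatcher below is a sketch only: process_ctl is a hypothetical name, and it assumes the monit binary is on the PATH and that control commands run as irisadmin.

```shell
# Usage: process_ctl <action> [process-name]
process_ctl() {
    action=$1
    name=${2:-}
    case "$action" in
        start|stop|restart|monitor|unmonitor)
            # These actions need the monit name of a single process.
            if [ -z "$name" ]; then
                echo "usage: process_ctl $action <process-name>" >&2
                return 2
            fi
            sudo -u irisadmin monit "$action" "$name"
            ;;
        start-all)
            # Start every process that monit knows about.
            sudo -u irisadmin monit start all
            ;;
        *)
            echo "unknown action: $action" >&2
            return 2
            ;;
    esac
}
```

For example, `process_ctl unmonitor configuration_server` stops monitoring without stopping the process, and `process_ctl monitor configuration_server` resumes it.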
monit configuration
monit configuration consists of:
- Main configuration: /etc/monitrc (overridable via the DH_MONIT_RC environment variable).
- Deephaven-specific configuration: /etc/sysconfig/illumon.d/monit/
Main monit configuration
Key settings in the main configuration file:
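The fragment below is an illustration in monit's control file syntax, not a copy of the file shipped with Deephaven. The paths come from the locations listed in this runbook; the poll interval and the HTTP control port are assumed values, so check the actual file on your host.

```text
set daemon 30                                     # poll interval in seconds (assumed value)
set log /var/log/deephaven/monit/monit.log        # log file
set pidfile /var/run/monit.pid                    # PID file of the monit daemon
set statefile /var/lib/deephaven/monit/monit.state
set httpd port 2812 and                           # control interface used by the monit CLI (assumed port)
    use address localhost
    allow localhost
include /etc/sysconfig/illumon.d/monit/*.conf     # per-process configuration files
```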
Process configuration files
Each Deephaven process has its own configuration file:
Example: /etc/sysconfig/illumon.d/monit/configuration_server.conf
Key elements:
- pidfile — Location of process PID file.
- start program — Command to start process.
- stop program — Command to stop process.
- health checks — Port checks, resource limits.
- restart policy — When and how to restart.
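As an illustration of how these elements fit together, a process entry in monit's control file syntax might look like the following. Every path, the port number, and the thresholds are placeholders rather than values taken from a Deephaven installation.

```text
check process configuration_server with pidfile /path/to/configuration_server.pid
    start program = "/path/to/start_script configuration_server" as uid "irisadmin"
    stop program  = "/path/to/stop_script configuration_server" as uid "irisadmin"
    if failed port 12345 then restart                  # health check (placeholder port)
    if totalmem > 8 GB for 5 cycles then restart       # resource limit (placeholder threshold)
    if 5 restarts within 5 cycles then timeout         # restart policy: give up after repeated failures
```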
Process startup order
monit can enforce process startup order through dependencies:
Example dependency chain:
- etcd (no dependencies).
- Configuration Server (depends on etcd).
- Authentication Server (depends on Configuration Server).
- Other services (depend on Configuration Server, Authentication Server).
Configuration:
monit will not start a process until its dependencies are running.
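In monit's control file syntax, this ordering is declared with depends on. The sketch below expresses the first three links of the chain above; the process names and pidfile paths are hypothetical, and the start and stop program lines are omitted for brevity.

```text
check process etcd with pidfile /path/to/etcd.pid

check process configuration_server with pidfile /path/to/configuration_server.pid
    depends on etcd

check process authentication_server with pidfile /path/to/authentication_server.pid
    depends on configuration_server
```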
Disabling and enabling processes
Temporarily disable a service
Prevent monit from starting a process:
Permanently disable a service
Rename the configuration file so that it no longer matches *.conf:
Re-enable a service
Rename the configuration file back to its original name:
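For a temporary hold, monit's own stop action is usually enough, because monit does not restart a process that was stopped through it. For the permanent case, the rename could be scripted as in this sketch. The .disabled suffix is an assumption (any name that no longer matches *.conf would do), the function names are hypothetical, and my_service is a placeholder process name.

```shell
CONF_DIR=/etc/sysconfig/illumon.d/monit   # process configuration directory from this runbook

# Hide a process configuration from monit, then reload.
# Stop the process first if it should not keep running unmonitored.
disable_service() {
    if [ ! -f "$CONF_DIR/$1.conf" ]; then
        echo "no configuration found for $1" >&2
        return 1
    fi
    sudo mv "$CONF_DIR/$1.conf" "$CONF_DIR/$1.conf.disabled" &&
        sudo -u irisadmin monit reload
}

# Restore the configuration file name, then reload.
enable_service() {
    if [ ! -f "$CONF_DIR/$1.conf.disabled" ]; then
        echo "no disabled configuration found for $1" >&2
        return 1
    fi
    sudo mv "$CONF_DIR/$1.conf.disabled" "$CONF_DIR/$1.conf" &&
        sudo -u irisadmin monit reload
}
```

For example, `disable_service my_service` followed later by `enable_service my_service`.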
monit user and permissions
monit runs as root but executes process commands as specified users:
- monit daemon: runs as root (required for process supervision).
- Process management: commands run as the irisadmin user.
- Deephaven processes: run as various users (irisadmin, dbquery, dbmerge).
- Control commands: run these as irisadmin.
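One way to make sure control commands always run as irisadmin is a small wrapper like the sketch below. The function name is hypothetical, and it assumes sudo is configured to let the operator run monit as irisadmin.

```shell
# Run a monit control command as irisadmin, whoever the caller is.
monit_as_irisadmin() {
    if [ "$(id -un)" = "irisadmin" ]; then
        # Already the right user; call monit directly.
        monit "$@"
    else
        sudo -u irisadmin monit "$@"
    fi
}
```

For example, `monit_as_irisadmin summary`.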
Testing monit configuration
Before reloading or restarting:
Expected output: Control file syntax OK
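The syntax check that produces this output is monit's -t flag. The sketch below passes the control file explicitly with -c, falling back to /etc/monitrc when DH_MONIT_RC is unset. Whether a plain monit call honours DH_MONIT_RC on its own is not stated in this runbook, so the explicit -c is a precaution, and the function name is illustrative.

```shell
# Validate the control file (and everything it includes) without touching the daemon.
check_monit_config() {
    sudo -u irisadmin monit -c "${DH_MONIT_RC:-/etc/monitrc}" -t
}
```

Chaining the check with a reload, as in `check_monit_config && sudo -u irisadmin monit reload`, keeps a syntax error from reaching the running daemon.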
Monitoring monit itself
Ensure monit is always running:
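How this is achieved depends on the site. One possible approach, sketched below, is a small check that an external scheduler such as cron runs periodically; it assumes the systemd unit is named monit, and the function name is hypothetical.

```shell
# Start monit if systemd reports that it is not running.
ensure_monit_running() {
    if systemctl is-active --quiet monit; then
        return 0
    fi
    echo "monit is not active; starting it" >&2
    sudo systemctl start monit
}
```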
systemd management
Enable monit to start on boot:
Check if enabled:
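Both steps use standard systemctl subcommands. The sketch assumes the unit is named monit; the function names are illustrative.

```shell
# Register the unit so it starts at boot.
enable_monit_on_boot() { sudo systemctl enable monit; }

# Prints "enabled" when the unit is set to start at boot.
monit_boot_state() { systemctl is-enabled monit; }
```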
Configuration files and locations
- Service manager: systemd (unit name monit).
- Service control: systemctl {start|stop|restart|status} monit
- Main configuration: /etc/monitrc (overridable via the DH_MONIT_RC environment variable).
- Process configurations: /etc/sysconfig/illumon.d/monit/*.conf
- Log file: /var/log/deephaven/monit/monit.log
- State file: /var/lib/deephaven/monit/monit.state
- PID file: /var/run/monit.pid