Startup / Shutdown of Deephaven processes
All Deephaven processes in non-Kubernetes deployments are started and stopped with Monit. Monit is a utility for managing and monitoring processes, programs, files, directories and filesystems on a Unix system. (Refer to https://mmonit.com/monit for more information.)
dh_monit
Deephaven provides the dh_monit wrapper utility so that Deephaven administrators can interact with Monit without needing root privileges. Besides wrapping monit functionality, dh_monit also adds some Deephaven-specific options:
- up - ensures Monit is running and refreshed and starts all Deephaven services.
- down - ensures Monit is running and refreshed and stops all Deephaven services.
- -b/--block - waits until requested start/stop/restart/up/down operations have reported as complete before exiting.
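As a minimal sketch of how these options might be combined, the function below stops every Deephaven service and then starts them all again, waiting for each step to finish. It assumes that up and down are given as operation names and that --block may follow the operation name; bounce_all and the DH_MONIT variable are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Path to the wrapper, as used in the examples in this document.
DH_MONIT="${DH_MONIT:-/usr/illumon/latest/bin/dh_monit}"

# Stop every Deephaven service, then start them all again.
# Assumption: --block may follow the operation name; with it, each call
# returns only after the operation has been reported as complete.
bounce_all() {
  "$DH_MONIT" down --block && "$DH_MONIT" up --block
}
```

Calling bounce_all from an administrator shell would run the two operations back to back; without --block the second call could begin before the first has finished.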
dh_monit also introduces sequencing of start, stop, or restart operations when called with the all option.
For example, monit start all on an infrastructure node sends start commands to every service in quick succession, beginning with whichever service is first in the list. The order itself is reasonable: the root dependency, configuration_server, is listed first, and the next most common dependency, authentication_server, is listed second. However, the services take some time to start, so some of them fail to start because a dependency they need is not yet ready. The failed services are then retried, which makes the overall start all take longer than is desirable. dh_monit, on the other hand, starts the configuration_server and waits until it is running, then starts the authentication_server and waits again, and only then starts the remainder of the services.
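The sequencing idea can be illustrated with a short sketch. This is not how dh_monit is implemented; it only shows the "start, wait until running, then continue" pattern described above. The function names, the DH_MONIT and POLL_SECONDS variables, and the assumption that a healthy service's summary line contains "Running" or "OK" are all illustrative and may need adjusting for the Monit version in use.

```shell
#!/usr/bin/env bash
# Path to the wrapper, as used in the examples in this document.
DH_MONIT="${DH_MONIT:-/usr/illumon/latest/bin/dh_monit}"

# Succeeds if the summary line for service $1 reports it as up.
# Assumption: that line contains "Running" or "OK".
is_running() {
  "$DH_MONIT" summary | grep -F -- "$1" | grep -Eq 'Running|OK'
}

# Start service $1, then poll up to $2 times (default 60) until it is up.
start_and_wait() {
  local svc="$1"
  local tries="${2:-60}"
  "$DH_MONIT" start "$svc" || return 1
  while [ "$tries" -gt 0 ]; do
    is_running "$svc" && return 0
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-2}"
  done
  return 1
}

# Start the given services one at a time, in order, stopping at the
# first one that does not come up.
start_in_order() {
  local name
  for name in "$@"; do
    start_and_wait "$name" || return 1
  done
}
```

For instance, start_in_order configuration_server authentication_server would not issue the second start until the first service is reported as running.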
Monit service
The Monit service itself can be checked with the following command:
sudo systemctl status monit
If Monit is not running, it can be started with the following command:
sudo systemctl start monit
To ensure Monit starts up whenever the system restarts, use the following command:
sudo systemctl enable monit
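The checks above can be combined into one step. The following is a sketch only, assuming a systemd-based host where the unit is named monit, as in the commands above; ensure_monit is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Make sure the Monit service is running now and enabled at boot.
ensure_monit() {
  # is-active exits non-zero when the unit is not running.
  if ! systemctl is-active --quiet monit; then
    sudo systemctl start monit || return 1
  fi
  # is-enabled exits non-zero when the unit will not start at boot.
  if ! systemctl is-enabled --quiet monit; then
    sudo systemctl enable monit || return 1
  fi
}
```

Running ensure_monit changes nothing when Monit is already active and enabled, so it is safe to call repeatedly.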
All Monit configuration files for the Deephaven processes are located in:
/etc/sysconfig/illumon.d/monit
Deephaven services
If any Deephaven process terminates unexpectedly, Monit will restart the process automatically.
You can check which processes are running with the following Monit command:
/usr/illumon/latest/bin/dh_monit summary
To monitor the state of services, run:
watch /usr/illumon/latest/bin/dh_monit summary
Check the status of all processes with the following Monit command:
sudo monit status
Check the status of individual processes with the following Monit command:
/usr/illumon/latest/bin/dh_monit status <process name>
For example:
/usr/illumon/latest/bin/dh_monit status iris_controller
Starting and stopping Deephaven services
If a configuration file has been updated, the associated Deephaven processes will typically need to be restarted for the changes to take effect. One exception to this is the Deephaven Controller Process, which allows various properties to be edited without needing a restart.
When an updated configuration file requires a restart of the associated Deephaven processes, use the following commands.
To stop all the configured Deephaven processes:
/usr/illumon/latest/bin/dh_monit stop all
Alternatively, stop and start individual Deephaven processes:
/usr/illumon/latest/bin/dh_monit stop <process name>
/usr/illumon/latest/bin/dh_monit start <process name>
For example:
/usr/illumon/latest/bin/dh_monit stop authentication_server
/usr/illumon/latest/bin/dh_monit start authentication_server
Stale process ID files
Monit checks for processes using the IDs referenced in /etc/deephaven/run/*.pid. After a machine reboot, these files may be stale, meaning they contain the ID of a running process that is not the managed Deephaven process. To prevent stale PID files, you can remove them on reboot. This can be accomplished with a crontab file.
To add a crontab file as root, run:
sudo crontab -e
The following entry runs rm -f /etc/deephaven/run/*.pid on each reboot:
@reboot rm -f /etc/deephaven/run/*.pid
You can verify the contents of the crontab file with sudo crontab -l.
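To inspect PID files by hand before removing anything, a sketch along the following lines may help. It prints each PID file, the ID it contains, and the name of the process currently holding that ID, so an administrator can judge whether the file is stale. It assumes a Linux host with /proc mounted; report_pid_files is a hypothetical helper name, and the default directory is the one named above.

```shell
#!/usr/bin/env bash
# Print "<file> <pid> <command>" for each PID file in directory $1.
# Assumption: Linux, where /proc/<pid>/comm holds the command name.
report_pid_files() {
  local dir="${1:-/etc/deephaven/run}"
  local f pid
  for f in "$dir"/*.pid; do
    [ -e "$f" ] || continue   # the glob matched nothing
    pid="$(tr -d '[:space:]' < "$f")"
    case "$pid" in
      ''|*[!0-9]*)
        # Empty or non-numeric content cannot be a process ID.
        printf '%s\t%s\tinvalid\n' "$(basename "$f")" "$pid"
        continue
        ;;
    esac
    if [ -r "/proc/$pid/comm" ]; then
      printf '%s\t%s\t%s\n' "$(basename "$f")" "$pid" "$(cat "/proc/$pid/comm")"
    else
      printf '%s\t%s\tnot running\n' "$(basename "$f")" "$pid"
    fi
  done
}
```

A file whose ID maps to an unrelated command, or to no process at all, is a candidate for removal.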
Note