Log files

Some Deephaven processes record events, errors and other application messages in their process log files. These are plain text files that can be viewed or searched with grep, tail and other text processing utilities.

Log search term definitions

General

  • FATAL - Any trace logged at log level FATAL indicates that the process is beginning to shut down due to an error.
  • ERROR - The rate of ERROR level trace may be worth monitoring, but many traces logged at this level are not critical.
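As a sketch of how these levels might be monitored, the following counts entries by level with grep. The sample log lines are invented for illustration; in practice, point grep at a real file under the process log directory.

```shell
# Count FATAL and ERROR entries in a process log.
# The sample lines below are invented; in practice point grep at a real
# log file, e.g. one under /var/log/deephaven.
cat <<'EOF' > /tmp/sample_process.log
[2018-02-21T04:28:31.559+0000] INFO  - startup complete
[2018-02-21T04:28:32.100+0000] ERROR - transient connection failure
[2018-02-21T04:28:33.200+0000] FATAL - unrecoverable error, shutting down
EOF
fatal=$(grep -c 'FATAL' /tmp/sample_process.log)
errors=$(grep -c 'ERROR' /tmp/sample_process.log)
echo "FATAL=$fatal ERROR=$errors"
```

A nonzero FATAL count warrants immediate attention; for ERROR, a rising rate over time is usually the more useful signal.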

Lifecycle events

  • <process name> INITIALIZING - The process has begun its initialization.
  • <process name> RUNNING - Initialization has completed, and the process is available.
  • <process name> ALIVE - The process continues to be available, logged on a configurable cycle.
  • <process name> SHUTTING_DOWN - The process is beginning to shut down.
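A minimal sketch for checking the most recent lifecycle state of a process, assuming the state names appear in the log as listed above (the sample log content is invented):

```shell
# Report the most recent lifecycle event found in a log file.
# The sample content below is invented for illustration.
cat <<'EOF' > /tmp/lifecycle.log
AuthenticationServer INITIALIZING
AuthenticationServer RUNNING
AuthenticationServer ALIVE
EOF
state=$(grep -oE 'INITIALIZING|RUNNING|ALIVE|SHUTTING_DOWN' /tmp/lifecycle.log | tail -n 1)
echo "last state: $state"
```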

Process specific

For db_dis:

  • Rejecting and closing channel - This should occur seldom, if at all. This message generally points to a configuration issue to which DBAs should be alerted.
  • DataImportStreamProcessor-.+WARN (regex) - A high rate of warning traces from this component (other than during tailer restarts) may point to a system issue.
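Both db_dis patterns above can be counted with grep; a sketch with invented sample log content:

```shell
# Count occurrences of the two db_dis patterns described above.
# The sample log content is invented for illustration.
cat <<'EOF' > /tmp/dis.log
DataImportStreamProcessor-Table1 WARN unexpected gap in stream
DataImportStreamProcessor-Table1 INFO processed block
Rejecting and closing channel
EOF
warns=$(grep -cE 'DataImportStreamProcessor-.+WARN' /tmp/dis.log)
rejects=$(grep -c 'Rejecting and closing channel' /tmp/dis.log)
echo "warns=$warns rejects=$rejects"
```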

Startup script log files

These capture the stdout and stderr streams from Deephaven production processes.

This file appears in the process’s log directory, with name format process_name.log.yyyy-MM-dd, where the date is the process startup date.

This file is not rotated. Be careful with automated log cleanup for long-running processes to ensure that this file is not deleted while still in use.
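One way to honor this caveat in a cleanup job is to exclude today's file by name. A minimal sketch (the process name and directory are illustrative, and a real job would also filter on age with an option such as -mtime):

```shell
# Delete old startup logs while keeping today's (possibly still-open) file.
# Names and paths are illustrative; a real cleanup job would also filter
# on age, e.g. with find's -mtime option.
dir=$(mktemp -d)
touch "$dir/myproc.log.2018-01-01"
touch "$dir/myproc.log.$(date +%Y-%m-%d)"
find "$dir" -name 'myproc.log.*' ! -name "myproc.log.$(date +%Y-%m-%d)" -delete
ls "$dir"
```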

Application log files

By default, application log files for Deephaven processes are written to the following locations:

/var/log/deephaven/<Application Log Directory>/<Log Prefix>.log.current

where the Application Log Directory and Log Prefix are different for each application.

For example, to view the current log file for the Authentication Server:

cat /var/log/deephaven/authentication_server/AuthenticationServer.log.current

These log locations are defined by properties in the Deephaven property files and can be changed. The root logging directory is defined by the logroot property, which has the following default value:

logroot=/var/log/deephaven

Each process can define its own log directory with a property of the form logDir.<process name>=<log directory>; the <logroot> token in the value substitutes the logroot definition as a starting directory. For example, the following property defines the authentication server's log directory:

logDir.authentication_server=<logroot>/authentication_server

The following property defines a default log directory for process names which don't have their own definitions:

defaultLogDir=<logroot>/misc

The location can be changed by passing the logDir parameter into the JVM when a process is started. For example, passing the following parameter to a process changes the application log file location to the defined value:

-DlogDir=/db/TempFiles/logs

A new log file is created every 15 minutes and the .current file is a hard link to the current log file. The initial log file will have a timestamp matching the time the process started and subsequent log files will have the 15-minute timestamp interval.

The actual log file name has a timestamp suffix. For example:

AuthenticationServer.log.2018-01-30-170000.000+0000

For example, the following files were created over a one-hour period:

AuthenticationServer.log.2018-01-30-170000.000+0000
AuthenticationServer.log.2018-01-30-171500.000+0000
AuthenticationServer.log.2018-01-30-173000.000+0000
AuthenticationServer.log.2018-01-30-174500.000+0000
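The .current file described above shares an inode with the newest timestamped file, which can be verified with the shell's -ef file test. A minimal sketch in a temporary directory (file names are copied from the examples above):

```shell
# Demonstrate that a .current hard link shares its inode with the
# timestamped log file it points at. Paths are illustrative.
dir=$(mktemp -d)
echo "log line" > "$dir/AuthenticationServer.log.2018-01-30-170000.000+0000"
ln "$dir/AuthenticationServer.log.2018-01-30-170000.000+0000" "$dir/AuthenticationServer.log.current"
ls -i "$dir"   # both names list the same inode number
```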

To follow (tail) the current log file in a terminal, use the tail command with the -F flag, so that it retries when a new file is created at the 15-minute interval. For example:

cd /var/log/deephaven/authentication_server
tail -F AuthenticationServer.log.current

Query Worker log samples

Deephaven workers do not produce the plain text log files described above. Worker logs are sent to the Process Event Log table directly or via the RemoteQueryDispatcher, depending on the type of log event.

This has implications for monitoring system processes running as workers. Such monitoring must either be done by querying the process event log table, or by consuming binary logs that are written by the LogAggregatorService as an intermediate step.

By default, Query Workers are configured to write logs via the LogAggregationService and DataImportServer to the DbInternal/ProcessEventLog table in Deephaven.

To enable Query Workers to write their logs to plain text files, the following properties can be set in /etc/sysconfig/illumon.d/resources/iris-common.prop:

RemoteQueryDispatcher.writeDatabaseProcessLogs=false
RemoteQueryProcessor.sendLogsToSystemOut=true

The following sample Query Worker log entries illustrate what might be monitored here. Note that, in general, monitoring worker logs is not advisable, because much of the content may result from user code in the query itself. However, monitoring for the expected output of well-defined queries does make sense.

[2018-02-20T23:04:19.476898-0500] - WORKER_1 - ConnectionMonitor: registering monitored connection job:1744982467/AuthServer_Client/10.128.0.38:42466->10.128.0.38:9030/CommandConnection
[2018-02-20T23:04:19.484982-0500] - WORKER_1 - Successful authentication worker8.internal/10.128.0.38 using delegate token {-1691233174684810865, service: DelegatedAuthentication, origin: worker8.internal/10.128.0.38, tokenContext: {iris}}
[2018-02-20T23:04:19.486329-0500] - WORKER_1 - Serializing result of type java.lang.Boolean
[2018-02-20T23:04:19.486838-0500] - WORKER_1 - Sending serialized result, length=47 bytes
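Monitoring for expected output from a well-defined query can be as simple as checking for a known marker line. A sketch using the sample entries above (the marker chosen here is illustrative):

```shell
# Verify that a worker log contains an expected marker line.
# The log content is copied from the samples above; the specific marker
# to check for is an illustrative choice.
cat <<'EOF' > /tmp/worker.log
[2018-02-20T23:04:19.486329-0500] - WORKER_1 - Serializing result of type java.lang.Boolean
[2018-02-20T23:04:19.486838-0500] - WORKER_1 - Sending serialized result, length=47 bytes
EOF
if grep -q 'Sending serialized result' /tmp/worker.log; then
  status="ok"
else
  status="missing"
fi
echo "result marker: $status"
```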

Java Garbage Collector logs

Deephaven Java processes are configured to record JVM Garbage Collection (GC) statistics. For query workers, the GC statistics are recorded to the Process Event Logs. Other processes will record GC statistics to the logs directory. For example, the Data Import Server's GC log can be found at:

/var/log/deephaven/dis/db_dis.log.gc_detailed.<datetime>

where <datetime> is: <YYYY-MM-DD_HH-MM-SS>.

For example:

/var/log/deephaven/dis/db_dis.log.gc_detailed.2018-02-01_13-00-42
/var/log/deephaven/dis/db_dis.log.gc_detailed.2018-02-01_20-10-06
/var/log/deephaven/dis/db_dis.log.gc_detailed.2018-02-05_10-23-15
/var/log/deephaven/dis/db_dis.log.gc_detailed.2018-02-06_18-38-27

The GC logs can be very useful for understanding JVM memory usage and diagnosing memory problems.

GC log patterns

With default Deephaven JVM start-up parameters, there are four GC log events that should be monitored:

  1. Parallel young generation collections that occur very frequently (more often than once per 10 seconds), or with significant real-time duration (longer than 0.2 seconds):
[GC (Allocation Failure) 2018-02-21T04:28:31.559+0000: 14.240: [ParNew: 27679K->16000K(144384K), 0.0226034 secs]2018-02-21T04:28:31.582+0000: 14.263: [CMS: 143620K->141747K(320896K), 0.4811685 secs] 168566K->141747K(465280K), [Metaspace: 36101K->36101K(1083392K)], 0.5040269 secs] [Times: user=0.55 sys=0.01, real=0.50 secs]
  2. Full GCs, which usually result from failures of the concurrent garbage collector or other near out-of-memory conditions:
2018-02-21T04:31:33.822+0000: 196.503: [Full GC (System.gc()) 2018-02-21T04:31:33.822+0000: 196.503: [CMS: 532358K->532883K(707840K), 0.6859044 secs] 566077K->532883K(852352K), [Metaspace: 36261K->36261K(1083392K)], 0.6864226 secs] [Times: user=0.70 sys=0.02, real=0.68 secs]

Note that System.gc() in the sample above indicates that the collection was most likely invoked deliberately by a user, although that is not guaranteed, since Java's built-in libraries sometimes use the same mechanism.

  3. Concurrent mode failures in the old generation collector:
(concurrent mode failure): 658018K->180450K(672472K), 0.5520093 secs] 786112K->180450K(816856K), [Metaspace: 36011K->36011K(1083392K)], 0.5647268 secs] [Times: user=0.69 sys=0.03, real=0.57 secs]
  4. It may also be worth monitoring the frequency of old generation (CMS) collections to detect cases where a process is using more CPU than needed because it needs a larger heap:
2018-02-21T04:36:27.709+0000: 76.357: [GC (CMS Initial Mark) [1 CMS-initial-mark: 219607K(601812K)] 258670K(872660K), 0.0010968 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

You should be notified if any non-worker process initiates a CMS collection more often than once per minute.
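A sketch of such a check: bucket CMS Initial Mark events by minute using the leading timestamp, and flag any minute with more than one occurrence. The sample lines are shortened versions of the GC log format shown above.

```shell
# Flag minutes containing more than one CMS Initial Mark event.
# Sample lines are shortened versions of the GC log format above;
# the leading 16 characters of each line give the minute bucket.
cat <<'EOF' > /tmp/gc.log
2018-02-21T04:36:27.709+0000: 76.357: [GC (CMS Initial Mark) ...]
2018-02-21T04:36:58.100+0000: 106.748: [GC (CMS Initial Mark) ...]
2018-02-21T04:37:40.500+0000: 149.148: [GC (CMS Initial Mark) ...]
EOF
alerts=$(grep 'CMS Initial Mark' /tmp/gc.log | cut -c1-16 | sort | uniq -c \
  | awk '$1 > 1 {print $2}')
echo "minutes over threshold: $alerts"
```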

Stats Driver logs

If the Stats Driver property is enabled for a given Deephaven process, various Deephaven process statistics are recorded in the logs directory.

For example, the Data Import Server's stats log can be found at:

/var/log/deephaven/dis/db_dis.log.stats.current

The <file>.stats.current file is a hard link to the current stats file. A new stats file is created every hour.

These stats files are intended for debugging, used together with the following utility script:

/usr/illumon/latest/bin/stats

  • Allows users to select individual columns of the stat logs and pipe them to stdout.
  • Allows users to order and filter the output columns of the stats logs to meet their needs.
  • Creates the output in CSV format to be used in monitoring applications.

The Stats Driver property is enabled by default for Deephaven processes. To turn off statistics, disable the property in the process configuration file as follows:

statsdriver.enabled=false

The application statistics log contains entries of the performance metrics kept for each process. These are recorded on disk in CSV format for easy import into a multitude of external performance monitoring tools. As noted under the Deephaven Performance section, the statistics are also available in the DbInternal database.

The entries are in CSV format and have the following headers: Stat,IntervalName,NowSec,NowString,AppNowSec,AppNowString,TypeTag,Name,N,Sum,Last,Min,Max,Avg,Sum2,Stdev
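Because the format is plain CSV, individual columns can also be pulled out with standard tools. A sketch extracting the Name (column 8) and Avg (column 14) fields; the data row below is invented for illustration:

```shell
# Extract the Name (field 8) and Avg (field 14) columns from a stats CSV.
# The header matches the documented format; the data row is invented.
cat <<'EOF' > /tmp/stats.csv
Stat,IntervalName,NowSec,NowString,AppNowSec,AppNowString,TypeTag,Name,N,Sum,Last,Min,Max,Avg,Sum2,Stdev
Stat,FifteenSecond,1519266000,2018-02-21T20:00:00,1519266000,2018-02-21T20:00:00,C,Memory-Heap.Used,3,300,120,80,120,100,30200,20
EOF
line=$(awk -F, 'NR > 1 {print $8, $14}' /tmp/stats.csv)
echo "$line"
```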

This data should be imported into a queryable monitoring tool so that alarms can be written against it.

Binary log files

Several Deephaven processes write binary log files, which contain data that will be written to tables in the DbInternal namespace. The locations of these log files are driven by properties similar to the application log files, including the logroot property.

When writing a binary log file, the application determines the location with a three-tiered search on the namespace and table name; the most specific property found is used as the location:

binaryLogDir.<namespace>.<table name>
binaryLogDir.<namespace>
binaryLogDir

If no location is defined for the table or namespace, the location defined by the binaryLogDir property will be used.

The default location for binary logs uses the same log root as the application logs (/var/log/deephaven), but defines a different subdirectory:

binaryLogDir=<logroot>/binlogs

Process event logs (see below) have their own subdirectory:

binaryLogDir.DbInternal.ProcessEventLog=<logroot>/binlogs/pel

Performance logs use a different subdirectory:

binaryLogDir.DbInternal.QueryOperationPerformanceLog=<logroot>/binlogs/perflogs
binaryLogDir.DbInternal.QueryPerformanceLog=<logroot>/binlogs/perflogs
binaryLogDir.DbInternal.UpdatePerformanceLog=<logroot>/binlogs/perflogs
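The three-tiered precedence above can be sketched as a lookup that tries the most specific key first. The helper function and flat property file below are illustrative, not how the server implements the search:

```shell
# Sketch of the three-tiered binaryLogDir lookup: most specific key wins.
# The property file and helper function are illustrative.
cat <<'EOF' > /tmp/binlog.props
binaryLogDir=/var/log/deephaven/binlogs
binaryLogDir.DbInternal.ProcessEventLog=/var/log/deephaven/binlogs/pel
EOF
lookup() {  # usage: lookup <namespace> <table name>
  for key in "binaryLogDir.$1.$2" "binaryLogDir.$1" "binaryLogDir"; do
    val=$(grep "^$key=" /tmp/binlog.props | head -n 1 | cut -d= -f2)
    if [ -n "$val" ]; then echo "$val"; return; fi
  done
}
pel=$(lookup DbInternal ProcessEventLog)   # matches the most specific key
misc=$(lookup DbInternal AuditEventLog)    # falls back to binaryLogDir
echo "$pel"
echo "$misc"
```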

Viewing Binary Log Files with iriscat

Because binary log files on disk are not plain text, an alternative way of viewing their records is to use the iriscat utility. This can come in handy when system errors prevent you from querying the database using the Deephaven Console GUI client.

To view binary files using iriscat, first make sure that JAVA_HOME is set.

export JAVA_HOME=/usr/java/latest

Once JAVA_HOME is set, you can run iriscat using this command:

/usr/illumon/latest/bin/iriscat <path to binary file>

For example:

/usr/illumon/latest/bin/iriscat /var/log/deephaven/binlogs/test.file.bin.2018-02-02.141436.113-0800

This will display the contents of the file in CSV format. To view the contents as formatted log text instead, use the -l flag. For example:

/usr/illumon/latest/bin/iriscat -l <path to binary file>

Here is the full list of options for the iriscat utility:

  • -C <columns> - Names of columns to display, comma-separated.
  • -d - Show debugging information (row flags, column metadata).
  • -e <endRow> - Index of the last row displayed.
  • -F <format> - Format date time columns, comma-separated (e.g., "Column=HH:mm:ss.SSS", or "Column" for ISO8601 microsecond).
  • -l - Format output as if it were a log file (useful for ProcessEventLog).
  • -s <startRow> - Index of the first row displayed.
  • -T <timeZone> - Time zone for time columns (e.g., DST or CST).

Deephaven Database event logs

Some Deephaven components can be configured to log their events to the Deephaven database tables in place of text files. This dramatically improves the ability to search through logs and find related events, especially for query workers where a user would otherwise need access to the server running the worker to view its log. The full power of the Deephaven query language is available to view these table-stored event logs, instead of only the basic text processing utilities like grep, awk, sed, etc.

For example, if you want to see the logs for worker_9, you can use the following query to view them. (Note: You will need appropriate permissions to do so.)

t=db.i("DbInternal","ProcessEventLog").where("Date=currentDateNy()","Process=`worker_9`")

Process Event logs

Table Namespace: DbInternal
Table Name: ProcessEventLog
Description: Records log entries for configured Deephaven processes. Rows include the classic log entry, log level, date, time, etc. Non-privileged users can only see records for which the EffectiveUser matches their username.

Sample query

// Show all events for today
today = new Date().format("yyyy-MM-dd")
t=db.i("DbInternal", "ProcessEventLog").where("Date=`" + today + "`")

// Show all ERROR Level events for today
t=db.i("DbInternal", "ProcessEventLog").where("Date=`" + currentDateNy() + "`").where("Level=`ERROR`")

Note

See also: ProcessEventLog

Audit Event logs

Table Namespace: DbInternal
Table Name: AuditEventLog
Description: Captures all audit events, such as user login attempts, process startup events, etc. Non-privileged users can only see records for which the EffectiveUser matches their user name.

Sample query

// Show all Audit events for today
today = new Date().format("yyyy-MM-dd")
t=db.i("DbInternal",  "AuditEventLog").where("Date=`" + today + "`")

Note

See also: AuditEventLog

Log file maintenance

Customers should create a job to archive these logs based on their own internal retention policies. Deephaven recommends retaining at least a few weeks of logs in case troubleshooting is needed.