---
title: File handle management
---

Deephaven reads table data directly from the filesystem without a centralized database process managing file access. This design provides performance and scalability benefits, but requires understanding how Deephaven interacts with files.

This page covers two related topics:

- **File path identity assumptions** — Constraints on external file operations while Deephaven references those files.
- **TrackedFileHandleFactory** — How Deephaven manages open file descriptors to stay within OS limits.

## File path identity assumptions

Deephaven assumes that a file path always refers to the same physical file for as long as Deephaven is referencing that file. On Linux, this means the device ID and inode must remain constant — the path must continue to point to the same underlying file.

Operations that change the underlying inode or remove the file are not safe while Deephaven is referencing the file (see [TrackedFileHandleFactory](#trackedfilehandlefactory) below).

| Disallowed operation           | Reason                                   |
| ------------------------------ | ---------------------------------------- |
| Delete file                    | Path no longer resolves to any file      |
| Replace file (delete + create) | Creates new inode; violates identity     |
| Rename/move file               | Path no longer resolves to expected file |

### Consequences of violations

If file identity assumptions are violated:

- **Read errors**: Deephaven may attempt to read from a file that no longer contains expected data, resulting in I/O errors or exceptions.
- **Write errors**: Deephaven may attempt to write to a file that has been moved or replaced, resulting in data loss or corruption.
- **Silently incorrect data**: In the worst case, the replacement file may have a compatible structure but different content, causing Deephaven to return wrong results without any error.

> [!CAUTION]
> Deephaven performs best-effort detection of file identity violations using `java.nio.file.attribute.BasicFileAttributes#fileKey`, but support depends on the OS and filesystem. Do not rely on this detection — always ensure files remain stable while Deephaven processes reference them.

## `TrackedFileHandleFactory`

To avoid exhausting the operating system's per-process file descriptor limit (`ulimit -n`), Deephaven uses `TrackedFileHandleFactory`, a least-recently-opened cache for file handles. When capacity is reached, the factory closes the oldest-opened handles, regardless of whether those handles are still in use.

### How it works

1. When Deephaven opens a file, `TrackedFileHandleFactory` creates a tracked file handle and adds it to a queue ordered by open time.
2. When the number of open handles reaches capacity, cleanup is triggered synchronously before the new handle is created.
3. Cleanup first removes handles that were already closed or garbage collected, then reclaims least-recently-opened handles until usage drops below the target threshold (90% of capacity by default) — even if those handles are still strongly referenced.
4. A background cleanup job runs periodically (every 60 seconds by default) and performs the same cleanup, enforcing the target threshold regardless of whether handles are still in use.
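
The steps above can be modeled with a small Python sketch. It is a simplified illustration, not Deephaven's implementation: the class names `TrackedHandle` and `LeastRecentlyOpenedTracker` are hypothetical, the periodic background job from step 4 is omitted, and the sketch assumes cleanup stops once usage is at or below the target.

```python
from collections import deque


class TrackedHandle:
    """Hypothetical stand-in for a tracked file handle."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class LeastRecentlyOpenedTracker:
    """Toy model of a least-recently-opened handle cache."""

    def __init__(self, capacity, target_ratio=0.9):
        self.capacity = capacity
        # Assumed interpretation of the target threshold: a handle count.
        self.target = int(capacity * target_ratio)
        self.queue = deque()  # oldest-opened handle sits on the left

    def open(self, name):
        # Step 2: cleanup runs synchronously before the new handle is created.
        if len(self.queue) >= self.capacity:
            self.cleanup()
        handle = TrackedHandle(name)
        self.queue.append(handle)  # Step 1: queue is ordered by open time
        return handle

    def cleanup(self):
        # Step 3a: forget handles that the caller already closed.
        self.queue = deque(h for h in self.queue if not h.closed)
        # Step 3b: reclaim the oldest-opened handles, in use or not.
        while len(self.queue) > self.target:
            self.queue.popleft().close()

    def tracked_count(self):
        return len(self.queue)
```

With `capacity=10`, opening an eleventh handle closes only the first-opened one, because dropping to the target of 9 frees enough room. If the caller had already closed one of the ten, cleanup removes that entry and reclaims nothing.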

> [!IMPORTANT]
> File handles may be closed asynchronously by the factory at any time, even while strongly referenced and in use. Code that holds a `FileHandle` must tolerate the underlying file channel being closed unexpectedly and handle re-opening if necessary.
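
The tolerate-and-reopen pattern can be sketched in Python as follows. This is an analogy rather than Deephaven's `FileHandle` API: the `ReopeningReader` class is hypothetical, and it relies on the fact that Python raises `ValueError` when a closed file object is used.

```python
class ReopeningReader:
    """Hypothetical reader that reopens its file if the handle was closed."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "rb")

    def read_at(self, offset, length):
        """Read length bytes at offset, reopening once if the handle is closed."""
        for attempt in range(2):
            try:
                self._file.seek(offset)
                return self._file.read(length)
            except ValueError:
                # Python signals I/O on a closed file with ValueError.
                if attempt == 1:
                    raise
                self._file = open(self.path, "rb")

    def close(self):
        self._file.close()
```

Reopening by path is only safe if the path still refers to the same file, which is why the file identity assumptions earlier on this page matter.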

### Configuration

Configure `TrackedFileHandleFactory` using Deephaven properties:

| Property                                             | Type    | Default     | Description                                                                                                                                              |
| ---------------------------------------------------- | ------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `TrackedFileHandleFactory.maxOpenFiles`              | Integer | (see below) | Maximum number of file handles to keep open simultaneously. Set this below your system's `ulimit -n` value to leave headroom for other file descriptors. |
| `TrackedFileHandleFactory.fastCyclingThreshold`      | Double  | 0.20        | Fraction of capacity that, if reclaimed within the cycling interval, triggers a warning log. Indicates the system may be under-provisioned.              |
| `TrackedFileHandleFactory.fastCyclingIntervalMillis` | Long    | 60000       | Interval (in milliseconds) for detecting fast handle cycling.                                                                                            |

**Default `maxOpenFiles` by process type:**

| Process type             | Default |
| ------------------------ | ------- |
| Most processes           | 4096    |
| Core+ workers            | 1024    |
| Data Import Server (DIS) | 512     |

### Sizing recommendations

The `maxOpenFiles` setting depends on your workload and system limits:

- **Check system limits**: Run `ulimit -n` to see your per-process file descriptor limit.
- **Leave headroom**: Set `maxOpenFiles` to 70-80% of your `ulimit -n` value to reserve descriptors for network connections, logging, and other system needs.
- **Monitor cycling log entries**: If you see frequent "reclaimed N file handles" warnings in logs, consider increasing `maxOpenFiles` or your system's `ulimit`.
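
The headroom rule above can be turned into a quick calculation. The Python sketch below is a convenience, not a Deephaven tool: the function names are hypothetical, the 0.75 default is just the midpoint of the 70-80% range, and `resource` is available only on Unix-like systems.

```python
def suggest_max_open_files(soft_limit, headroom_fraction=0.75):
    """Return a maxOpenFiles candidate that leaves headroom under soft_limit."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be a positive descriptor count")
    return int(soft_limit * headroom_fraction)


def current_soft_limit():
    """Return this process's descriptor soft limit, or None if unlimited."""
    import resource  # Unix only; imported lazily so the module loads elsewhere

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft
```

For a soft limit of 1024, `suggest_max_open_files(1024)` returns 768. Run the check as the same user and in the same environment as the Deephaven process, since limits can differ between shells and services.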

**Example configuration:**

```properties
TrackedFileHandleFactory.maxOpenFiles=8000
```

### Limitations

`TrackedFileHandleFactory` provides limited protection against file identity violations:

- It tracks file _handles_, not file _identity_. If a file is replaced while a handle is open, the handle continues to reference the old (now-deleted) file until the handle is closed or reclaimed.
- Detection of file identity changes relies on `BasicFileAttributes#fileKey`, which is OS and filesystem dependent — it may not always work.
- Errors from deleted files may surface much later than the deletion. Deephaven won't detect the issue until the handle goes through a reclaim/reopen cycle.
- Relying on the factory to "protect" against file replacement is not safe — always follow the [file identity guidelines](#file-path-identity-assumptions).

> [!WARNING]
> `TrackedFileHandleFactory` is a resource management mechanism, not a safety mechanism. It helps avoid file descriptor exhaustion but does not protect against file replacement or deletion issues.

## Best practices

1. **Never delete or replace data files that running Deephaven processes might reference.** If you must remove data, stop the relevant queries or processes first.

2. **Use the merge process for data lifecycle management.** The [merge process](../../data-guide/merging.md) safely transitions intraday data to historical format with proper coordination.

3. **Size `maxOpenFiles` appropriately.** Setting it too low causes excessive handle cycling, which degrades performance; setting it too high risks exhausting the process's file descriptor limit.

4. **Monitor handle cycling.** Watch for "reclaimed file handles" warnings in logs, which indicate that handles are being reclaimed frequently.

5. **Coordinate external file operations.** If external tools write to or manage files that Deephaven reads, ensure they follow append-only patterns or coordinate with Deephaven's data lifecycle.

## Related documentation

- [Filesystem data layout](./table-storage-filesystem.md)
- [Table storage overview](./table-storage-overview.md)
- [Merging](../../data-guide/merging.md)
- [Data lifecycle](../architecture/data-lifecycle.md)
