Table storage
Data categories
At the highest level, Deephaven divides persistent data according to two criteria:
- Namespace type (System or User)
- Availability type (Intraday or Historical)
Namespace type
Namespace type is used to categorize data broadly by its purpose and importance.
System namespaces
System namespaces are those that are subject to a structured administrative process for updating schemas as well as for importing, merging, and validating data. Their schemas are defined via Deephaven schema files, which are often kept in version control systems for collaboration and revision history tracking, and updated on a business-appropriate schedule by administrative users. Queries cannot directly modify tables in these namespaces via the Database APIs, and are typically run by a user without the necessary filesystem permissions to modify the underlying files.
Any data that is important for business processes or used by many individuals should be in a system namespace.
User namespaces
User namespaces are directly managed by unprivileged users via the Database APIs. They typically do not have external schema files, and are usually exempted from other administrative processes.
User namespaces are typically used to persist intermediate query results or test out research ideas. It is easy and often appropriate to migrate data from a user namespace to a system namespace if it becomes more important than originally conceived.
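For example, a minimal sketch of persisting a derived result as a user table from a Legacy Groovy console; db.addTable is assumed here as the relevant convenience method, and the namespace, table, and column names are illustrative:
// Illustrative sketch - assumes the Legacy console's db.addTable(namespace, tableName, table)
// method for writing splayed user tables; the names below are made up.
summary = db.t("LearnDeephaven", "StockQuotes")
        .where("Date=`2017-08-25`")
        .countBy("QuoteCount", "Sym")

// Persist the intermediate result into a user namespace so later sessions can reuse it.
db.addTable("MyResearch", "QuoteCounts20170825", summary)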
Availability type
Availability type is used to categorize data by its timeliness and stability. Note that the intraday/historical distinction is somewhat less meaningful for user namespaces, though not entirely irrelevant.
Intraday data
Intraday data for system namespaces is internally partitioned (usually by source) before applying column partitioning, and stored within each partition in the order in which it was received. It may be published in batches, for example, via offline import, or it may be appended in near real-time as data becomes available.
User namespaces do not necessarily follow the same convention for internal partition naming.
Historical data
Historical data for system namespaces may be stored either in the standard Deephaven partitioned layout or in a more flexible Extended layout. Note that Extended layouts are only supported by Core+ workers, but they allow the use of more than one partitioning column.
When the merge process is used to convert Intraday tables into historical tables, data is partitioned according to storage load-balancing criteria (as documented in Tables and schemas) before applying column partitioning. It is often re-ordered during the merge by applying sorting or grouping rules, but relative intra-source order is preserved otherwise. Validation processes take place after the merge to ensure that the newly available historical data matches the intraday data it was derived from and meets other domain-specific invariants.
Table layouts and data partitioning
Deephaven currently supports the following table layouts:
- Splayed tables are simple column-oriented data stores without any partitioning, meaning they are stored in one location directory. They are typically used for storing query result sets as tables in user namespaces.
- Partitioned tables are hierarchical stores of many splayed tables, with each splayed table representing a single partition of the data. Partitions are automatically combined into a single table for presentation purposes by the Deephaven query engine, although this step is deferred for optimization purposes by well-crafted queries.
- Extended layouts are more complex structures, supported only by Core+ workers, for Parquet-formatted and Deephaven-formatted Historical tables.
In practice, all system namespace tables are partitioned; user namespace tables may also use this layout.
The partitioning scheme currently used by Deephaven is referred to as nested partitioning (see the example below for reference), and is implemented by using a different directory for each partition. This permits two levels of hierarchy, which function as follows:
- The top level of the hierarchy (e.g., IntradayPartition1, IntradayPartition2, or PartitionName1, PartitionName2 in the examples) is hidden from user queries, but serves multiple purposes.
- For historical system data, this partitioning allows storage boundaries to be introduced for load balancing purposes. Also, queries may leverage grouping metadata in individual partitions to defer indexing work in some cases.
- For intraday system data, this partitioning keeps data from distinct sources in different locations. For example, if two different processes write data into the same table, they must write into different top-level intraday partitions to keep the data separated.
- For user data, this partitioning is optionally used to allow parallel publication or to achieve other user goals.
- The bottom level of the hierarchy (e.g., Date1, Date2 in the example) is visible to user queries as a partitioning column, typically with a descriptive name like Date. This efficiently limits the amount of data that must be read by allowing the query engine to prune the tree of partitions (directories) that need not be considered. Users must place filters using only the partitioning column ahead of other filters in a where clause to take advantage of these benefits.
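For example, a sketch against the LearnDeephaven example table referenced later in this document (the Sym column and its values are assumed for illustration):
// Put the partitioning-column filter first so the engine can prune Date directories
// before reading any column data; the Sym filter then only touches surviving locations.
quotes = db.t("LearnDeephaven", "StockQuotes")
        .where("Date=`2017-08-25`")   // partitioning column filter - prunes locations
        .where("Sym=`AAPL`")          // ordinary column filter - applied afterwards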
Filesystem data layout
Design goals
In specifying the layout for Deephaven data and metadata, two paramount goals are emphasized:
- Decentralization - Deephaven avoids the need for a centralized server process that uniquely administers access to the database files.
- Ease of administration - Deephaven databases, namespaces, and tables are laid out in such a way that:
- It is easy for administrators to introduce filesystem boundaries for provisioning purposes, e.g., between user and system tables, or for the various storage partitions of a given system namespace's tables.
- Adding, moving, or removing database objects can be done with standard file operations, whether operating on the database itself or on a single namespace, partition, table, or column.
Locations
Deephaven tables use two levels of partitioning:
- Internal partitions divide up data. For intraday tables this is usually to separate data coming from multiple sources. For historical data, this is to allow distribution of data across multiple storage devices.
- Partitioning values (typically a date string based on when the data was generated) allow natural division of large tables into more manageable "chunks". This type of partitioning works the same way for both intraday and historical tables.
A location is the directory that corresponds to a particular combination of internal partition and partition value for a table. For instance, if a historical table has three writable partitions, and uses Date for its partitioning column, partition 1 with Date=2018-01-04 would be an example of a location for this table. The location is the lowest directory in the table directory structure, and will contain column data files and other table metadata files corresponding to that location's "slice" of the table.
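To make this concrete, a small Groovy sketch of how such a location path is composed under the directory layout described below (the namespace, partition, and table names are illustrative):
// Illustrative only: the location for internal partition "1", partitioning column value
// Date=2018-01-04, of a historical table MyNamespace.MyTable.
namespace = "MyNamespace"
internalPartition = "1"
date = "2018-01-04"
tableName = "MyTable"
location = "/db/Systems/${namespace}/Partitions/${internalPartition}/${date}/${tableName}"
println location   // -> /db/Systems/MyNamespace/Partitions/1/2018-01-04/MyTable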
Root Directories
Every Deephaven installation defines a database root directory, usually /db.
- Intraday data for system namespaces is stored under the database root in the Intraday directory, entirely separate from historical data.
- Historical data for system namespaces is stored under the database root in the Systems directory.
- User namespace data is stored under the database root in the Users directory.
- Administrators may create an intraday data directory for user namespaces when configuring support for centrally-appended user data, but this is optional.
Each of these directories contains a well-defined set of subdirectories.
Intraday
This contains all the Intraday data with subdirectories as shown below.
/db/Intraday
|--Namespace (one directory for each namespace)
| |--TableName1 (one directory for each table)
| | |--IntradayPartition1 (one directory for each source of the data)
| | | |--Date1 (one directory for each column partition value)
| | | | |--TableName1 (matches the directory name three levels up)
| | | | | (contains the files which store the data)
| | | |--Date2
| | | | `--... (matches the Date1 directory structure)
| | | `--... (additional dates)
| | |--IntradayPartition2
| | `--... (additional data sources)
| |--TableName2 (directory structure will be similar to TableName1)
| `--... (additional tables)
Note that the table name is in the directory structure twice. This is intentional.
- The first level (under the namespace) allows easy management of all the data for any table. For example, if a table is not needed any more, it can be deleted at that level.
- The second level (at the final level of the directory tree) is symmetrical with the historical directory structure layout (see below), allowing easy portability when desirable.
For a concrete example, the directory containing the intraday data files for a single date partition for the DbInternal Persistent Query State Log table might be:
/db/Intraday/DbInternal/PersistentQueryStateLog/myhost.illumon.com/2017-08-04/PersistentQueryStateLog/
Historical (Systems)
The Systems subdirectory contains all the historical data with subdirectories as shown below.
/db/Systems
|--Namespace (one directory for each namespace)
| |--MetadataIndex (directory for table location lookup)
| | |--TableName1.tlmi (TableName1 locations index file)
| | |--TableName2.tlmi
| | `--... (location index files for each table)
| |--WritablePartitions (directories for writing historical data)
| | |--PartitionName1 (usually links to Partitions/PartitionName1)
| | |--PartitionName2 (usually links to Partitions/PartitionName2)
| | `--... (additional linked directories as appropriate)
| |--Partitions (directories for reading historical data)
| | |--PartitionName1 (internal partition for reading data)
| | | |--Date1 (one directory for each column partition value)
| | | | |--TableName1 (one directory for each table)
| | | | | |--Index-Col (one directory for each grouping or data index column)
| | | | | (contains the files which store the data)
| | | | |--TableName2
| | | | `--... (additional tables)
| | | |--Date2
| | | | `--... (matches the Date1 directory structure)
| | | `--... (additional dates)
| | |--PartitionName2
| | | `--... (matches the PartitionName1 structure)
| | `--... (additional partitions as needed)
| |--Extended (directories for reading Extended layouts)
`--... (additional namespaces)
For example, an initial layout for the DbInternal namespace might be as follows, showing only the Partitions and WritablePartitions subdirectories. Data is being written to and read from the internal partition called "0".
|--DbInternal
| |--WritablePartitions
| | |--0 (links to Partitions/0)
| |--Partitions
| | |--0
| | | |--2017-08-03
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | |--2017-08-04
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | `--...
Once the initial partitions to which data is being written approach storage capacity, new partitions and links should be created by the system administrator, as demonstrated below. At this point new data is being written to the internal partition called "1", but read from both "0" and "1".
|--DbInternal
| |--WritablePartitions
| | |--1 (links to Partitions/1)
| |--Partitions
| | |--0
| | | |--2017-08-03
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | |--2017-08-04
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | `--...
| | |--1
| | | |--2017-09-01
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | |--2017-09-02
| | | | |--PersistentQueryStateLog
| | | | |--PersistentQueryConfigurationLog
| | | | `--...
| | | `--...
For a concrete example, the directory containing the historical data files for a single date partition for the DbInternal persistent query state log table might be:
/db/Systems/DbInternal/Partitions/0/2017-08-03/PersistentQueryStateLog
Extended Layouts
Extended layouts add support for more complex table layouts that may be created by other tools, such as Apache Hadoop. Table schemas must define the Extended layout type to use this feature. Most notably, tables in an Extended layout may contain more than one partitioning column.
Deephaven supports the following layout for both Deephaven and Parquet tables:
- coreplus:hive layouts consist of a tree of directories, one level for each partitioning column, where each directory is named PartitionColumnName=PartitionValue. The leaf directories contain either Parquet files or Deephaven format tables.
Deephaven supports these layouts for Parquet tables:
- parquet:flat layouts consist of a single directory containing multiple Parquet files that are each a fragment of the whole table.
- parquet:kv layouts are equivalent to coreplus:hive layouts.
- parquet:metadata layouts are similar to parquet:kv layouts, with an added _metadata and optional _common_metadata file in the root directory that explicitly lists the files in which the various table partitions exist.
parquet:flat
Tables in the flat layout are simply a number of Parquet files contained in the Extended directory.
|--/db/Systems/MyNamespace
| |-- Extended
| | |-- Companies
| | | |-- part.0.parquet
| | | |-- part.1.parquet
| | | |-- part.2.parquet
| | | |-- ...
parquet:kv and coreplus:hive
The kv and hive layouts are hierarchical layouts where each level in the directory tree is one level of partitioning. Directory names are PartitionColumnName=Value at each level. The leaves may contain Parquet files or Deephaven tables.
|--/db/Systems/MyNamespace
| |--Extended
| | |-- Companies
| | | |-- PC1=Val1
| | | | |-- PC2=Val1
| | | | | |-- data.parquet
| | | | |-- PC2=Val2
| | | | | |-- data.parquet
| | | | |-- ...
| | | |-- PC1=Val2
| | | | |-- PC2=Val3
| | | |-- ...
Hive Locations Table
Just as when using a standard Enterprise table layout, the Deephaven engine traverses the table's directory structure to discover partitions. Depending on the performance characteristics of your file system (e.g., if it must traverse a WAN), partition discovery can take significant time. To accelerate this process, standard layout tables can take advantage of a metadata index that only supports a single internal partition and column partition. For Hive layout tables, instead of a metadata index, you must use a Locations table, which provides the engine with a list of all partitions and the corresponding files. If the .locations_table subdirectory exists in the root of the table (e.g., /db/Systems/MyNamespace/Extended/MyTableName/.locations_table using the above layout), then the worker reads the Locations table instead of traversing directories.
The Deephaven merge process does not support creating Hive layout tables. Tables must be written into the correct directory structure manually. When using a Locations table, it must be updated along with the underlying data. The HiveLocationsTableKeyFinder.writeLocationsTable method scans the underlying storage, generates a new Locations table and writes it to the correct location. The scanning process must traverse the entire directory structure of the table. As you change partitions in the data store, you can manually append newly created files to the Locations table instead. The last entry for a file in the Locations table is used during partition discovery.
Caution
If using a Locations table, the system does not use the underlying data discovery mechanism. You must keep the Locations table in sync with the actual table locations. If the Locations table does not match the underlying data, you will see null rows (when the Locations table includes more rows than the underlying data), or rows will be missing in the table (when the Locations table does not represent the underlying data).
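As a rough sketch of where the regeneration step fits, something like the following could be run after writing new partitions; the import path and argument shown here are assumptions for illustration rather than the documented signature:
// Hypothetical sketch: the package name and argument list below are assumed, not documented here.
// The documented entry point is HiveLocationsTableKeyFinder.writeLocationsTable, which scans the
// table's directory tree and rewrites the .locations_table subdirectory.
import io.deephaven.enterprise.locations.hive.HiveLocationsTableKeyFinder  // assumed package

HiveLocationsTableKeyFinder.writeLocationsTable("/db/Systems/MyNamespace/Extended/MyTableName")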
parquet:metadata
The metadata layout is similar to the kv layout, except that it includes a _metadata and optional _common_metadata file that store the paths to each Parquet file that makes up the table.
The directory structure may be the same as kv (canonically the hive layout), or the directory names may simply be the partition values at each level, as shown below.
|--/db/Systems/MyNamespace
| |--Extended
| | |--Companies
| | | |-- _metadata
| | | |-- _common_metadata
| | | |-- Val1
| | | | |-- Val2
| | | | | |-- data.parquet
| | | | |-- Val3
| | | | | |-- data.parquet
| | | | |-- ...
| | | |-- Val4
| | | | |-- ...
| | | |-- ...
Users
The Users subdirectory follows the same layout as the Systems subdirectory, except that it also has a Definitions directory for table metadata and a Tables directory for non-partitioned splayed tables, and lacks WritablePartitions.
/db/Users
|--Namespace (one directory for each user namespace)
| |--Definitions (table definitions managed by Deephaven)
| | |--.TableName1-LOCKFILE (hidden, for internal Deephaven usage)
| | |--.TableName2-LOCKFILE (hidden)
| | |--... (hidden lockfiles for each table)
| | |--TableName1.tbl (metadata file for the TableName1 table)
| | |--TableName2.tbl
| | `--... (metadata for each table)
| |--Partitions (matches the Systems partitions structure)
| |--Tables (contains data for non-partitioned tables)
| | |--TableName1 (one directory for each table, containing the files
| | | that store the data)
`--... (additional namespaces)
Location indexing (Metadata Indexes)
If a query for a historical table is executed in a Deephaven console without partitioning column values, such as quotes=db.t("LearnDeephaven","StockQuotes"), the system must determine what partition values exist, so they can be displayed for the user to select from. This is traditionally accomplished by scanning the directory structure under /db/Systems/LearnDeephaven to find all the locations. For tables with a large number of partitions and a large number of partition values, this process could take a relatively long time.
In the more common case, where a query's first .where() selects a single partition value, such as quotes=db.t("LearnDeephaven","StockQuotes").where("Date=`2017-08-25`"), the system must still find all 2017-08-25 directories that exist under the partitions for /db/Systems/LearnDeephaven before it can begin retrieving data for this date.
To improve performance, .tlmi files cache all the partition values and their locations in a single file per table. This allows the system to find table locations by reading one known file, rather than scanning the filesystem, resulting in much faster initial response time for queries against tables that have a large number of locations.
Location indexing is enabled by default. Normally, the only changes to historical data locations occur when new partition values are written during the merge process. Location indexing can be disabled by setting:
LocalMetadataIndexer.enabled=false
For Hive format tables, you must use a Locations table instead.
Index Management
Location indexes may be manually updated, validated, and listed using the dhctl metadata tool.
Examples of using the dhctl metadata command:
Index all System tables:
sudo /usr/illumon/latest/bin/dhctl metadata update -t '*'
Index all the tables within the System namespace ExampleNamespace:
sudo /usr/illumon/latest/bin/dhctl metadata update -t ExampleNamespace
Index the System table ExampleTable in ExampleNamespace:
sudo /usr/illumon/latest/bin/dhctl metadata update -t ExampleNamespace.ExampleTable
Rerunning the indexer command on a system that has already been indexed will replace the .tlmi files with refreshed versions, even if nothing has changed.
Grouping
Data within a given historical table may be grouped according to one or more columns. For example, a table named Quotes might be grouped by UnderlyingTicker and Ticker. This means that rows within a single partitioned table location or splayed table are laid out such that all rows with the same value for UnderlyingTicker are adjacent, as are all rows with the same value for Ticker.
Deephaven requires that tables with multiple grouping columns allow for a total ordering of the grouping columns such that each unique value group in later columns is fully enclosed by exactly one unique value group in each earlier column. That is, groups in more selective grouping columns must have a many to one relationship with groups in less selective grouping columns.
This type of relationship is natural in many cases. For example, UnderlyingTicker and Ticker can be used together because they form a hierarchy - no Ticker will belong to more than one UnderlyingTicker. In this way, grouped data is similar to how entries are sorted in a dictionary, or how a clustered index is modelled in a relational database.
Here's an example of a valid multiply-grouped table, with grouping columns UnderlyingTicker and Ticker:
Here's an example of an invalid multiply-grouped table, with grouping columns LastName and FirstName:
Note that the FirstName group Bob can't be enclosed within a single LastName group - it overlaps with both Simpson and Smith. Deephaven will generate an error when attempting to merge this data with these groups - one grouping or the other must be chosen.
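For illustration, here is a small sketch that builds the invalid example described above using the Legacy TableTools helpers (the names and values are made up to match the description):
import static com.illumon.iris.db.tables.utils.TableTools.*

// "Bob" appears under both Simpson and Smith, so FirstName groups are not enclosed
// by a single LastName group; a merge that groups on both columns would be rejected.
people = newTable(
    col("LastName",  "Simpson", "Simpson", "Smith", "Smith"),
    col("FirstName", "Bob",     "Lisa",    "Bob",   "Alice")
)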
Grouping allows for three categories of optimization:
- Indexing - Grouping columns are automatically indexed (grouped) by the database, allowing for much faster filtering on match operations. Example: quoteTable.where("UnderlyingTicker=`AAPL`")
- Locality - When filtered by grouping columns, data often has much better locality on disk, allowing for more efficient retrieval (select or update) operations. Example: quoteTable.where("UnderlyingTicker=`AAPL`").select()
- Implied Filtering - Filtering on a grouping column implies filtering on all earlier grouping columns. Example: quoteTable.where("UnderlyingTicker=`AAPL`", "Ticker in `AAPL100918C00150000`, `AAPL100918C00155000`") contains a redundant filter, "UnderlyingTicker=`AAPL`".
Deephaven groups data (and enforces rules on multi-column groupings) when merging intraday data for system namespaces to historical data, according to the column types specified in the schema files.
File extensions
Splayed table directories contain a number of files, including the following types:
- .tbl files store table metadata, including the storage layout (e.g., splayed, partitioned) and the order, name, data type, and special functions (if any) of each column.
- .dat files store column data sequentially, prefixed by a serialized Java object representing metadata.
- .ovr files store overflow metadata for .dat files of the same name.
- .bytes files store BLOBs (Binary Large OBjects) referenced by offset and length from their associated .dat files.
- .sym files provide a table of strings referenced by index from their associated .dat files.
- .sym.bytes files store strings referenced by offset and length from their associated .sym files.
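As a quick way to see these files for a given location, a plain Groovy listing of the example historical location shown earlier can be run from a console (the path is illustrative and will differ on your installation):
// Illustrative only - substitute a location directory that exists on your system.
new File("/db/Systems/DbInternal/Partitions/0/2017-08-03/PersistentQueryStateLog")
        .eachFile { f -> println "${f.name}  (${f.length()} bytes)" }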
Column files
Deephaven column files are capable of storing persistent data for all supported types.
- Java primitive types and/or their boxed representations are stored directly in the data region of .dat files, one fixed-width value per row. With the exception of Booleans, column access methods work directly with the unboxed type, the storage used per row is the same as for the Java representation of the type, and distinguished values (see the QueryConstants class) are used to represent null, negative infinity, and positive infinity when appropriate. Generified access methods are included for working with boxed types as a convenience feature.
  - Boolean - All column methods work directly with the boxed (Boolean) type, for ease of representing nulls. Persistent storage uses 1 byte per row, with values in {-1 (null), 0 (false), 1 (true)}.
  - byte (1 byte per row)
  - char (2 bytes per row)
  - double (8 bytes per row)
  - float (4 bytes per row)
  - int (4 bytes per row)
  - long (8 bytes per row)
  - short (2 bytes per row)
- DBDateTime - This type encapsulates a nanosecond-resolution UTC timestamp stored in exactly the same manner as a column of longs, using 8 bytes per row in the associated .dat file. See Working with Time for more information on working with this type. DBDateTime supports dates in the range of 09/25/1677 to 04/11/2262.
- Symbol - Symbol columns store String data, with an associated lookup table capable of representing 2^31-1 (approximately 2 billion) unique values. These columns optimize for a small number of unique values. Each row consumes 4 bytes of storage in the .dat file, and each unique non-null value consumes 8 bytes of storage in the .sym file and a variable-length record in the .sym.bytes file.
- SymbolSet - SymbolSet columns allow efficient storage of StringSets from a universe of up to 64 symbol values. They use the same symbol lookup table format as Symbol columns, and 8 bytes per row in the .dat file.
- In addition to the fixed-width types, Deephaven can store columns of any Serializable or Externalizable Java class, including Strings or arrays (of primitives or objects). The BLOBs consume 8 bytes of storage per row in the .dat file, and a variable-length record in the associated .bytes file.
S3 data access
Deephaven accesses historical (non-intraday) data that is made available under /db/Systems. It is possible to use data stored in an S3 repository for queries in Deephaven.
The details below will give an indication of what to expect when querying data in an Amazon S3 repository, though your use cases may vary.
Results
Tests were performed using Goofys to mount an S3 store as a user-space filesystem on an AWS EC2 host, and the results were compared to a similar NFS-mounted store. Queries run on S3/Goofys data took longer than their NFS counterparts, which was anticipated. The examples Query 1 and Query 2 (shown below) were used in our testing.
This chart summarizes the performance of queries run on data exposed via Goofys, relative to similar queries on data exposed via NFS. The REL_TO_NFS column is a multiplier that shows how Goofys performance compares to its NFS counterpart; accordingly, the NFS value will always be 1. For example, the Goofys query on Deephaven V1 format data has a relative value of 1.92, meaning it took nearly twice as long. Generally, the Goofys queries took close to twice as long as the NFS queries. Increasing the data buffer size to 4M and the console heap to 8G widened the disparity further, with Goofys queries running ~2.5 times as long as their NFS counterparts.
Test queries
Query 1
import com.illumon.iris.db.tables.utils.*
import com.illumon.iris.db.tables.select.QueryScope
ojt = io.deephaven.OuterJoinTools
run_bar_creation = {String namespace, String date, int interval ->
QueryScope.addParam("interval",interval)
quote_bars = db.t("${namespace}", "EquityQuoteL1")
.where("Date=`${date}`")
.updateView("Timestamp = DBTimeUtils.lowerBin(Timestamp, interval * SECOND)")
.dropColumns("MarketTimestamp","ServerTimestamp","InternalCode","LocalCodeMarketId","TradingStatus")
.avgBy("Date","LocalCodeStr","Timestamp")
trade_bars = db.t("${namespace}", "EquityTradeL1")
.where("Date=`${date}`")
.updateView("Timestamp = DBTimeUtils.lowerBin(Timestamp, interval * SECOND)")
.dropColumns("MarketTimestamp","ServerTimestamp","InternalCode","LocalCodeMarketId","MarketId","TradingStatus")
.avgBy("Date","LocalCodeStr","Timestamp")
bars = ojt.fullOuterJoin(trade_bars, quote_bars, "Date,LocalCodeStr,Timestamp")
bars = bars.select()
int sz = bars.size()
quote_bars.close()
trade_bars.close()
bars.close()
return sz
}
ns = "FeedOS" // or "FeedOS_S3", "FeedOSPQ", "FeedOSPQ_S3" - pick the namespace to test
dates = ['2022-05-13', '2022-05-16', '2022-05-17', '2022-05-18', '2022-05-19']
dates.each { date ->
long start = System.currentTimeMillis()
int sz = run_bar_creation.call("${ns}", date, 60)
long t = System.currentTimeMillis() - start
println "${date}: sumBy done in ${t / 1000}s"
}
Query 2
import com.illumon.iris.db.tables.utils.*
ns = "FeedOSPQ" // or "FeedOSPQ_S3" - pick the namespace to test
dates = ['2022-05-13', '2022-05-16', '2022-05-17', '2022-05-18', '2022-05-19']
dates.each { date ->
long start = System.currentTimeMillis()
sumTable = db.t("${ns}", "EquityQuoteL1").where("Date = `${date}`")
.view("LocalCodeStr", "LocalCodeMarketId", "BidSize")
.sumBy("LocalCodeStr", "LocalCodeMarketId")
long sz = sumTable.size()
long t = System.currentTimeMillis() - start
println "${date}: ${sz} rows in ${t / 1000}s"
}
Goofys install (Linux)
Fuse libs should be installed on the Linux host:
[centos@ip-172-31-2-193 scratch]$ sudo yum list installed | grep fuse
fuse-overlayfs.x86_64 0.7.2-6.el7_8 @extras
fuse3-libs.x86_64 3.6.1-4.el7 @extras
Fuse itself must be installed:
sudo yum install fuse
Make sure git and make are installed also:
git --version
make --version
Install Go (v1.18.2).
Add the following to your ~/.bashrc file, after /etc/bashrc is sourced. If necessary, update GOVERSION and GOPATH:
export PATH=$PATH:/usr/local/go/bin
export GOVERSION=go1.18.2
export GO_INSTALL_DIR=/usr/local/go
export GOROOT=$GO_INSTALL_DIR
export GOPATH=/home/centos/golang
export PATH=$GOROOT/bin:$GOPATH/bin:$PATH
export GO111MODULE="on"
export GOSUMDB=off
Get Goofys and build it:
git clone https://github.com/kahing/goofys.git
cd goofys
make install
There should now be a Goofys command you can run:
~/golang/bin/goofys -v
Goofys mounts as a fixed user and group, which cannot be changed once mounted, since file mode/owner/group are not supported POSIX behaviors. So first, find the UID/GID for dbquery:
[centos@ip-172-31-2-193 s3-share-goofy]$ grep dbq /etc/passwd
dbquery:x:9001:9003:Deephaven Data Labs dbquery Account:/db/TempFiles//dbquery:/bin/bash
Now create a directory that will be backed by S3, and mount it as dbquery:
mkdir ~/s3-share-goofy
~/golang/bin/goofys --uid 9001 --gid 9003 --debug_fuse --debug_s3 perry-s3-project /var/tmp/s3-share-goofy
Test data was copied to the S3 store using the AWS command line interface (CLI); see the AWS documentation for installation instructions.
aws s3 cp myfile.txt s3://my-s3-bucket/mypath/myfile.txt
This was done under a normal user for testing. A system administrator can advise further on setting this up in /etc/fstab
as a permanent mount.