Package io.deephaven.parquet.base
Class ParquetFileReader
java.lang.Object
io.deephaven.parquet.base.ParquetFileReader
Top-level accessor for a parquet file, which can read from either a file path string or a CLI-style file URI,
e.g. "s3://bucket/key".
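The distinction between a plain file path and a CLI-style URI can be illustrated with a small sketch that uses only the JDK. It is not part of this class: the helper name `ParquetSourceResolver.toUri` is hypothetical, and the rule it applies (treat a string with a scheme as a URI, anything else as a local path) is an assumption about how a caller might pick between the two `create` overloads.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of io.deephaven.parquet.base.
final class ParquetSourceResolver {

    private ParquetSourceResolver() {}

    /**
     * Convert a user-supplied source string into a URI.
     * Assumption: a string that parses as a URI with a scheme (for example "s3://bucket/key")
     * is used as-is; anything else is treated as a local file path.
     * Caveat: a Windows path written with forward slashes such as "C:/data/x.parquet"
     * would parse with the scheme "C"; this sketch does not special-case that.
     */
    static URI toUri(String source) {
        try {
            URI uri = new URI(source);
            if (uri.getScheme() != null) {
                return uri;
            }
        } catch (URISyntaxException e) {
            // Not a valid URI (spaces, backslashes, ...): fall through and treat it as a path.
        }
        // File.toURI() resolves relative paths against the working directory
        // and yields a URI with the "file" scheme.
        return new File(source).toURI();
    }
}
```

A caller could then pass the result of `ParquetSourceResolver.toUri("s3://bucket/key")` to the URI-based `create` method, together with a `SeekableChannelsProvider` suitable for that scheme.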
-
Field Summary
Modifier and Type | Field | Description
static final String | FILE_URI_SCHEME |
final org.apache.parquet.format.FileMetaData | fileMetaData |
-
Method Summary
Modifier and Type | Method | Description
static ParquetFileReader | create(@NotNull File parquetFile, @NotNull SeekableChannelsProvider channelsProvider) | Make a ParquetFileReader for the supplied File.
static ParquetFileReader | create(@NotNull URI parquetFileURI, @NotNull SeekableChannelsProvider channelsProvider) | Make a ParquetFileReader for the supplied URI.
 | getChannelsProvider |
 | getColumnsWithDictionaryUsedOnEveryDataPage | Get the names of all columns that we can know for certain (a) have a dictionary, and (b) use the dictionary on all data pages.
 | getRowGroup(int groupNumber, String version) | Create a RowGroupReader object for the provided row group number.
org.apache.parquet.schema.MessageType | getSchema() |
int | rowGroupCount() |
-
Field Details
-
FILE_URI_SCHEME
-
fileMetaData
public final org.apache.parquet.format.FileMetaData fileMetaData
-
-
Method Details
-
create
public static ParquetFileReader create(@NotNull File parquetFile, @NotNull SeekableChannelsProvider channelsProvider)
- Parameters:
parquetFile - The parquet file or the parquet metadata file
channelsProvider - The SeekableChannelsProvider to use for reading the file
- Returns:
- The new ParquetFileReader
-
create
public static ParquetFileReader create(@NotNull URI parquetFileURI, @NotNull SeekableChannelsProvider channelsProvider)
- Parameters:
parquetFileURI - The URI for the parquet file or the parquet metadata file
channelsProvider - The SeekableChannelsProvider to use for reading the file
- Returns:
- The new ParquetFileReader
-
getChannelsProvider
- Returns:
- The SeekableChannelsProvider used for this reader, appropriate to use for related file access
-
getColumnsWithDictionaryUsedOnEveryDataPage
Get the names of all columns that we can know for certain (a) have a dictionary, and (b) use the dictionary on all data pages.
- Returns:
- A set of parquet column names that satisfy the required condition.
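The two-part condition above can be sketched over a simplified in-memory model. This is an illustration only, not how this class computes its result: the `ChunkInfo` record and the `DictionaryColumnCheck` class are hypothetical, and the sketch assumes that each column chunk's data page encodings are already available as strings, with `PLAIN_DICTIONARY` and `RLE_DICTIONARY` (the Parquet dictionary encodings) marking a dictionary-encoded data page.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not part of io.deephaven.parquet.base.
final class DictionaryColumnCheck {

    private DictionaryColumnCheck() {}

    /** Hypothetical summary of one column chunk within one row group. */
    record ChunkInfo(String column, boolean hasDictionaryPage, List<String> dataPageEncodings) {}

    /** True only if the chunk has a dictionary and every known data page is dictionary-encoded. */
    static boolean usesDictionaryOnEveryDataPage(ChunkInfo chunk) {
        return chunk.hasDictionaryPage()
                // No page information means we cannot "know for certain", so be conservative.
                && !chunk.dataPageEncodings().isEmpty()
                && chunk.dataPageEncodings().stream()
                        .allMatch(e -> e.equals("PLAIN_DICTIONARY") || e.equals("RLE_DICTIONARY"));
    }

    /** Columns for which every chunk, across all row groups, satisfies the condition. */
    static Set<String> columnsWithDictionaryOnEveryDataPage(List<ChunkInfo> chunks) {
        Set<String> qualifying = new LinkedHashSet<>();
        Set<String> disqualified = new HashSet<>();
        for (ChunkInfo chunk : chunks) {
            if (usesDictionaryOnEveryDataPage(chunk)) {
                if (!disqualified.contains(chunk.column())) {
                    qualifying.add(chunk.column());
                }
            } else {
                // A single failing chunk rules the column out for the whole file.
                disqualified.add(chunk.column());
                qualifying.remove(chunk.column());
            }
        }
        return qualifying;
    }
}
```

The conservative handling of missing page information mirrors the "know for certain" wording: a column is reported only when nothing contradicts the condition.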
-
getRowGroup
Create a RowGroupReader object for the provided row group number.
- Parameters:
version - The "version" string from Deephaven-specific parquet metadata, or null if it's not present.
-
getSchema
public org.apache.parquet.schema.MessageType getSchema()
-
rowGroupCount
public int rowGroupCount()
-