Interface ColumnChunkReader


public interface ColumnChunkReader
  • Field Details

    • NULL_DICTIONARY

      static final org.apache.parquet.column.Dictionary NULL_DICTIONARY
  • Method Details

    • columnName

      String columnName()
      Returns:
      The name of the column this ColumnChunk represents.
    • getURI

      URI getURI()
      Returns:
      The URI of the file this column chunk reader is reading from.
    • numRows

      long numRows()
      Returns:
      The number of rows in this ColumnChunk, or -1 if it's unknown.
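Callers usually need to keep the -1 sentinel from being mistaken for a real count. A minimal sketch, assuming only the contract stated above; `RowCountSketch`, `UNKNOWN_ROW_COUNT`, and `knownRowCount` are hypothetical names for illustration and are not part of this interface:

```java
import java.util.OptionalLong;

final class RowCountSketch {
    // Sentinel documented for numRows(): -1 means the row count is unknown.
    static final long UNKNOWN_ROW_COUNT = -1L;

    /** Wraps a raw numRows() result so the "unknown" case cannot be read as a count. */
    static OptionalLong knownRowCount(long numRows) {
        return numRows == UNKNOWN_ROW_COUNT ? OptionalLong.empty() : OptionalLong.of(numRows);
    }

    public static void main(String[] args) {
        System.out.println(knownRowCount(-1L).isPresent());   // prints false
        System.out.println(knownRowCount(1000L).getAsLong()); // prints 1000
    }
}
```

With a real reader, the argument would be the result of `numRows()`.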
    • numValues

      long numValues()
      Returns:
      The value stored under the corresponding ColumnMetaData.num_values field.
    • getMaxRl

      int getMaxRl()
      Returns:
      The maximum repetition level, i.e. the number of nested repeated fields this column is part of. 0 means this is a simple (non-repeating) field; 1 means this is a flat array.
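The interpretation of the returned value can be sketched as a small helper. This is illustrative only: `RepetitionLevelSketch` and `describe` are hypothetical names, and the wording used for values above 1 is an assumption that extends the two documented cases.

```java
final class RepetitionLevelSketch {
    /** Describes a getMaxRl() result using the interpretation given in the documentation. */
    static String describe(int maxRl) {
        if (maxRl < 0) {
            throw new IllegalArgumentException("repetition level cannot be negative: " + maxRl);
        }
        switch (maxRl) {
            case 0:
                return "simple (non-repeating) field";
            case 1:
                return "flat array";
            default:
                // Assumption: larger values mean arrays nested that many levels deep.
                return "array nested " + maxRl + " levels deep";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // prints simple (non-repeating) field
        System.out.println(describe(1)); // prints flat array
    }
}
```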
    • hasOffsetIndex

      boolean hasOffsetIndex()
      Returns:
      Whether the column chunk's metadata includes offset index information.
    • getOffsetIndex

      org.apache.parquet.internal.column.columnindex.OffsetIndex getOffsetIndex(SeekableChannelContext context)
      Parameters:
      context - The channel context to use for reading the offset index.
      Returns:
      The offset index for this column chunk.
      Throws:
      UnsupportedOperationException - If the column chunk does not have an offset index.
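Because getOffsetIndex throws when no offset index exists, a caller may want to check hasOffsetIndex first. A minimal sketch of that guard follows. `ChunkLike` is a hypothetical stand-in for the two methods involved (the real getOffsetIndex takes a SeekableChannelContext and returns a Parquet OffsetIndex), and `offsetIndexIfPresent` and `chunkWith` are likewise invented for illustration.

```java
import java.util.Optional;

final class OffsetIndexGuardSketch {
    /** Hypothetical stand-in for the two offset-index methods of ColumnChunkReader. */
    interface ChunkLike<I> {
        boolean hasOffsetIndex();

        I getOffsetIndex();
    }

    /** Returns the offset index only when the chunk reports one, avoiding the exception path. */
    static <I> Optional<I> offsetIndexIfPresent(ChunkLike<I> chunk) {
        return chunk.hasOffsetIndex() ? Optional.of(chunk.getOffsetIndex()) : Optional.empty();
    }

    /** Builds a stand-in chunk; passing null models a chunk without an offset index. */
    static <I> ChunkLike<I> chunkWith(I indexOrNull) {
        return new ChunkLike<I>() {
            @Override
            public boolean hasOffsetIndex() {
                return indexOrNull != null;
            }

            @Override
            public I getOffsetIndex() {
                if (indexOrNull == null) {
                    // Mirrors the documented behavior when no offset index exists.
                    throw new UnsupportedOperationException("no offset index");
                }
                return indexOrNull;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(offsetIndexIfPresent(chunkWith("index")).isPresent()); // prints true
    }
}
```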
    • getPageIterator

      ColumnChunkReader.ColumnPageReaderIterator getPageIterator(PageMaterializerFactory pageMaterializerFactory) throws IOException
      Parameters:
      pageMaterializerFactory - The factory to use for constructing page materializers.
      Returns:
      An iterator over individual parquet pages.
      Throws:
      IOException
    • getPageAccessor

      ColumnChunkReader.ColumnPageDirectAccessor getPageAccessor(org.apache.parquet.internal.column.columnindex.OffsetIndex offsetIndex, PageMaterializerFactory pageMaterializerFactory)
      Parameters:
      offsetIndex - The offset index to use for accessing pages.
      pageMaterializerFactory - The factory to use for constructing page materializers.
      Returns:
      An accessor for individual parquet pages which uses the provided offset index.
    • usesDictionaryOnEveryPage

      boolean usesDictionaryOnEveryPage()
      Returns:
      Whether this column chunk uses a dictionary-based encoding on every page.
    • getDictionarySupplier

      Function<SeekableChannelContext,org.apache.parquet.column.Dictionary> getDictionarySupplier()
      Returns:
      A supplier of the Parquet dictionary for this column chunk.
      ApiNote:
      The returned function never returns null; it supplies NULL_DICTIONARY instead.
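Since the supplier yields a sentinel rather than null, callers compare against that sentinel instead of null-checking. A minimal sketch under stated assumptions: `Dict` is a hypothetical stand-in for org.apache.parquet.column.Dictionary, the local `NULL_DICTIONARY` stands in for the field documented above, and detecting the sentinel by reference equality is an assumption rather than documented behavior.

```java
import java.util.Optional;
import java.util.function.Function;

final class DictionarySentinelSketch {
    /** Hypothetical stand-in for org.apache.parquet.column.Dictionary. */
    static final class Dict {
        final String label;

        Dict(String label) {
            this.label = label;
        }
    }

    /** Hypothetical stand-in for ColumnChunkReader.NULL_DICTIONARY. */
    static final Dict NULL_DICTIONARY = new Dict("<none>");

    /**
     * Applies the supplier and maps the sentinel to an empty Optional.
     * Assumption: the sentinel is recognized by reference equality.
     */
    static <C> Optional<Dict> dictionaryIfPresent(Function<C, Dict> supplier, C context) {
        Dict dictionary = supplier.apply(context);
        return dictionary == NULL_DICTIONARY ? Optional.empty() : Optional.of(dictionary);
    }

    public static void main(String[] args) {
        Function<String, Dict> noDictionary = context -> NULL_DICTIONARY;
        System.out.println(dictionaryIfPresent(noDictionary, "ctx").isPresent()); // prints false
    }
}
```

With the real interface, the supplier would come from getDictionarySupplier() and the context would be a SeekableChannelContext.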
    • getType

      org.apache.parquet.schema.PrimitiveType getType()
    • getVersion

      @Nullable String getVersion()
      Returns:
      The "version" string from the Deephaven-specific Parquet metadata, or null if it's not present.
    • getChannelsProvider

      SeekableChannelsProvider getChannelsProvider()
      Returns:
      The channel provider for this column chunk reader.