Class ParquetTableLocationKey

All Implemented Interfaces:
LogOutputAppendable, ImmutableTableLocationKey, TableLocationKey, NamedImplementation, Comparable<TableLocationKey>
Direct Known Subclasses:
IcebergTableParquetLocationKey

public class ParquetTableLocationKey extends URITableLocationKey
TableLocationKey implementation for use with data stored in the parquet format.
  • Constructor Details

    • ParquetTableLocationKey

      public ParquetTableLocationKey(@NotNull URI parquetFileUri, int order, @Nullable Map<String,Comparable<?>> partitions, @NotNull ParquetInstructions readInstructions)
      Construct a new ParquetTableLocationKey for the supplied parquetFileUri and partitions.

      This constructor will create a new SeekableChannelsProvider for reading the file. If you have multiple location keys that should share a provider, use the other constructor and set the provider manually.

      Parameters:
      parquetFileUri - The parquet file that backs the keyed location. Will be adjusted to an absolute path.
      order - Explicit ordering index, taking precedence over other fields
      partitions - The table partitions enclosing the table location keyed by this key. If this parameter is null, the location will be a member of no partitions. An ordered copy of the map is made, so the calling code is free to mutate the map after this call.
      readInstructions - the instructions for customizations while reading
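
The note on partitions describes a defensive, ordered copy. The following is a minimal sketch of that pattern using only the JDK. PartitionCopySketch and orderedCopy are hypothetical names, and the use of LinkedHashMap and an unmodifiable wrapper is an assumption for illustration, not a description of the actual implementation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in; not part of the Deephaven API.
final class PartitionCopySketch {

    // Returns an insertion-ordered, read-only copy of the supplied map.
    // A null input is treated as "member of no partitions" (an empty map).
    static Map<String, Comparable<?>> orderedCopy(Map<String, Comparable<?>> partitions) {
        if (partitions == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(new LinkedHashMap<>(partitions));
    }

    public static void main(String[] args) {
        Map<String, Comparable<?>> partitions = new LinkedHashMap<>();
        partitions.put("Date", "2024-01-01");
        partitions.put("Region", "EU");

        Map<String, Comparable<?>> copy = orderedCopy(partitions);

        // Mutating the caller's map afterwards does not affect the copy.
        partitions.put("Extra", 1);
        System.out.println(copy.keySet());
    }
}
```

Because the copy is taken at construction time, a caller can reuse and modify one map while building many keys.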
    • ParquetTableLocationKey

      public ParquetTableLocationKey(@NotNull URI parquetFileUri, int order, @Nullable Map<String,Comparable<?>> partitions, @NotNull ParquetInstructions readInstructions, @NotNull SeekableChannelsProvider channelsProvider)
      Construct a new ParquetTableLocationKey for the supplied parquetFileUri and partitions.
      Parameters:
      parquetFileUri - The parquet file that backs the keyed location. Will be adjusted to an absolute path.
      order - Explicit ordering index, taking precedence over other fields
      partitions - The table partitions enclosing the table location keyed by this key. If this parameter is null, the location will be a member of no partitions. An ordered copy of the map is made, so the calling code is free to mutate the map after this call.
      readInstructions - the instructions for customizations while reading
      channelsProvider - the provider for reading the file
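
Both constructors state that parquetFileUri "will be adjusted to an absolute path". As a rough illustration of what such an adjustment can look like for a local file, the sketch below resolves a relative path against the working directory with the JDK. AbsoluteUriSketch and absoluteFileUri are hypothetical names; how the class itself performs the adjustment, and how it treats non-file schemes, is not shown in this page.

```java
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical stand-in; not part of the Deephaven API.
final class AbsoluteUriSketch {

    // Resolves a (possibly relative) local path against the current working
    // directory and returns it as an absolute file: URI.
    static URI absoluteFileUri(String path) {
        return Paths.get(path).toAbsolutePath().normalize().toUri();
    }

    public static void main(String[] args) {
        // The printed value depends on the directory the program runs in.
        System.out.println(absoluteFileUri("data/t.parquet"));
    }
}
```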
  • Method Details

    • getImplementationName

      public String getImplementationName()
      Description copied from interface: NamedImplementation

      Get a name for the implementing class. Useful for abstract classes that implement LogOutputAppendable or override toString.

      The default implementation is correct, but not suitable for high-frequency usage.

      Specified by:
      getImplementationName in interface NamedImplementation
      Overrides:
      getImplementationName in class URITableLocationKey
      Returns:
      A name for the implementing class
    • getFileReader

      public ParquetFileReader getFileReader()
      Get a previously-set or on-demand created ParquetFileReader for this location key's file.
      Returns:
      A ParquetFileReader for this location key's file.
    • setFileReader

      public void setFileReader(ParquetFileReader fileReader)
      Set the ParquetFileReader that will be returned by getFileReader(). Pass null to force on-demand construction at the next invocation. Always clears cached ParquetMetadata and RowGroup indices.
      Parameters:
      fileReader - The new ParquetFileReader
    • getMetadata

      public org.apache.parquet.hadoop.metadata.ParquetMetadata getMetadata()
      Get a previously-set or on-demand created ParquetMetadata for this location key's file.
      Returns:
      A ParquetMetadata for this location key's file.
    • setMetadata

      public void setMetadata(org.apache.parquet.hadoop.metadata.ParquetMetadata metadata)
      Set the ParquetMetadata that will be returned by getMetadata(). Pass null to force on-demand construction at the next invocation.
      Parameters:
      metadata - The new ParquetMetadata
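
The getter and setter pairs above describe a lazily populated cache: a getter returns the previously set value or builds one on demand, a null passed to a setter forces a rebuild, and replacing the reader discards metadata derived from it. The sketch below models that contract with generic placeholders. CachedReaderSketch, its type parameters, and its factory arguments are hypothetical, row group indices are omitted for brevity, and thread safety is not modelled; it is not the actual implementation.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for the caching contract; R plays the role of the
// file reader and M the role of the metadata derived from it.
final class CachedReaderSketch<R, M> {
    private final Supplier<R> readerFactory;      // builds a reader on demand
    private final Function<R, M> metadataLoader;  // derives metadata from a reader
    private R reader;
    private M metadata;

    CachedReaderSketch(Supplier<R> readerFactory, Function<R, M> metadataLoader) {
        this.readerFactory = readerFactory;
        this.metadataLoader = metadataLoader;
    }

    // Returns the previously set reader, or creates one on first use.
    R getReader() {
        if (reader == null) {
            reader = readerFactory.get();
        }
        return reader;
    }

    // Replaces the reader; null forces on-demand construction next time.
    // Cached metadata is always cleared, since it belonged to the old reader.
    void setReader(R newReader) {
        reader = newReader;
        metadata = null;
    }

    // Returns the previously set metadata, or loads it from the current reader.
    M getMetadata() {
        if (metadata == null) {
            metadata = metadataLoader.apply(getReader());
        }
        return metadata;
    }

    // Replaces the metadata; null forces on-demand construction next time.
    void setMetadata(M newMetadata) {
        metadata = newMetadata;
    }
}
```

Clearing dependent state in the reader's setter keeps the cached values consistent with whichever reader is current, at the cost of recomputing them after every replacement.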
    • getRowGroupIndices

      public int[] getRowGroupIndices()
      Get previously-set or on-demand created RowGroup indices for this location key's current ParquetFileReader.
      Returns:
      RowGroup indices for this location key's current ParquetFileReader.
    • setRowGroupIndices

      public void setRowGroupIndices(int[] rowGroupIndices)
      Set the RowGroup indices that will be returned by getRowGroupIndices().
      Parameters:
      rowGroupIndices - The new RowGroup indices