Interface TransferObject<BUFFER_TYPE>

Type Parameters:
BUFFER_TYPE - The type of the buffer to be written out to the Parquet file
All Superinterfaces:
AutoCloseable, SafeCloseable
All Known Implementing Classes:
ArrayAndVectorTransfer, DictEncodedStringArrayAndVectorTransfer

public interface TransferObject<BUFFER_TYPE> extends SafeCloseable
Implementations of this interface convert data from an individual Deephaven (DH) column into buffers that are written out to the Parquet file.
  • Method Details

    • create

      static <DATA_TYPE> TransferObject<?> create(@NotNull RowSet tableRowSet, @NotNull ParquetInstructions instructions, @NotNull Map<String,Map<ParquetCacheTags,Object>> computedCache, @NotNull String columnName, @NotNull ColumnSource<DATA_TYPE> columnSource)
    • createDictEncodedStringTransfer

      @NotNull static <DATA_TYPE> TransferObject<IntBuffer> createDictEncodedStringTransfer(@NotNull RowSet tableRowSet, @NotNull ColumnSource<DATA_TYPE> columnSource, int targetPageSize, @NotNull StringDictionary dictionary)
    • transferOnePageToBuffer

      int transferOnePageToBuffer()
      Transfers one page's worth of fetched data into an internal buffer, which can then be accessed using getBuffer(). The target page size is supplied when the transfer object is constructed. For dictionary-encoded string transfers, this method also updates the dictionary with the strings encountered.
      Returns:
      The number of fetched data entries copied into the buffer. For variable-width types (e.g. strings), this can differ from the total number of entries fetched when additional page size limits apply while copying.
    • hasMoreDataToBuffer

      boolean hasMoreDataToBuffer()
      Checks whether there is any more data that can be copied into the buffer.
    • getBuffer

      BUFFER_TYPE getBuffer()
      Get the buffer suitable for writing to a Parquet file
      Returns:
      the buffer
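
      The three paging methods above are typically used together in a loop. The following is a minimal, self-contained sketch of that pattern, not the library's writer code. PagedTransfer is a hypothetical stand-in that mirrors only transferOnePageToBuffer(), hasMoreDataToBuffer() and getBuffer(), and IntArrayTransfer is a toy implementation that pages a plain int[] so the example can run without Deephaven on the classpath.

```java
import java.nio.IntBuffer;
import java.util.ArrayList;
import java.util.List;

public class TransferLoopSketch {
    /** Hypothetical stand-in mirroring the three paging methods documented above. */
    interface PagedTransfer<B> extends AutoCloseable {
        int transferOnePageToBuffer();

        boolean hasMoreDataToBuffer();

        B getBuffer();

        @Override
        void close();
    }

    /** Toy implementation (an assumption for illustration): pages an int[] into a fixed-size IntBuffer. */
    static final class IntArrayTransfer implements PagedTransfer<IntBuffer> {
        private final int[] data;
        private final IntBuffer buffer;
        private int position = 0;

        IntArrayTransfer(int[] data, int targetPageSize) {
            this.data = data;
            this.buffer = IntBuffer.allocate(targetPageSize);
        }

        @Override
        public int transferOnePageToBuffer() {
            buffer.clear();
            // Copy at most one page, or whatever is left in the source.
            int count = Math.min(buffer.capacity(), data.length - position);
            buffer.put(data, position, count);
            position += count;
            buffer.flip(); // make the copied entries readable
            return count;
        }

        @Override
        public boolean hasMoreDataToBuffer() {
            return position < data.length;
        }

        @Override
        public IntBuffer getBuffer() {
            return buffer;
        }

        @Override
        public void close() {
            // Nothing to release in this toy implementation.
        }
    }

    /** Drains the transfer page by page and records how many entries each page held. */
    static List<Integer> drainPageSizes(PagedTransfer<IntBuffer> transfer) {
        List<Integer> pageSizes = new ArrayList<>();
        try {
            while (transfer.hasMoreDataToBuffer()) {
                int copied = transfer.transferOnePageToBuffer();
                IntBuffer page = transfer.getBuffer();
                // A real writer would hand `page` to the Parquet column writer here.
                pageSizes.add(copied);
            }
        } finally {
            transfer.close();
        }
        return pageSizes;
    }

    public static void main(String[] args) {
        int[] column = {1, 2, 3, 4, 5};
        System.out.println(drainPageSizes(new IntArrayTransfer(column, 2))); // prints [2, 2, 1]
    }
}
```

      In this sketch the buffer is reused between pages, so each page must be consumed before the next call to transferOnePageToBuffer(). Whether real implementations reuse their buffers is not stated here and should be checked against the implementing class.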
    • pageHasNull

      default boolean pageHasNull()
      Returns whether any null value was encountered while transferring page data to the buffer. This method is only used for dictionary-encoded string transfer objects. It should be called after transferOnePageToBuffer(), and its state resets every time transferOnePageToBuffer() is called.
    • getRepeatCount

      default IntBuffer getRepeatCount()
      Get the lengths of array/vector elements added to the buffer.
      Returns:
      the buffer with counts
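
      To illustrate how a buffer of element lengths relates to the data buffer, the sketch below flattens a small int[][] into one value buffer plus one length per source array. It is an assumption-laden illustration only: FlattenedPage and flatten are hypothetical names, and null arrays and page size limits, which the real array and vector transfers must deal with, are left out.

```java
import java.nio.IntBuffer;
import java.util.Arrays;

public class RepeatCountSketch {
    /** Hypothetical holder: flattened values plus one length per source array. */
    static final class FlattenedPage {
        final IntBuffer values;
        final IntBuffer repeatCounts;

        FlattenedPage(IntBuffer values, IntBuffer repeatCounts) {
            this.values = values;
            this.repeatCounts = repeatCounts;
        }
    }

    /** Flattens the rows into a single value buffer and records each row's length. */
    static FlattenedPage flatten(int[][] rows) {
        int total = 0;
        for (int[] row : rows) {
            total += row.length;
        }
        IntBuffer values = IntBuffer.allocate(total);
        IntBuffer counts = IntBuffer.allocate(rows.length);
        for (int[] row : rows) {
            values.put(row);        // append the row's elements
            counts.put(row.length); // remember how many elements the row had
        }
        values.flip();
        counts.flip();
        return new FlattenedPage(values, counts);
    }

    /** Copies the readable part of a buffer into an array without moving its position. */
    static int[] toArray(IntBuffer buffer) {
        int[] out = new int[buffer.remaining()];
        buffer.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        FlattenedPage page = flatten(new int[][] {{1, 2}, {3}, {}});
        System.out.println(Arrays.toString(toArray(page.values)));       // prints [1, 2, 3]
        System.out.println(Arrays.toString(toArray(page.repeatCounts))); // prints [2, 1, 0]
    }
}
```

      A reader of the two buffers can rebuild the original arrays by taking the next count and consuming that many values, which is why the lengths are exposed alongside the data.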