Class Float16

java.lang.Object
io.deephaven.extensions.barrage.util.Float16

public class Float16 extends Object
Lifted from Apache Arrow project: https://github.com/apache/arrow/blob/ee62d970338f173fff4c0d11b975fe30b5fda70b/java/memory/memory-core/src/main/java/org/apache/arrow/memory/util/Float16.java
    Changes made:
  • Keep only the methods used
  • Use floatToIntBits over floatToRawIntBits for GWT compilation
This class is a utility class for manipulating half-precision 16-bit IEEE 754 floating point values (also called fp16 or binary16). A half-precision float can be created from or converted to a single-precision float, and is stored in a short. The IEEE 754 standard specifies a float16 as having the following format:
  • Sign bit: 1 bit
  • Exponent width: 5 bits
  • Significand: 10 bits

The format is laid out as follows:

 1   11111   1111111111
 ^   --^--   -----^----
 sign  |          |_______ significand
       |
      -- exponent
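
As a rough illustration of the layout above, the three fields can be pulled out of a short with masks and shifts. This is only a sketch: HalfBits and its method names are hypothetical and are not part of the Float16 class.

```java
// Hypothetical helper, not part of Float16: splits a binary16 bit pattern into its fields.
final class HalfBits {
    /** Sign bit (bit 15): 0 for positive, 1 for negative. */
    static int sign(short h) {
        return (h & 0x8000) >>> 15;
    }

    /** Biased exponent (bits 14..10); the binary16 exponent bias is 15. */
    static int exponent(short h) {
        return (h & 0x7C00) >>> 10;
    }

    /** Significand (bits 9..0), without the implicit leading 1 of normal values. */
    static int significand(short h) {
        return h & 0x03FF;
    }

    public static void main(String[] args) {
        short one = (short) 0x3C00; // bit pattern of 1.0 in binary16
        // prints "0 15 0"
        System.out.println(sign(one) + " " + exponent(one) + " " + significand(one));
    }
}
```

A biased exponent of 15 with a zero significand encodes 1.0, because the stored exponent has the bias of 15 subtracted before use.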
 
Half-precision floating points can be useful to save memory and/or bandwidth at the expense of range and precision when compared to single-precision floating points (float32). Ref: https://android.googlesource.com/platform/libcore/+/master/luni/src/main/java/libcore/util/FP16.java
  • Constructor Summary

    Constructors
    Constructor
    Description
    Float16()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static float
    toFloat(short b)
    Converts the specified half-precision float value into a single-precision float value.
    static short
    toFloat16(float f)
    Converts the specified single-precision float value into a half-precision float value.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Float16

      public Float16()
  • Method Details

    • toFloat

      public static float toFloat(short b)
      Converts the specified half-precision float value into a single-precision float value. The following special cases are handled:
        • If the input is NaN, the returned value is Float NaN.
        • If the input is POSITIVE_INFINITY or NEGATIVE_INFINITY, the returned value is Float POSITIVE_INFINITY or Float NEGATIVE_INFINITY, respectively.
        • If the input is 0 (positive or negative), the returned value is +/-0.0f.
        • Otherwise, the returned value is a normalized single-precision float value.
      Parameters:
      b - The half-precision float value to convert to single-precision
      Returns:
      A normalized single-precision float value
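
The special cases listed for toFloat can be sketched as below. This is an illustrative reimplementation under stated assumptions, not the Float16 source: HalfToFloatSketch is a hypothetical name, and the NaN payload is simply shifted into the wider significand.

```java
// Hypothetical sketch of a binary16 -> float conversion; not the Float16 source.
final class HalfToFloatSketch {
    static float toFloat(short h) {
        int bits = h & 0xFFFF;              // treat the short as an unsigned 16-bit pattern
        int sign = (bits & 0x8000) << 16;   // move the sign to bit 31
        int exp = (bits >>> 10) & 0x1F;     // 5-bit biased exponent
        int mant = bits & 0x03FF;           // 10-bit significand

        if (exp == 0x1F) {
            // All-ones exponent: infinity when the significand is 0, NaN otherwise.
            return Float.intBitsToFloat(sign | 0x7F800000 | (mant << 13));
        }
        if (exp == 0) {
            // Zero or subnormal half: the value is mant * 2^-24, exactly representable as a float.
            float magnitude = mant * 0x1.0p-24f;
            return sign != 0 ? -magnitude : magnitude;
        }
        // Normal half: re-bias the exponent from 15 to 127 and widen the significand.
        return Float.intBitsToFloat(sign | ((exp + 112) << 23) | (mant << 13));
    }

    public static void main(String[] args) {
        System.out.println(toFloat((short) 0x3C00)); // prints "1.0"
    }
}
```

Every half-precision value is exactly representable in single precision, so this direction needs no rounding.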
    • toFloat16

      public static short toFloat16(float f)
      Converts the specified single-precision float value into a half-precision float value. The following special cases are handled:
        • If the input is NaN, the returned value is NaN.
        • If the input is Float POSITIVE_INFINITY or Float NEGATIVE_INFINITY, the returned value is POSITIVE_INFINITY or NEGATIVE_INFINITY, respectively.
        • If the input is 0 (positive or negative), the returned value is POSITIVE_ZERO or NEGATIVE_ZERO.
        • If the input is less than MIN_VALUE, the returned value is flushed to POSITIVE_ZERO or NEGATIVE_ZERO.
        • If the input is less than MIN_NORMAL, the returned value is a denormalized (subnormal) half-precision float.
        • Otherwise, the returned value is rounded to the nearest representable half-precision float value.

      Parameters:
      f - The single-precision float value to convert to half-precision
      Returns:
      A half-precision float value
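
The cases listed for toFloat16 could be implemented roughly as follows. This is a sketch under assumptions rather than the Float16 source: FloatToHalfSketch and toHalf are hypothetical names, every NaN is mapped to the single quiet-NaN pattern 0x7E00, and rounding is to nearest with ties going to the even significand.

```java
// Hypothetical sketch of a float -> binary16 conversion; not the Float16 source.
final class FloatToHalfSketch {
    static short toHalf(float f) {
        int bits = Float.floatToIntBits(f);  // collapses all NaNs to one canonical pattern
        int sign = bits >>> 31;
        int exp = (bits >>> 23) & 0xFF;      // 8-bit biased exponent
        int mant = bits & 0x7FFFFF;          // 23-bit significand
        int outExp = 0;
        int outMant = 0;

        if (exp == 0xFF) {
            // Infinity or NaN.
            outExp = 0x1F;
            outMant = (mant != 0) ? 0x200 : 0;
        } else {
            int e = exp - 127 + 15;          // re-bias from float (127) to half (15)
            if (e >= 0x1F) {
                outExp = 0x1F;               // too large: overflow to infinity
            } else if (e <= 0) {
                if (e >= -10) {
                    // Below MIN_NORMAL: build a subnormal half.
                    mant |= 0x800000;        // restore the implicit leading 1
                    int shift = 14 - e;
                    outMant = mant >> shift;
                    int dropped = mant & ((1 << shift) - 1);
                    int halfway = 1 << (shift - 1);
                    // Round up when above halfway, or exactly halfway and outMant is odd.
                    if (dropped + (outMant & 1) > halfway) {
                        outMant++;
                    }
                }
                // Otherwise the value is too small and is flushed to signed zero.
            } else {
                outExp = e;
                outMant = mant >> 13;
                // Round to nearest even; a carry may spill into the exponent, which is intended.
                if ((mant & 0x1FFF) + (outMant & 1) > 0x1000) {
                    outMant++;
                }
            }
        }
        // Adding outMant (rather than OR-ing) lets a rounding carry increment the exponent.
        return (short) ((sign << 15) | ((outExp << 10) + outMant));
    }

    public static void main(String[] args) {
        // prints "3c00"
        System.out.println(Integer.toHexString(toHalf(1.0f) & 0xFFFF));
    }
}
```

For example, 65520.0f lies exactly halfway between the largest finite half (65504) and the next step, so under this rounding rule the carry propagates into the exponent and the result is the infinity pattern 0x7C00.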