@InterfaceAudience.Public public class OrderedBytes extends Object
Utility class that handles ordered byte arrays. That is, unlike Bytes, these methods produce byte arrays which maintain the sort
 order of the original values.
 Each value is encoded as one or more bytes. The first byte of the encoding, its meaning, and a terse description of the bytes that follow are given by the following table:
| Content Type | Encoding | 
|---|---|
| NULL | 0x05 | 
| negative infinity | 0x07 | 
| negative large | 0x08, ~E, ~M | 
| negative medium | 0x13-E, ~M | 
| negative small | 0x14, -E, ~M | 
| zero | 0x15 | 
| positive small | 0x16, ~-E, M | 
| positive medium | 0x17+E, M | 
| positive large | 0x22, E, M | 
| positive infinity | 0x23 | 
| NaN | 0x25 | 
| fixed-length 32-bit integer | 0x27, I | 
| fixed-length 64-bit integer | 0x28, I | 
| fixed-length 8-bit integer | 0x29 | 
| fixed-length 16-bit integer | 0x2a | 
| fixed-length 32-bit float | 0x30, F | 
| fixed-length 64-bit float | 0x31, F | 
| TEXT | 0x33, T | 
| variable length BLOB | 0x37, B | 
| byte-for-byte BLOB | 0x38, X | 
Each value that is a NULL encodes as a single byte of 0x05. Since every other value encoding begins with a byte greater than 0x05, this forces NULL values to sort first.
Each text value begins with a single byte of 0x33 and ends with a single byte of 0x00. There are zero or more intervening bytes that encode the text value. The intervening bytes are chosen so that the encoding will sort in the desired collating order. The intervening bytes may not contain a 0x00 character; the only 0x00 byte allowed in a text encoding is the final byte.
The text encoding ends in 0x00 to ensure that when one string is a prefix of another, the shorter string sorts first.
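The effect of the terminator is easiest to see when encoded values are concatenated into a compound key. The following is a minimal sketch, not the library's implementation: the class and method names are hypothetical, UTF-8 is assumed as the byte representation, the input is assumed to contain no U+0000 character, and only ascending order is considered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TextKeySketch {
  static final byte TEXT_HEADER = 0x33; // header byte listed in the table above
  static final byte TERMINATOR = 0x00;

  /** Header byte, UTF-8 bytes, terminator. Assumes val contains no U+0000. */
  static byte[] encodeText(String val) {
    byte[] utf8 = val.getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[utf8.length + 2];
    out[0] = TEXT_HEADER;
    System.arraycopy(utf8, 0, out, 1, utf8.length);
    out[out.length - 1] = TERMINATOR;
    return out;
  }

  /** Concatenate two encoded fields into one compound key. */
  static byte[] concat(byte[] a, byte[] b) {
    byte[] out = Arrays.copyOf(a, a.length + b.length);
    System.arraycopy(b, 0, out, a.length, b.length);
    return out;
  }

  public static void main(String[] args) {
    // ("ab", "z") should sort before ("abc", "a") because "ab" < "abc".
    byte[] k1 = concat(encodeText("ab"), encodeText("z"));
    byte[] k2 = concat(encodeText("abc"), encodeText("a"));
    System.out.println(Arrays.compareUnsigned(k1, k2) < 0);
  }
}
```

Without the terminator, the raw bytes of "abz" would compare greater than those of "abca", so the second field would leak into the comparison of the first.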
There are two encoding strategies for binary fields, referred to as "BlobVar" and "BlobCopy". BlobVar is less efficient in both space and encoding time. It has no limitations on the range of encoded values. BlobCopy is a byte-for-byte copy of the input data followed by a termination byte. It is extremely fast to encode and decode. It carries the restriction of not allowing a 0x00 value in the input byte[] as this value is used as the termination byte.
 "BlobVar" encodes the input byte[] in a manner similar to a variable length
 integer encoding. As with the other OrderedBytes encodings,
 the first encoded byte is used to indicate what kind of value follows. This
 header byte is 0x37 for BlobVar encoded values. As with the traditional
 varint encoding, the most significant bit of each subsequent encoded
 byte is used as a continuation marker. The 7 remaining bits
 contain the 7 most significant bits of the first unencoded byte. The next
 encoded byte starts with a continuation marker in the MSB. The least
 significant bit from the first unencoded byte follows, and the remaining 6
 bits contain the 6 MSBs of the second unencoded byte. The encoding
 continues, packing every 7 input bytes into 8 encoded bytes. The MSB of the final
 encoded byte contains a termination marker rather than a continuation
 marker, and any remaining bits from the final input byte. Any trailing bits
 in the final encoded byte are zeros.
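The bit packing described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the class name is hypothetical, the continuation marker is assumed to be a 1 bit and the termination marker a 0 bit, only ascending order is handled, and the empty-input case is not treated specially.

```java
final class BlobVarSketch {
  static final byte HEADER = 0x37; // BlobVar header byte named in the text above

  /** Pack the input 7 bits at a time, MSB first, after the header byte. */
  static byte[] encode(byte[] in) {
    int totalBits = in.length * 8;
    int payload = (totalBits + 6) / 7; // ceil(totalBits / 7)
    byte[] out = new byte[1 + payload];
    out[0] = HEADER;
    for (int i = 0; i < payload; i++) {
      int b = 0;
      for (int k = 0; k < 7; k++) {
        int bit = i * 7 + k;
        // Bits past the end of the input are zero padding.
        int v = bit < totalBits ? (in[bit / 8] >> (7 - bit % 8)) & 1 : 0;
        b = (b << 1) | v;
      }
      boolean last = (i == payload - 1);
      // Assumption: MSB 1 marks continuation, MSB 0 marks the final byte.
      out[1 + i] = (byte) (last ? b : (b | 0x80));
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] enc = encode(new byte[] {(byte) 0xFF});
    for (byte x : enc) {
      System.out.printf("%02x ", x & 0xFF);
    }
    System.out.println();
  }
}
```

Under these assumptions a single 0xFF input byte becomes the header, then 0xFF (continuation bit plus the 7 most significant bits), then 0x40 (termination bit, the remaining input bit, and zero padding).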
 
"BlobCopy" is a simple byte-for-byte copy of the input data. It uses 0x38 as the header byte, and is terminated by 0x00 in the DESCENDING case. This alternative encoding is faster and more space-efficient, but it cannot accept values containing a 0x00 byte in DESCENDING order.
 Numeric values must be coded so as to sort in numeric order. We assume that
 numeric values can be both integer and floating point values. Clients must
 be careful to use inspection methods for encoded values (such as
 isNumericInfinite(PositionedByteRange) and
 isNumericNaN(PositionedByteRange)) to protect against decoding
 values into objects which do not support these numeric concepts (such as
 Long and BigDecimal).
 
Simplest cases first: If the numeric value is a NaN, then the encoding is a single byte of 0x25. This causes NaN values to sort after every other numeric value.
If the numeric value is a negative infinity then the encoding is a single byte of 0x07. Since every other numeric value, including NaN (0x25), has a larger initial byte, this encoding ensures that negative infinity sorts before every other numeric value.
If the numeric value is a positive infinity then the encoding is a single byte of 0x23. Every other numeric value encoding except NaN (0x25) begins with a smaller byte, ensuring that positive infinity sorts last among numeric values other than NaN. Both 0x23 and 0x25 are smaller than 0x33, the initial byte of a text value, ensuring that every numeric value sorts before every text value.
If the numeric value is exactly zero then it is encoded as a single byte of 0x15. Finite negative values will have initial bytes of 0x08 through 0x14 and finite positive values will have initial bytes of 0x16 through 0x22.
For all numeric values, we compute a mantissa M and an exponent E. The mantissa is a base-100 representation of the value. The exponent E determines where to put the decimal point.
Each centimal digit of the mantissa is stored in a byte. If the value of the centimal digit is X (hence X≥0 and X≤99) then the byte value will be 2*X+1 for every byte of the mantissa, except for the last byte which will be 2*X+0. The mantissa must be the minimum number of bytes necessary to represent the value; trailing X==0 digits are omitted. This means that the mantissa will never contain a byte with the value 0x00.
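As a small illustration of the byte rule for mantissa digits, the sketch below maps a list of centimal digits to bytes. The class and method names are hypothetical, and the digits are assumed to be already extracted from the value.

```java
final class MantissaSketch {
  /** Map centimal digits (each 0..99) to mantissa bytes, omitting trailing zero digits. */
  static byte[] mantissaBytes(int[] centimalDigits) {
    int n = centimalDigits.length;
    while (n > 0 && centimalDigits[n - 1] == 0) {
      n--; // trailing X == 0 digits are omitted
    }
    byte[] out = new byte[n];
    for (int i = 0; i < n; i++) {
      int x = centimalDigits[i];
      if (x < 0 || x > 99) {
        throw new IllegalArgumentException("centimal digit out of range: " + x);
      }
      // 2*X+1 for every byte except the last, which is 2*X+0.
      out[i] = (byte) (i == n - 1 ? 2 * x : 2 * x + 1);
    }
    return out;
  }

  public static void main(String[] args) {
    // Digits 12, 34, 50 become bytes 25, 69, 100.
    System.out.println(java.util.Arrays.toString(mantissaBytes(new int[] {12, 34, 50})));
  }
}
```

Because the last digit is never zero after trimming, the final byte is at least 2, and every earlier byte is odd, so no byte equals 0x00.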
If we assume all digits of the mantissa occur to the right of the decimal point, then the exponent E is the power of one hundred by which one must multiply the mantissa to recover the original value.
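To make the relationship between E and M concrete, the following sketch derives the exponent and the centimal digits of a positive BigDecimal. It is a hedged illustration only: the class and method names are hypothetical, negative values and zero are rejected, and no rounding or precision limit is applied.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

final class CentimalSketch {
  private static final BigDecimal HUNDREDTH = new BigDecimal("0.01");

  /** Power of one hundred E such that val = M * 100^E with 0.01 <= M < 1. */
  static int exponent(BigDecimal val) {
    if (val.signum() <= 0) {
      throw new IllegalArgumentException("sketch handles positive values only");
    }
    int e = 0;
    BigDecimal m = val;
    while (m.compareTo(BigDecimal.ONE) >= 0) {
      m = m.movePointLeft(2); // divide by 100
      e++;
    }
    while (m.compareTo(HUNDREDTH) < 0) {
      m = m.movePointRight(2); // multiply by 100
      e--;
    }
    return e;
  }

  /** Base-100 digits of the mantissa, most significant first. */
  static List<Integer> digits(BigDecimal val) {
    BigDecimal m = val.movePointLeft(2 * exponent(val)); // now 0.01 <= m < 1
    List<Integer> out = new ArrayList<>();
    while (m.signum() != 0) {
      m = m.movePointRight(2);
      int d = m.intValue(); // integer part is the next centimal digit
      out.add(d);
      m = m.subtract(BigDecimal.valueOf(d));
    }
    return out;
  }

  public static void main(String[] args) {
    BigDecimal v = new BigDecimal("1234.5");
    // 1234.5 = 0.123450 * 100^2
    System.out.println(exponent(v) + " " + digits(v));
  }
}
```

For 1234.5 this yields E = 2 with digits 12, 34, 50, and for 0.0012 it yields E = -1 with the single digit 12.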
Values are classified as large, medium, or small according to the value of E. If E is 11 or more, the value is large. For E between 0 and 10, the value is medium. For E less than zero, the value is small.
Large positive values are encoded as a single byte 0x22 followed by E as a varint and then M. Medium positive values are a single byte of 0x17+E followed by M. Small positive values are encoded as a single byte 0x16 followed by the ones-complement of the varint for -E followed by M.
Small negative values are encoded as a single byte 0x14 followed by -E as a varint and then the ones-complement of M. Medium negative values are encoded as a byte 0x13-E followed by the ones-complement of M. Large negative values consist of the single byte 0x08 followed by the ones-complement of the varint encoding of E followed by the ones-complement of M.
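The choice of the leading byte for finite, non-zero values can be summarized in a few lines. This sketch only selects the header byte from the sign and E as described above; the class and method names are hypothetical, and the varint and mantissa bytes that follow the header are not produced here.

```java
final class NumericHeaderSketch {
  /** Leading byte for a finite, non-zero value with the given sign and exponent E. */
  static int header(boolean negative, int e) {
    if (e >= 11) {
      return negative ? 0x08 : 0x22; // large
    }
    if (e >= 0) {
      return negative ? 0x13 - e : 0x17 + e; // medium: E folded into the header
    }
    return negative ? 0x14 : 0x16; // small
  }

  public static void main(String[] args) {
    // Positive value with E = 2, such as 1234.5
    System.out.printf("0x%02x%n", header(false, 2));
  }
}
```

Medium headers therefore span 0x09 through 0x13 for negative values and 0x17 through 0x21 for positive values, which keeps larger magnitudes further from the zero byte 0x15 on both sides.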
All 4-byte integers are serialized to a 5-byte, fixed-width, sortable byte format. All 8-byte integers are serialized to the equivalent 9-byte format. Serialization is performed by writing a header byte, inverting the integer sign bit and writing the resulting bytes to the byte array in big endian order.
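A minimal sketch of the 5-byte integer format, assuming ascending order only. The class name is hypothetical and the header value 0x27 is simply the one listed for 32-bit integers in the table in this document.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class FixedInt32Sketch {
  static final byte HEADER = 0x27; // 32-bit integer header as listed in the table above

  /** Header byte, then the value with its sign bit inverted, big-endian. */
  static byte[] encode(int val) {
    return ByteBuffer.allocate(5)        // ByteBuffer is big-endian by default
        .put(HEADER)
        .putInt(val ^ Integer.MIN_VALUE) // invert the sign bit
        .array();
  }

  /** Reverse of encode: skip the header and invert the sign bit again. */
  static int decode(byte[] enc) {
    return ByteBuffer.wrap(enc, 1, 4).getInt() ^ Integer.MIN_VALUE;
  }

  public static void main(String[] args) {
    // Unsigned byte-wise comparison of the encodings follows signed integer order.
    System.out.println(Arrays.compareUnsigned(encode(-1), encode(0)) < 0);
    System.out.println(decode(encode(-42)));
  }
}
```

Inverting the sign bit maps Integer.MIN_VALUE to all zero bits and Integer.MAX_VALUE to all one bits, so unsigned byte comparison agrees with signed comparison.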
32-bit and 64-bit floating point numbers are encoded to a 5-byte and 9-byte encoding format, respectively. The format is identical, save for the precision respected in each step of the operation.
This format ensures the following total ordering of floating point values: Float.NEGATIVE_INFINITY < -Float.MAX_VALUE < ... < -Float.MIN_VALUE < -0.0 < +0.0 < Float.MIN_VALUE < ... < Float.MAX_VALUE < Float.POSITIVE_INFINITY < Float.NaN
Floating point numbers are encoded as specified in IEEE 754. A 32-bit single precision float consists of a sign bit, 8-bit unsigned exponent encoded in offset-127 notation, and a 23-bit significand. The format is described further in the Single Precision Floating Point Wikipedia page.
The value of a normal float is (-1)^(sign bit) × 2^(exponent - 127) × 1.significand
The IEEE 754 floating point format already preserves sort ordering for positive floating point numbers when the raw bytes are compared in most significant byte order. This is discussed further at http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
Thus, we need only ensure that negative numbers sort in the exact opposite order as positive numbers (so that, say, negative infinity is less than negative 1), and that all negative numbers compare less than any positive number. To accomplish this, we invert the sign bit of all floating point numbers, and we also invert the exponent and significand bits if the floating point number was negative.
 More specifically, we first store the floating point bits into a 32-bit int
 j using Float.floatToIntBits(float). This method collapses
 all NaNs into a single, canonical NaN value but otherwise leaves the bits
 unchanged. We then compute
 
j ^= (j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE
 which inverts the sign bit and XORs all other bits with the sign bit
 itself. Comparing the raw bytes of j in most significant byte
 order is equivalent to performing a single precision floating point
 comparison on the underlying bits (ignoring NaN comparisons, as NaNs don't
 compare equal to anything when performing floating point comparisons).
 
The resulting integer is then converted into a byte array by serializing the integer one byte at a time in most significant byte order. The serialized integer is prefixed by a single header byte. All serialized values are 5 bytes in length.
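The steps above can be sketched in a few lines of Java. This is an illustration for ascending order only, not the library's code: the class name is hypothetical, the header value 0x30 is the one listed for 32-bit floats in the table in this document, and the decode step is an assumed inverse of the transformation.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class FixedFloat32Sketch {
  static final byte HEADER = 0x30; // 32-bit float header as listed in the table above

  static byte[] encode(float val) {
    int j = Float.floatToIntBits(val); // collapses all NaNs to one canonical NaN
    // Flip the sign bit; for negative inputs, flip every other bit as well.
    j ^= (j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE;
    return ByteBuffer.allocate(5).put(HEADER).putInt(j).array(); // big-endian
  }

  static float decode(byte[] enc) {
    int j = ByteBuffer.wrap(enc, 1, 4).getInt();
    // Encoded negatives have a clear top bit, so test the inverted value.
    j ^= (~j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE;
    return Float.intBitsToFloat(j);
  }

  public static void main(String[] args) {
    float[] ordered = {
      Float.NEGATIVE_INFINITY, -1.0f, -0.0f, 0.0f, 1.0f, Float.POSITIVE_INFINITY, Float.NaN
    };
    boolean sorted = true;
    for (int i = 0; i + 1 < ordered.length; i++) {
      sorted &= Arrays.compareUnsigned(encode(ordered[i]), encode(ordered[i + 1])) < 0;
    }
    System.out.println(sorted);
  }
}
```

The arithmetic right shift replicates the sign bit across the word, which is what lets a single XOR handle both the positive and negative cases.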
 OrderedBytes encodings are heavily influenced by the SQLite4 Key Encoding.
 Slight deviations are made in the interest of order correctness and user
 extensibility. Fixed-width Long and Double encodings are based on
 implementations from the now defunct Orderly library.
 
| Modifier and Type | Field and Description | 
|---|---|
| private static byte | BLOB_COPY | 
| private static byte | BLOB_VAR | 
| static MathContext | DEFAULT_MATH_CONTEXT The context used to normalize BigDecimal values. | 
| private static BigDecimal | E32 | 
| private static BigDecimal | E8 | 
| private static BigDecimal | EN10 | 
| private static BigDecimal | EN2 | 
| private static byte | FIXED_FLOAT32 | 
| private static byte | FIXED_FLOAT64 | 
| private static byte | FIXED_INT16 | 
| private static byte | FIXED_INT32 | 
| private static byte | FIXED_INT64 | 
| private static byte | FIXED_INT8 | 
| static int | MAX_PRECISION Max precision guaranteed to fit into a long. | 
| private static byte | NAN | 
| private static byte | NEG_INF | 
| private static byte | NEG_LARGE | 
| private static byte | NEG_MED_MAX | 
| private static byte | NEG_MED_MIN | 
| private static byte | NEG_SMALL | 
| private static byte | NULL | 
| private static byte | POS_INF | 
| private static byte | POS_LARGE | 
| private static byte | POS_MED_MAX | 
| private static byte | POS_MED_MIN | 
| private static byte | POS_SMALL | 
| private static byte | TERM | 
| private static byte | TEXT | 
| static Charset | UTF8 | 
| private static byte | ZERO | 
| Constructor and Description | 
|---|
| OrderedBytes() | 
| Modifier and Type | Method and Description | 
|---|---|
| (package private) static int | blobVarDecodedLength(int len) Calculate the expected BlobVar decoded length based on encoded length. | 
| static int | blobVarEncodedLength(int len) Calculate the expected BlobVar encoded length based on unencoded length. | 
| static byte[] | decodeBlobCopy(PositionedByteRange src) Decode a Blob value, byte-for-byte copy. | 
| static byte[] | decodeBlobVar(PositionedByteRange src) Decode a blob value that was encoded using BlobVar encoding. | 
| static float | decodeFloat32(PositionedByteRange src) Decode a 32-bit floating point value using the fixed-length encoding. | 
| static double | decodeFloat64(PositionedByteRange src) Decode a 64-bit floating point value using the fixed-length encoding. | 
| static short | decodeInt16(PositionedByteRange src) Decode an int16 value. | 
| static int | decodeInt32(PositionedByteRange src) Decode an int32 value. | 
| static long | decodeInt64(PositionedByteRange src) Decode an int64 value. | 
| static byte | decodeInt8(PositionedByteRange src) Decode an int8 value. | 
| static BigDecimal | decodeNumericAsBigDecimal(PositionedByteRange src) Decode a BigDecimal value from the variable-length encoding. | 
| static double | decodeNumericAsDouble(PositionedByteRange src) Decode a primitive double value from the Numeric encoding. | 
| static long | decodeNumericAsLong(PositionedByteRange src) Decode a primitive long value from the Numeric encoding. | 
| private static BigDecimal | decodeNumericValue(PositionedByteRange src) Decode a BigDecimal from src. | 
| private static BigDecimal | decodeSignificand(PositionedByteRange src, int e, boolean comp) Read significand digits from src according to the magnitude of e. | 
| static String | decodeString(PositionedByteRange src) Decode a String value. | 
| static int | encodeBlobCopy(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord) Encode a Blob value as a byte-for-byte copy. | 
| static int | encodeBlobCopy(PositionedByteRange dst, byte[] val, Order ord) Encode a Blob value as a byte-for-byte copy. | 
| static int | encodeBlobVar(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord) Encode a Blob value using a modified varint encoding scheme. | 
| static int | encodeBlobVar(PositionedByteRange dst, byte[] val, Order ord) Encode a blob value using a modified varint encoding scheme. | 
| static int | encodeFloat32(PositionedByteRange dst, float val, Order ord) Encode a 32-bit floating point value using the fixed-length encoding. | 
| static int | encodeFloat64(PositionedByteRange dst, double val, Order ord) Encode a 64-bit floating point value using the fixed-length encoding. | 
| static int | encodeInt16(PositionedByteRange dst, short val, Order ord) Encode an int16 value using the fixed-length encoding. | 
| static int | encodeInt32(PositionedByteRange dst, int val, Order ord) Encode an int32 value using the fixed-length encoding. | 
| static int | encodeInt64(PositionedByteRange dst, long val, Order ord) Encode an int64 value using the fixed-length encoding. | 
| static int | encodeInt8(PositionedByteRange dst, byte val, Order ord) Encode an int8 value using the fixed-length encoding. | 
| static int | encodeNull(PositionedByteRange dst, Order ord) Encode a null value. | 
| static int | encodeNumeric(PositionedByteRange dst, BigDecimal val, Order ord) Encode a numerical value using the variable-length encoding. | 
| static int | encodeNumeric(PositionedByteRange dst, double val, Order ord) Encode a numerical value using the variable-length encoding. | 
| static int | encodeNumeric(PositionedByteRange dst, long val, Order ord) Encode a numerical value using the variable-length encoding. | 
| private static int | encodeNumericLarge(PositionedByteRange dst, BigDecimal val) Encode the large magnitude floating point number val using the key encoding. | 
| private static int | encodeNumericSmall(PositionedByteRange dst, BigDecimal val) Encode the small magnitude floating point number val using the key encoding. | 
| static int | encodeString(PositionedByteRange dst, String val, Order ord) Encode a String value. | 
| (package private) static long | getVaruint64(PositionedByteRange src, boolean comp) Decode a sequence of bytes in src as a varuint64. | 
| static boolean | isBlobCopy(PositionedByteRange src) Return true when the next encoded value in src uses BlobCopy encoding, false otherwise. | 
| static boolean | isBlobVar(PositionedByteRange src) Return true when the next encoded value in src uses BlobVar encoding, false otherwise. | 
| static boolean | isEncodedValue(PositionedByteRange src) Returns true when src appears to be positioned at an encoded value, false otherwise. | 
| static boolean | isFixedFloat32(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Float32 encoding, false otherwise. | 
| static boolean | isFixedFloat64(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Float64 encoding, false otherwise. | 
| static boolean | isFixedInt16(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Int16 encoding, false otherwise. | 
| static boolean | isFixedInt32(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Int32 encoding, false otherwise. | 
| static boolean | isFixedInt64(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Int64 encoding, false otherwise. | 
| static boolean | isFixedInt8(PositionedByteRange src) Return true when the next encoded value in src uses fixed-width Int8 encoding, false otherwise. | 
| static boolean | isNull(PositionedByteRange src) Return true when the next encoded value in src is null, false otherwise. | 
| static boolean | isNumeric(PositionedByteRange src) Return true when the next encoded value in src uses Numeric encoding, false otherwise. | 
| static boolean | isNumericInfinite(PositionedByteRange src) Return true when the next encoded value in src uses Numeric encoding and is Infinite, false otherwise. | 
| static boolean | isNumericNaN(PositionedByteRange src) Return true when the next encoded value in src uses Numeric encoding and is NaN, false otherwise. | 
| static boolean | isNumericZero(PositionedByteRange src) Return true when the next encoded value in src uses Numeric encoding and is 0, false otherwise. | 
| static boolean | isText(PositionedByteRange src) Return true when the next encoded value in src uses Text encoding, false otherwise. | 
| static int | length(PositionedByteRange buff) Return the number of encoded entries remaining in buff. | 
| (package private) static int | lengthVaruint64(PositionedByteRange src, boolean comp) Inspect src for an encoded varuint64 for its length in bytes. | 
| (package private) static BigDecimal | normalize(BigDecimal val) Strip all trailing zeros to ensure that no digit will be zero and round using our default context to ensure precision doesn't exceed max allowed. | 
| private static int | putUint32(PositionedByteRange dst, int val) Write a 32-bit unsigned integer to dst as 4 big-endian bytes. | 
| (package private) static int | putVaruint64(PositionedByteRange dst, long val, boolean comp) Encode an unsigned 64-bit integer val into dst. | 
| static int | skip(PositionedByteRange src) Skip src's position forward over one encoded value. | 
| private static int | skipSignificand(PositionedByteRange src, boolean comp) Skip src over the significand bytes. | 
| (package private) static int | skipVaruint64(PositionedByteRange src, boolean cmp) Skip src over the encoded varuint64. | 
| private static IllegalArgumentException | unexpectedHeader(byte header) Creates the standard exception when the encoded header byte is unexpected for the decoding context. | 
| private static int | unsignedCmp(long x1, long x2) Perform unsigned comparison between two long values. | 
private static final byte NULL
private static final byte NEG_INF
private static final byte NEG_LARGE
private static final byte NEG_MED_MIN
private static final byte NEG_MED_MAX
private static final byte NEG_SMALL
private static final byte ZERO
private static final byte POS_SMALL
private static final byte POS_MED_MIN
private static final byte POS_MED_MAX
private static final byte POS_LARGE
private static final byte POS_INF
private static final byte NAN
private static final byte FIXED_INT8
private static final byte FIXED_INT16
private static final byte FIXED_INT32
private static final byte FIXED_INT64
private static final byte FIXED_FLOAT32
private static final byte FIXED_FLOAT64
private static final byte TEXT
private static final byte BLOB_VAR
private static final byte BLOB_COPY
private static final byte TERM
private static final BigDecimal E8
private static final BigDecimal E32
private static final BigDecimal EN2
private static final BigDecimal EN10
public static final int MAX_PRECISION
Max precision guaranteed to fit into a long.
public static final MathContext DEFAULT_MATH_CONTEXT
The context used to normalize BigDecimal values.
public OrderedBytes()
private static IllegalArgumentException unexpectedHeader(byte header)
Creates the standard exception when the encoded header byte is unexpected for the decoding context.
header - value used in error message.
private static int unsignedCmp(long x1, long x2)
Perform unsigned comparison between two long values. Conforms to the same interface as CellComparator.
private static int putUint32(PositionedByteRange dst, int val)
Write a 32-bit unsigned integer to dst as 4 big-endian bytes.
@InterfaceAudience.Private static int putVaruint64(PositionedByteRange dst, long val, boolean comp)
Encode an unsigned 64-bit integer val into dst.
dst - The destination to which encoded bytes are written.
val - The value to write.
comp - Complement the encoded value when comp is true.
@InterfaceAudience.Private static int lengthVaruint64(PositionedByteRange src, boolean comp)
Inspect src for an encoded varuint64 for its length in bytes. Preserves the state of src.
src - source buffer
comp - if true, parse the complement of the value.
@InterfaceAudience.Private static int skipVaruint64(PositionedByteRange src, boolean cmp)
Skip src over the encoded varuint64.
src - source buffer
cmp - if true, parse the complement of the value.
@InterfaceAudience.Private static long getVaruint64(PositionedByteRange src, boolean comp)
Decode a sequence of bytes in src as a varuint64. Complement the encoded value when comp is true.
@InterfaceAudience.Private static BigDecimal normalize(BigDecimal val)
Strip all trailing zeros to ensure that no digit will be zero and round using our default context to ensure precision doesn't exceed max allowed. From Phoenix's NumberUtil. Returns a BigDecimal instance.
private static BigDecimal decodeSignificand(PositionedByteRange src, int e, boolean comp)
Read significand digits from src according to the magnitude of e.
src - The source from which to read encoded digits.
e - The magnitude of the first digit read.
comp - Treat encoded bytes as complements when comp is true.
IllegalArgumentException - when read exceeds the remaining length of src.
private static int skipSignificand(PositionedByteRange src, boolean comp)
Skip src over the significand bytes.
src - The source from which to read encoded digits.
comp - Treat encoded bytes as complements when comp is true.
private static int encodeNumericSmall(PositionedByteRange dst, BigDecimal val)
 Encode the small magnitude floating point number val using the
 key encoding. The caller guarantees that 1.0 > abs(val) > 0.0.
 
 A floating point value is encoded as an integer exponent E and a
 mantissa M. The original value is equal to (M * 100^E).
 E is set to the smallest value possible without making M
 greater than or equal to 1.0.
 
 For this routine, E will always be zero or negative, since the
 original value is less than one. The encoding written by this routine is
 the ones-complement of the varint of the negative of E followed
 by the mantissa:
 
Encoding: ~-E M
dst - The destination to which encoded digits are written.
val - The value to encode.
private static int encodeNumericLarge(PositionedByteRange dst, BigDecimal val)
 Encode the large magnitude floating point number val using
 the key encoding. The caller guarantees that val will be
 finite and abs(val) >= 1.0.
 
 A floating point value is encoded as an integer exponent E
 and a mantissa M. The original value is equal to
 (M * 100^E). E is set to the smallest value
 possible without making M greater than or equal to 1.0.
 
 Each centimal digit of the mantissa is stored in a byte. If the value of
 the centimal digit is X (hence X>=0 and
 X<=99) then the byte value will be 2*X+1 for
 every byte of the mantissa, except for the last byte which will be
 2*X+0. The mantissa must be the minimum number of bytes
 necessary to represent the value; trailing X==0 digits are
 omitted. This means that the mantissa will never contain a byte with the
 value 0x00.
 
 If E > 10, then this routine writes E as a
 varint followed by the mantissa as described above. Otherwise, if
 E <= 10, this routine only writes the mantissa and leaves
 the E value to be encoded as part of the opening byte of the
 field by the calling function.
 
   Encoding:  M       (if E<=10)
              E M     (if E>10)
 
 dst - The destination to which encoded digits are written.
val - The value to encode.
public static int encodeNumeric(PositionedByteRange dst, long val, Order ord)
Encode a numerical value using the variable-length encoding.
dst - The destination to which encoded digits are written.
val - The value to encode.
ord - The Order to respect while encoding val.
public static int encodeNumeric(PositionedByteRange dst, double val, Order ord)
Encode a numerical value using the variable-length encoding.
dst - The destination to which encoded digits are written.
val - The value to encode.
ord - The Order to respect while encoding val.
public static int encodeNumeric(PositionedByteRange dst, BigDecimal val, Order ord)
Encode a numerical value using the variable-length encoding.
dst - The destination to which encoded digits are written.
val - The value to encode.
ord - The Order to respect while encoding val.
private static BigDecimal decodeNumericValue(PositionedByteRange src)
Decode a BigDecimal from src. Assumes src encodes a value in Numeric encoding and is within the valid range of BigDecimal values. BigDecimal does not support NaN or Infinite values.
public static double decodeNumericAsDouble(PositionedByteRange src)
Decode a primitive double value from the Numeric encoding. Numeric encoding is based on BigDecimal; in the event the encoded value is larger than can be represented in a double, this method performs an implicit narrowing conversion as described in BigDecimal.doubleValue().
NullPointerException - when the encoded value is NULL.
IllegalArgumentException - when the encoded value is not a Numeric.
See also encodeNumeric(PositionedByteRange, double, Order) and BigDecimal.doubleValue().
public static long decodeNumericAsLong(PositionedByteRange src)
Decode a primitive long value from the Numeric encoding. Numeric encoding is based on BigDecimal; in the event the encoded value is larger than can be represented in a long, this method performs an implicit narrowing conversion as described in BigDecimal.longValue().
NullPointerException - when the encoded value is NULL.
IllegalArgumentException - when the encoded value is not a Numeric.
See also encodeNumeric(PositionedByteRange, long, Order) and BigDecimal.longValue().
public static BigDecimal decodeNumericAsBigDecimal(PositionedByteRange src)
Decode a BigDecimal value from the variable-length encoding.
IllegalArgumentException - when the encoded value is not a Numeric.
See also encodeNumeric(PositionedByteRange, BigDecimal, Order).
public static int encodeString(PositionedByteRange dst, String val, Order ord)