@InterfaceAudience.Private public class HFileBlock extends Object implements Cacheable
Reading HFile version 1 and 2 blocks, and writing version 2 blocks.
In version 1, blocks are compressed or left uncompressed as specified by the HFile's compression algorithm, with a type-specific
magic record stored at the beginning of the compressed data (i.e. one needs
to uncompress the compressed block to determine the block type). There is
only a single compression algorithm setting for all blocks. Offset and size
information from the block index are required to read a block.
In version 2, the compression algorithm is the same for all the blocks in the HFile, similarly to what was done in
version 1.
Modifier and Type | Class and Description |
---|---|
private static class |
HFileBlock.AbstractFSReader
A common implementation of some methods of
HFileBlock.FSReader and some
tools for implementing HFile format version-specific block readers. |
static interface |
HFileBlock.BlockIterator
An interface for iterating over HFileBlocks. |
static interface |
HFileBlock.BlockWritable
Something that can be written into a block.
|
static interface |
HFileBlock.FSReader
A full-fledged reader with iteration ability.
|
(package private) static class |
HFileBlock.FSReaderImpl
Reads version 2 blocks from the filesystem.
|
(package private) static class |
HFileBlock.Header |
private static class |
HFileBlock.PrefetchedHeader
We always prefetch the header of the next block, so that we know its
on-disk size in advance and can read it in one operation.
|
static class |
HFileBlock.Writer
Unified version 2
HFile block writer. |
Modifier and Type | Field and Description |
---|---|
(package private) static CacheableDeserializer<Cacheable> |
blockDeserializer |
private BlockType |
blockType
Type of block.
|
private ByteBuffer |
buf
The in-memory representation of the hfile block
|
static int |
BYTE_BUFFER_HEAP_SIZE |
(package private) static int |
CHECKSUM_SIZE
Each checksum value is an integer that can be stored in 4 bytes.
|
(package private) static int |
CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD
On a checksum failure on a Reader, this many succeeding read
requests switch back to using HDFS checksums before automatically re-enabling
HBase checksum verification.
|
private static int |
deserializerIdentifier |
static boolean |
DONT_FILL_HEADER |
(package private) static byte[] |
DUMMY_HEADER_NO_CHECKSUM |
static int |
ENCODED_HEADER_SIZE
The size of the block header when blockType is
BlockType.ENCODED_DATA. |
static int |
EXTRA_SERIALIZATION_SPACE |
private HFileContext |
fileContext
Metadata that holds information about the HFile block
|
static boolean |
FILL_HEADER |
private int |
nextBlockOnDiskSizeWithHeader
The on-disk size of the next block, including the header, obtained by
peeking into the first
HConstants.HFILEBLOCK_HEADER_SIZE bytes of the next block's
header, or -1 if unknown. |
private long |
offset
The offset of this block in the file.
|
private int |
onDiskDataSizeWithHeader
Size on disk of header + data.
|
private int |
onDiskSizeWithoutHeader
Size on disk excluding header, including checksum.
|
private long |
prevBlockOffset
The offset of the previous block on disk.
|
private int |
uncompressedSizeWithoutHeader
Size of pure data.
|
Constructor and Description |
---|
HFileBlock(BlockType blockType,
int onDiskSizeWithoutHeader,
int uncompressedSizeWithoutHeader,
long prevBlockOffset,
ByteBuffer buf,
boolean fillHeader,
long offset,
int onDiskDataSizeWithHeader,
HFileContext fileContext)
Creates a new
HFile block from the given fields. |
HFileBlock(ByteBuffer b,
boolean usesHBaseChecksum)
Creates a block from an existing buffer starting with a header.
|
HFileBlock(HFileBlock that)
Copy constructor.
|
Modifier and Type | Method and Description |
---|---|
private void |
allocateBuffer()
Always allocates a new buffer of the correct size.
|
boolean |
equals(Object comparison) |
void |
expectType(BlockType expectedType) |
BlockType |
getBlockType() |
ByteBuffer |
getBufferReadOnly()
Returns the buffer this block stores internally.
|
ByteBuffer |
getBufferReadOnlyWithHeader()
Returns the buffer of this block, including header data.
|
(package private) ByteBuffer |
getBufferWithHeader()
Returns a byte buffer of this block, including header data and checksum, positioned at
the beginning of header.
|
ByteBuffer |
getBufferWithoutHeader()
Returns a buffer that does not include the header or checksum.
|
(package private) int |
getBytesPerChecksum() |
DataInputStream |
getByteStream() |
(package private) byte |
getChecksumType() |
DataBlockEncoding |
getDataBlockEncoding() |
short |
getDataBlockEncodingId() |
CacheableDeserializer<Cacheable> |
getDeserializer()
Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
|
byte[] |
getDummyHeaderForVersion()
Return the appropriate DUMMY_HEADER for the minor version
|
private static byte[] |
getDummyHeaderForVersion(boolean usesHBaseChecksum)
Return the appropriate DUMMY_HEADER for the minor version
|
HFileContext |
getHFileContext() |
int |
getNextBlockOnDiskSizeWithHeader() |
long |
getOffset() |
(package private) int |
getOnDiskDataSizeWithHeader() |
int |
getOnDiskSizeWithHeader() |
int |
getOnDiskSizeWithoutHeader() |
long |
getPrevBlockOffset() |
int |
getSerializedLength()
Returns the length of the ByteBuffer required to serialize the object.
|
int |
getUncompressedSizeWithoutHeader() |
private boolean |
hasNextBlockHeader()
Return true when this buffer includes next block's header.
|
int |
headerSize()
Returns the size of this block header.
|
static int |
headerSize(boolean usesHBaseChecksum)
Maps a minor version to the size of the header.
|
long |
heapSize() |
boolean |
isUnpacked()
Return true when this block's buffer has been unpacked, false otherwise.
|
private void |
overwriteHeader()
Rewinds
buf and writes first 4 header fields. |
(package private) static boolean |
positionalReadWithExtra(org.apache.hadoop.fs.FSDataInputStream in,
long position,
byte[] buf,
int bufOffset,
int necessaryLen,
int extraLen)
Read from an input stream.
|
static boolean |
readWithExtra(InputStream in,
byte[] buf,
int bufOffset,
int necessaryLen,
int extraLen)
Read from an input stream.
|
(package private) void |
sanityCheck()
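Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields.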
|
private void |
sanityCheckAssertion(BlockType valueFromBuf,
BlockType valueFromField) |
private void |
sanityCheckAssertion(long valueFromBuf,
long valueFromField,
String fieldName) |
void |
serialize(ByteBuffer destination)
Serializes its data into destination.
|
void |
serializeExtraInfo(ByteBuffer destination) |
String |
toString() |
(package private) static String |
toStringHeader(ByteBuffer buf)
Convert the contents of the block header into a human readable string.
|
(package private) int |
totalChecksumBytes()
Calculate the number of bytes required to store all the checksums for this block.
|
private static int |
totalChecksumBytes(HFileContext fileContext,
int onDiskDataSizeWithHeader) |
(package private) HFileBlock |
unpack(HFileContext fileContext,
HFileBlock.FSReader reader)
Retrieves the decompressed/decrypted view of this block.
|
private static void |
validateOnDiskSizeWithoutHeader(int expectedOnDiskSizeWithoutHeader,
int actualOnDiskSizeWithoutHeader,
ByteBuffer buf,
long offset)
Called after reading a block with provided onDiskSizeWithHeader.
|
static void |
verifyUncompressed(ByteBuffer buf,
boolean useHBaseChecksum)
An additional sanity-check in case no compression or encryption is being used.
|
static final int CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD
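The fallback described for CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD in the field summary can be pictured as a countdown. The sketch below is not HBase code: the class `ChecksumFallback`, its method names, and the threshold passed by the caller are hypothetical, and it only illustrates the idea that after a failure the next N reads use HDFS checksums before HBase checksum verification is re-enabled.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a checksum-fallback countdown; not the HBase implementation. */
final class ChecksumFallback {
    private final int threshold; // assumed number of reads served with HDFS checksums after a failure
    private final AtomicInteger remaining = new AtomicInteger(0);

    ChecksumFallback(int threshold) {
        this.threshold = threshold;
    }

    /** Called when HBase-level checksum verification fails on a read. */
    void onChecksumFailure() {
        remaining.set(threshold);
    }

    /**
     * Decides which checksum path the next read uses.
     * Returns true for HBase checksums, false while falling back to HDFS checksums.
     */
    boolean useHBaseChecksumForNextRead() {
        while (true) {
            int left = remaining.get();
            if (left == 0) {
                return true; // countdown finished: HBase checksums are active again
            }
            if (remaining.compareAndSet(left, left - 1)) {
                return false; // this read is one of the fallback reads
            }
        }
    }
}
```

With a threshold of 3, the three reads following a failure would go through HDFS checksums and the fourth would return to HBase checksums.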
public static final boolean FILL_HEADER
public static final boolean DONT_FILL_HEADER
public static final int ENCODED_HEADER_SIZE
The size of the block header when blockType is BlockType.ENCODED_DATA. This extends the normal header by adding the id of the encoder.
static final byte[] DUMMY_HEADER_NO_CHECKSUM
public static final int BYTE_BUFFER_HEAP_SIZE
public static final int EXTRA_SERIALIZATION_SPACE
static final int CHECKSUM_SIZE
static final CacheableDeserializer<Cacheable> blockDeserializer
private static final int deserializerIdentifier
private BlockType blockType
private int onDiskSizeWithoutHeader
private final int uncompressedSizeWithoutHeader
private final long prevBlockOffset
private final int onDiskDataSizeWithHeader
Size on disk of header + data, calculated from onDiskSizeWithoutHeader when using HDFS checksum.
private ByteBuffer buf
private HFileContext fileContext
private long offset
private int nextBlockOnDiskSizeWithHeader
The on-disk size of the next block, including the header, obtained by peeking into the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the next block's header, or -1 if unknown.
HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuffer buf, boolean fillHeader, long offset, int onDiskDataSizeWithHeader, HFileContext fileContext)
Creates a new HFile block from the given fields. This constructor is mostly used when the block data has already been read and uncompressed, and is sitting in a byte buffer.
blockType - the type of this block, see BlockType
onDiskSizeWithoutHeader - see onDiskSizeWithoutHeader
uncompressedSizeWithoutHeader - see uncompressedSizeWithoutHeader
prevBlockOffset - see prevBlockOffset
buf - block header (HConstants.HFILEBLOCK_HEADER_SIZE bytes) followed by uncompressed data
fillHeader - when true, parse buf and override the first 4 header fields
offset - the file offset the block was read from
onDiskDataSizeWithHeader - see onDiskDataSizeWithHeader
fileContext - HFile metadata
HFileBlock(HFileBlock that)
Copy constructor. Creates a shallow copy of that's buffer.
HFileBlock(ByteBuffer b, boolean usesHBaseChecksum) throws IOException
Creates a block from an existing buffer starting with a header.
IOException
public BlockType getBlockType()
getBlockType in interface Cacheable
public short getDataBlockEncodingId()
public int getOnDiskSizeWithHeader()
public int getOnDiskSizeWithoutHeader()
public int getUncompressedSizeWithoutHeader()
public long getPrevBlockOffset()
private void overwriteHeader()
Rewinds buf and writes the first 4 header fields. The position of buf is modified as a side effect.
public ByteBuffer getBufferWithoutHeader()
public ByteBuffer getBufferReadOnly()
Returns the buffer this block stores internally. Used in CompoundBloomFilter to avoid object creation on every Bloom filter lookup, but has to be used with caution. Checksum data is not included in the returned buffer but header data is.
public ByteBuffer getBufferReadOnlyWithHeader()
Returns the buffer of this block, including header data. Used in BucketCache to avoid buffer copy.
ByteBuffer getBufferWithHeader()
private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) throws IOException
IOException
private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) throws IOException
IOException
void sanityCheck() throws IOException
Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields. Assumes a packed block structure. This function is primarily for testing and debugging, and is not thread-safe, because it alters the internal buffer pointer.
IOException
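The thread-safety note on sanityCheck() comes from moving the position of a shared ByteBuffer. As a minimal sketch of the alternative, the code below reads header fields through `ByteBuffer.duplicate()`, which shares the bytes but keeps its own position. The class `HeaderPeek` is hypothetical, and the layout it reads (an 8-byte magic record, two 4-byte sizes, an 8-byte previous-block offset) is an assumption made for illustration; this page only states that there are four leading header fields.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; the header layout used here is an assumption for illustration. */
final class HeaderPeek {
    // Assumed layout: 8-byte magic, then two 4-byte sizes, then an 8-byte offset.
    static final int MAGIC_LENGTH = 8;

    final int onDiskSizeWithoutHeader;
    final int uncompressedSizeWithoutHeader;
    final long prevBlockOffset;

    private HeaderPeek(int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader,
                       long prevBlockOffset) {
        this.onDiskSizeWithoutHeader = onDiskSizeWithoutHeader;
        this.uncompressedSizeWithoutHeader = uncompressedSizeWithoutHeader;
        this.prevBlockOffset = prevBlockOffset;
    }

    /** Reads the assumed header fields without moving the caller's buffer position. */
    static HeaderPeek read(ByteBuffer shared) {
        ByteBuffer view = shared.duplicate(); // shares bytes, but has its own position and limit
        view.position(MAGIC_LENGTH);          // skip the magic record on the private view only
        int onDisk = view.getInt();
        int uncompressed = view.getInt();
        long prev = view.getLong();
        return new HeaderPeek(onDisk, uncompressed, prev);
    }
}
```

Because only the duplicate's position changes, a caller that was part-way through reading the shared buffer is not disturbed by the check.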
private static void validateOnDiskSizeWithoutHeader(int expectedOnDiskSizeWithoutHeader, int actualOnDiskSizeWithoutHeader, ByteBuffer buf, long offset) throws IOException
IOException
HFileBlock unpack(HFileContext fileContext, HFileBlock.FSReader reader) throws IOException
IOException
private boolean hasNextBlockHeader()
private void allocateBuffer()
public boolean isUnpacked()
public static void verifyUncompressed(ByteBuffer buf, boolean useHBaseChecksum) throws IOException
IOException
public void expectType(BlockType expectedType) throws IOException
expectedType - the expected type of this block
IOException - if this block's type is different than expected
public long getOffset()
public DataInputStream getByteStream()
public long heapSize()
public static boolean readWithExtra(InputStream in, byte[] buf, int bufOffset, int necessaryLen, int extraLen) throws IOException
Read from an input stream. Analogous to IOUtils.readFully(InputStream, byte[], int, int), but specifies a number of "extra" bytes that would be desirable but not absolutely necessary to read.
in - the input stream to read from
buf - the buffer to read into
bufOffset - the destination offset in the buffer
necessaryLen - the number of bytes that are absolutely necessary to read
extraLen - the number of extra bytes that would be nice to read
IOException - if failed to read the necessary bytes
static boolean positionalReadWithExtra(org.apache.hadoop.fs.FSDataInputStream in, long position, byte[] buf, int bufOffset, int necessaryLen, int extraLen) throws IOException
Read from an input stream. Analogous to IOUtils.readFully(InputStream, byte[], int, int), but uses positional read and specifies a number of "extra" bytes that would be desirable but not absolutely necessary to read.
in - the input stream to read from
position - the position within the stream from which to start reading
buf - the buffer to read into
bufOffset - the destination offset in the buffer
necessaryLen - the number of bytes that are absolutely necessary to read
extraLen - the number of extra bytes that would be nice to read
IOException - if failed to read the necessary bytes
public int getNextBlockOnDiskSizeWithHeader()
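The "necessary plus extra" contract of readWithExtra can be sketched with plain java.io. This is an illustrative re-implementation under assumptions, not the HBase source: the class name `ReadWithExtraSketch` is hypothetical, and the meaning of the boolean result (true only when the extra bytes were read as well) is an assumption, since this page does not document the return value.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical re-implementation for illustration; not the HBase source. */
final class ReadWithExtraSketch {
    /**
     * Reads necessaryLen bytes, and up to extraLen more if the stream has them.
     * Assumed return value: true if the extra bytes were read too, false otherwise.
     */
    static boolean readWithExtra(InputStream in, byte[] buf, int bufOffset,
                                 int necessaryLen, int extraLen) throws IOException {
        int bytesRemaining = necessaryLen + extraLen;
        while (bytesRemaining > 0) {
            int ret = in.read(buf, bufOffset, bytesRemaining);
            if (ret == -1) {
                if (bytesRemaining <= extraLen) {
                    break; // all necessary bytes arrived; only extra bytes are missing
                }
                throw new IOException("Premature EOF: " + (bytesRemaining - extraLen)
                    + " necessary byte(s) could not be read");
            }
            bufOffset += ret;
            bytesRemaining -= ret;
        }
        return bytesRemaining <= 0;
    }
}
```

A reader could use the extra bytes to pick up the next block's header in the same call, and fall back to a separate read when the method reports that they were not available.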
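public int getSerializedLength()
Returns the length of the ByteBuffer required to serialize the object.
getSerializedLength in interface Cacheable
public void serialize(ByteBuffer destination)
Serializes its data into destination.
public void serializeExtraInfo(ByteBuffer destination)
public CacheableDeserializer<Cacheable> getDeserializer()
Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
getDeserializer in interface Cacheable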
public DataBlockEncoding getDataBlockEncoding()
byte getChecksumType()
int getBytesPerChecksum()
int getOnDiskDataSizeWithHeader()
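int totalChecksumBytes()
Calculate the number of bytes required to store all the checksums for this block. Each checksum value is a 4-byte integer (CHECKSUM_SIZE).
private static int totalChecksumBytes(HFileContext fileContext, int onDiskDataSizeWithHeader)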
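The arithmetic behind totalChecksumBytes can be sketched as follows, assuming one CHECKSUM_SIZE-byte checksum per chunk of bytesPerChecksum data bytes, with a trailing partial chunk also receiving a checksum. That chunking rule and the class name `ChecksumMath` are assumptions for illustration; only CHECKSUM_SIZE being 4 bytes comes from this page.

```java
/** Hypothetical arithmetic sketch; names other than CHECKSUM_SIZE are illustrative. */
final class ChecksumMath {
    static final int CHECKSUM_SIZE = 4; // each checksum is a 4-byte integer, per the field summary

    /**
     * Bytes needed to store checksums for dataSize bytes, assuming one checksum
     * per bytesPerChecksum chunk (the last chunk may be partial).
     */
    static long totalChecksumBytes(long dataSize, int bytesPerChecksum) {
        if (dataSize < 0 || bytesPerChecksum <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and chunk size positive");
        }
        long chunks = (dataSize + bytesPerChecksum - 1) / bytesPerChecksum; // ceiling division
        return chunks * CHECKSUM_SIZE;
    }
}
```

For example, 65,569 bytes of header plus data with 16,384 bytes per checksum spans five chunks under this assumption, so 20 checksum bytes would be needed.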
public int headerSize()
public static int headerSize(boolean usesHBaseChecksum)
public byte[] getDummyHeaderForVersion()
private static byte[] getDummyHeaderForVersion(boolean usesHBaseChecksum)
public HFileContext getHFileContext()
static String toStringHeader(ByteBuffer buf) throws IOException
IOException
Copyright © 2007–2019 The Apache Software Foundation. All rights reserved.