Class HFileBlock
- All Implemented Interfaces: HeapSize, Cacheable, HBaseReferenceCounted, org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
- Direct Known Subclasses: ExclusiveMemHFileBlock, SharedMemHFileBlock
Cacheable blocks of an HFile version 2 file. Version 2 was introduced in hbase-0.92.0.
Version 1 was the original file block. Version 2 was introduced when we changed the hbase file format to support multi-level block indexes and compound bloom filters (HBASE-3857). Support for Version 1 was removed in hbase-1.3.0.
HFileBlock: Version 2
In version 2, a block is structured as follows:
- Header: See Writer#putHeader() for where the header is written; the total header size is HFILEBLOCK_HEADER_SIZE.
  - 0. blockType: Magic record identifying the BlockType (8 bytes), e.g. DATABLK*
  - 1. onDiskSizeWithoutHeader: Compressed, a.k.a. 'on disk', block size, excluding the header but including trailing checksum bytes (4 bytes)
  - 2. uncompressedSizeWithoutHeader: Uncompressed block size, excluding the header and excluding checksum bytes (4 bytes)
  - 3. prevBlockOffset: The offset of the previous block of the same type (8 bytes). This is used to navigate to the previous block without having to go to the block index
  - 4. For minorVersions >= 1, the ordinal describing the checksum type (1 byte)
  - 5. For minorVersions >= 1, the number of data bytes per checksum chunk (4 bytes)
  - 6. onDiskDataSizeWithHeader: For minorVersions >= 1, the size of data 'on disk', including the header, excluding checksums (4 bytes)
- Raw/Compressed/Encrypted/Encoded data: The compression algorithm is the same for all the blocks in an HFile. If compression is NONE, this is just raw, serialized Cells.
- Tail: For minorVersions >= 1, a series of 4-byte checksums, one for each run of bytes whose length is given by bytesPerChecksum.
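The header layout above can be read back field by field. The following is a minimal sketch using only `java.nio`, not HBase code: the class `HeaderSketch` and its field names are hypothetical, the 33-byte total is simply the sum of the field widths listed above, and big-endian byte order is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not an HBase class): decodes the seven header fields listed above. */
final class HeaderSketch {
    // Sum of the field widths given above: 8 + 4 + 4 + 8 + 1 + 4 + 4.
    static final int HEADER_SIZE = 33;

    final String blockTypeMagic;
    final int onDiskSizeWithoutHeader;
    final int uncompressedSizeWithoutHeader;
    final long prevBlockOffset;
    final byte checksumType;
    final int bytesPerChecksum;
    final int onDiskDataSizeWithHeader;

    private HeaderSketch(ByteBuffer b) {
        byte[] magic = new byte[8];
        b.get(magic);                                   // field 0
        blockTypeMagic = new String(magic, StandardCharsets.US_ASCII);
        onDiskSizeWithoutHeader = b.getInt();           // field 1
        uncompressedSizeWithoutHeader = b.getInt();     // field 2
        prevBlockOffset = b.getLong();                  // field 3
        checksumType = b.get();                         // field 4 (minor version >= 1)
        bytesPerChecksum = b.getInt();                  // field 5 (minor version >= 1)
        onDiskDataSizeWithHeader = b.getInt();          // field 6 (minor version >= 1)
    }

    /** Parses a header at buf's current position without moving buf's position or limit. */
    static HeaderSketch parse(ByteBuffer buf) {
        if (buf.remaining() < HEADER_SIZE) {
            throw new IllegalArgumentException("need " + HEADER_SIZE + " bytes, got " + buf.remaining());
        }
        // duplicate() shares the bytes but has its own position and limit (assumed big-endian).
        return new HeaderSketch(buf.duplicate());
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.allocate(HEADER_SIZE);
        raw.put("DATABLK*".getBytes(StandardCharsets.US_ASCII));
        raw.putInt(67).putInt(200).putLong(-1L).put((byte) 1).putInt(16384).putInt(96);
        raw.flip();
        HeaderSketch h = parse(raw);
        System.out.println(h.blockTypeMagic + " onDiskSizeWithoutHeader=" + h.onDiskSizeWithoutHeader);
    }
}
```

The sample values are made up but internally consistent: 96 bytes of header plus data, one 4-byte checksum, so 67 bytes on disk after the 33-byte header.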
Caching
Caches store whole blocks, with trailing checksums if any. We then tack on some metadata, the content of BLOCK_METADATA_SPACE: a flag that is set if we are doing 'hbase' checksums, and then the offset into the file, which is needed to re-make the cache key when we return the block to the cache as 'done'. See Cacheable.serialize(ByteBuffer, boolean) and Cacheable.getDeserializer().
TODO: Should we cache the checksums? Down in Writer#getBlockForCaching(CacheConfig), where we make a block to cache-on-write, there is an attempt at turning off checksums. This is not the only place we get blocks to cache. We will also cache the raw return from an hdfs read. In this case, the checksums may be present. If the cache is backed by something that doesn't do ECC, say an SSD, we might want to preserve checksums. For now this is an open question.
TODO: Over in BucketCache, we save a block allocation by doing a custom serialization. Be sure to change it if serialization changes in here. Could we add a method here that takes an IOEngine and that then serializes to it rather than expose our internals over in BucketCache? IOEngine is in the bucket subpackage. Pull it up? Then this class knows about bucketcache. Ugh.
-
Nested Class Summary
Columns: Modifier and Type, Class, Description
- static final class
- (package private) static interface: Iterator for reading HFileBlocks in the load-on-open section, such as the root data index block, meta index block, file info block, etc.
- (package private) static interface: Something that can be written into a block.
- (package private) static interface: An HFile block reader with iteration ability.
- (package private) static class: Reads version 2 HFile blocks from the filesystem.
- (package private) static class
- private static class: Data-structure to use caching the header of the NEXT block.
- (package private) static class: Unified version 2 HFile block writer.
-
Field Summary
Columns: Modifier and Type, Field, Description
- private ByteBuffAllocator allocator
- static final CacheableDeserializer<Cacheable> BLOCK_DESERIALIZER: Used deserializing blocks from Cache.
- static final int BLOCK_METADATA_SPACE: Space for metadata on a block that gets stored along with the block when we cache it.
- private BlockType blockType: Type of block.
- private ByteBuff bufWithoutChecksum: The in-memory representation of the hfile block.
- (package private) static final int CHECKSUM_SIZE: Each checksum value is an integer that can be stored in 4 bytes.
- (package private) static final int CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD: On a checksum failure, do these many succeeding read requests using hdfs checksums before auto-reenabling hbase checksum verification.
- private static final int DESERIALIZER_IDENTIFIER
- static final boolean DONT_FILL_HEADER
- (package private) static final byte[] DUMMY_HEADER_NO_CHECKSUM
- private final HFileContext fileContext: Meta data that holds meta information on the hfileblock.
- static final boolean FILL_HEADER
- static final long FIXED_OVERHEAD
- private static final org.slf4j.Logger LOG
- static final int MULTI_BYTE_BUFFER_HEAP_SIZE
- private int nextBlockOnDiskSize: The on-disk size of the next block, including the header and checksums if present.
- private long offset: The offset of this block in the file.
- private final int onDiskDataSizeWithHeader: Size on disk of header + data.
- private int onDiskSizeWithoutHeader: Size on disk excluding header, including checksum.
- private long prevBlockOffset: The offset of the previous block on disk.
- private final int totalChecksumBytes
- private int uncompressedSizeWithoutHeader: Size of pure data.
- private static int UNSET
-
Constructor Summary
Columns: Constructor, Description
- HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator): Creates a new HFile block from the given fields.
-
Method Summary
Columns: Modifier and Type, Method, Description
- private ByteBuffer addMetaData(ByteBuffer destination, boolean includeNextBlockMetadata): Adds metadata at current position (position is moved forward).
- private ByteBuff allocateBufferForUnpacking: Always allocates a new buffer of the correct size.
- private int computeTotalChecksumBytes
- private static HFileBlockBuilder createBuilder(HFileBlock blk, ByteBuff newBuff): Creates a new HFileBlockBuilder from the existing block and a new ByteBuff.
- (package private) static HFileBlock createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator): Creates a block from an existing buffer starting with a header.
- (package private) static HFileBlock deepCloneOnHeap
- boolean equals
- getBlockType: Returns the block type of this cached HFile block.
- getBufferReadOnly: Returns a read-only duplicate of the buffer this block stores internally ready to be read.
- getBufferWithoutHeader: Returns a buffer that does not include the header and checksum.
- (package private) int getBytesPerChecksum()
- (package private) DataInputStream getByteStream: Returns a byte stream reading the data (excluding header and checksum) of this block.
- (package private) byte getChecksumType()
- (package private) DataBlockEncoding getDataBlockEncoding
- (package private) short getDataBlockEncodingId(): Returns the data block encoding id that was used to encode this block.
- getDeserializer: Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
- (package private) byte[] getDummyHeaderForVersion(): Return the appropriate DUMMY_HEADER for the minor version.
- private static byte[] getDummyHeaderForVersion(boolean usesHBaseChecksum): Return the appropriate DUMMY_HEADER for the minor version.
- getMetaData: For use by bucketcache.
- (package private) int getNextBlockOnDiskSize()
- long getOffset: Cannot be UNSET.
- (package private) int getOnDiskDataSizeWithHeader: Returns the size of data on disk + header.
- int getOnDiskSizeWithHeader: Returns the on-disk size of header + data part + checksum.
- private static int getOnDiskSizeWithHeader(ByteBuff headerBuf, boolean checksumSupport): Parse total on disk size including header and checksum.
- (package private) int getOnDiskSizeWithoutHeader: Returns the on-disk size of the data part + checksum (header excluded).
- (package private) long getPrevBlockOffset(): Returns the offset of the previous block of the same type in the file, or -1 if unknown.
- int getSerializedLength: Returns the length of the ByteBuffer required to serialize the object.
- int getUncompressedSizeWithoutHeader: Returns the uncompressed size of data part (header and checksum excluded).
- int hashCode()
- int headerSize: Returns the size of this block header.
- static int headerSize(boolean usesHBaseChecksum): Maps a minor version to the size of the header.
- long heapSize(): Return the approximate 'exclusive deep size' of implementing object.
- boolean: Will be overridden by SharedMemHFileBlock or ExclusiveMemHFileBlock.
- boolean isUnpacked: Return true when this block's buffer has been unpacked, false otherwise.
- private void overwriteHeader: Rewinds buf and writes first 4 header fields.
- int refCnt(): Reference count of this Cacheable.
- boolean release(): Call ByteBuff.release() to decrease the reference count; if there is no other reference, it returns the ByteBuffer to the ByteBuffAllocator.
- retain(): Increase its reference count, and only when no reference we can free the object's memory.
- (package private) void sanityCheck: Checks if the block is internally consistent.
- private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName)
- private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField)
- (package private) void sanityCheckUncompressed: An additional sanity-check in case no compression or encryption is being used.
- void serialize(ByteBuffer destination, boolean includeNextBlockMetadata): Serializes its data into destination.
- private static HFileBlock shallowClone(HFileBlock blk, ByteBuff newBuf)
- toString()
- (package private) static String toStringHeader(ByteBuff buf): Convert the contents of the block header into a human readable string.
- (package private) int totalChecksumBytes(): Return the number of bytes required to store all the checksums for this block.
- touch(): Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks.
- (package private) HFileBlock unpack(HFileContext fileContext, HFileBlock.FSReader reader): Retrieves the decompressed/decrypted view of this block.
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.hadoop.hbase.nio.HBaseReferenceCounted: release, retain
-
Field Details
-
LOG
-
FIXED_OVERHEAD
-
blockType
Type of block. Header field 0. -
onDiskSizeWithoutHeader
Size on disk excluding header, including checksum. Header field 1. -
uncompressedSizeWithoutHeader
Size of pure data. Does not include header or checksums. Header field 2. -
prevBlockOffset
The offset of the previous block on disk. Header field 3. -
onDiskDataSizeWithHeader
Size on disk of header + data. Excludes checksum. Header field 6, OR calculated from onDiskSizeWithoutHeader when using HDFS checksum. -
bufWithoutChecksum
The in-memory representation of the hfile block. Can be on or offheap. Can be backed by a single ByteBuffer or by many. Make no assumptions.
Be careful reading from this buf. Duplicate and work on the duplicate or, if not, be sure to reset position and limit, else there will be trouble down the road.
TODO: Make this read-only once made.
We are using the ByteBuff type. ByteBuffer is not extensible, yet we need a ByteBuffer-like API across multiple ByteBuffers when reading from a cache such as BucketCache. So, we have this ByteBuff type. Unfortunately, it is spread all about HFileBlock. It would be good if it could be confined to cache use only, but that is hard to do.
NOTE: this ByteBuff includes the HFileBlock header and data, but excludes the checksum.
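The "duplicate and work on the duplicate" advice can be illustrated with plain `java.nio.ByteBuffer`, used here only as a stand-in for HBase's ByteBuff. This is a sketch; `DuplicateSketch` and `sumRange` are hypothetical names, not part of HFileBlock.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration: read from a shared buffer without disturbing its cursor. */
final class DuplicateSketch {
    /** Sums the bytes in [from, to) while leaving shared's position and limit untouched. */
    static int sumRange(ByteBuffer shared, int from, int to) {
        // duplicate() shares the underlying bytes but has an independent position and limit.
        ByteBuffer view = shared.duplicate();
        view.limit(to);
        view.position(from);
        int sum = 0;
        while (view.hasRemaining()) {
            sum += view.get();
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5});
        shared.position(2);
        System.out.println(sumRange(shared, 1, 4));   // sums 2 + 3 + 4
        System.out.println(shared.position());        // still 2
    }
}
```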
-
fileContext
Meta data that holds meta information on the hfileblock. -
offset
The offset of this block in the file. Populated by the reader for convenience of access. This offset is not part of the block header. -
nextBlockOnDiskSize
The on-disk size of the next block, including the header and checksums if present. UNSET if unknown. Blocks try to carry the size of the next block to read in this data member. Usually we get block sizes from the hfile index but sometimes the index is not available: e.g. when we read the indexes themselves (indexes are stored in blocks, we do not have an index for the indexes). Saves seeks especially around file open when there is a flurry of reading in hfile metadata. -
allocator
-
CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD
On a checksum failure, perform this many subsequent read requests using HDFS checksums before automatically re-enabling HBase checksum verification.
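One way to picture this threshold is as a countdown that starts when HBase checksum verification fails. The sketch below is an illustration under assumptions, not the HBase implementation: the class `ChecksumFallbackSketch` and all of its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a "fall back to HDFS checksums for N reads" policy. */
final class ChecksumFallbackSketch {
    private final int threshold;
    private final AtomicInteger remainingHdfsReads = new AtomicInteger(0);
    private volatile boolean useHBaseChecksum = true;

    ChecksumFallbackSketch(int threshold) {
        this.threshold = threshold;
    }

    boolean shouldUseHBaseChecksum() {
        return useHBaseChecksum;
    }

    /** Called when HBase-level checksum verification fails: start the countdown. */
    void onHBaseChecksumFailure() {
        remainingHdfsReads.set(threshold);
        useHBaseChecksum = false;
    }

    /** Called after each read that relied on HDFS checksums. */
    void onHdfsChecksumRead() {
        if (!useHBaseChecksum && remainingHdfsReads.decrementAndGet() <= 0) {
            useHBaseChecksum = true; // countdown finished: re-enable HBase verification
        }
    }

    public static void main(String[] args) {
        ChecksumFallbackSketch policy = new ChecksumFallbackSketch(3);
        policy.onHBaseChecksumFailure();
        for (int i = 0; i < 3; i++) {
            policy.onHdfsChecksumRead();
        }
        System.out.println(policy.shouldUseHBaseChecksum());
    }
}
```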
-
UNSET
-
FILL_HEADER
-
DONT_FILL_HEADER
-
MULTI_BYTE_BUFFER_HEAP_SIZE
-
BLOCK_METADATA_SPACE
Space for metadata on a block that gets stored along with the block when we cache it. There are a few bytes stuck on the end of the HFileBlock that we pull in from HDFS. 8 bytes are for the offset of this block (long) in the file. The offset is important because it is used when we remake the CacheKey when we return the block to the cache when done. There is also a flag on whether checksumming is being done by hbase or not. See the class comment for a note on the uncertain state of checksumming of blocks that come out of cache (should we or should we not?). Finally, there are 4 bytes to hold the length of the next block, which can save a seek on occasion if available. (This EXTRA info came in with the original commit of the bucketcache, HBASE-7404. It was formerly known as EXTRA_SERIALIZATION_SPACE.)
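The trailer described here can be sketched with a plain `java.nio.ByteBuffer`. The 8-byte offset and 4-byte next-block length come from the description; the one-byte width of the checksum flag and the flag-first ordering are assumptions, and `BlockMetadataSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the cache metadata trailer appended after a block. */
final class BlockMetadataSketch {
    // Assumed widths: 1-byte checksum flag + 8-byte offset + 4-byte next block size.
    static final int METADATA_SPACE = 1 + Long.BYTES + Integer.BYTES;

    /** Writes the trailer at dest's current position; the position moves forward, no flip or reset. */
    static ByteBuffer append(ByteBuffer dest, boolean usesHBaseChecksum, long offset,
            int nextBlockOnDiskSize) {
        dest.put((byte) (usesHBaseChecksum ? 1 : 0));
        dest.putLong(offset);
        dest.putInt(nextBlockOnDiskSize);
        return dest;
    }

    public static void main(String[] args) {
        ByteBuffer dest = ByteBuffer.allocate(METADATA_SPACE);
        append(dest, true, 1234L, 77);
        System.out.println("bytes written: " + dest.position());
    }
}
```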
-
CHECKSUM_SIZE
Each checksum value is an integer that can be stored in 4 bytes.
-
DUMMY_HEADER_NO_CHECKSUM
-
BLOCK_DESERIALIZER
Used for deserializing blocks from the cache.
++++++++++++++
+ HFileBlock +
++++++++++++++
+ Checksums  + <= Optional
++++++++++++++
+ Metadata!  + <= See note on BLOCK_METADATA_SPACE above.
++++++++++++++
-
DESERIALIZER_IDENTIFIER
-
totalChecksumBytes
-
-
Constructor Details
-
HFileBlock
public HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator)
Creates a new HFile block from the given fields. This constructor is used only while writing blocks and caching, when the block is sitting in a byte buffer and we want to stuff it into the cache.
TODO: The caller presumes no checksumming is required of this block instance since it is going into cache; the checksum was already verified on the underlying block data pulled in from the filesystem. Is that correct? What if the cache is SSD?
TODO: Can the HFile block writer also be off-heap?
- Parameters:
blockType - the type of this block, see BlockType
onDiskSizeWithoutHeader - see onDiskSizeWithoutHeader
uncompressedSizeWithoutHeader - see uncompressedSizeWithoutHeader
prevBlockOffset - see prevBlockOffset
buf - block buffer with header (HConstants.HFILEBLOCK_HEADER_SIZE bytes)
fillHeader - when true, write the first 4 header fields into the passed buffer.
offset - the file offset the block was read from
onDiskDataSizeWithHeader - see onDiskDataSizeWithHeader
fileContext - HFile meta data
-
-
Method Details
-
createFromBuff
static HFileBlock createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) throws IOException
Creates a block from an existing buffer starting with a header. Rewinds and takes ownership of the buffer. By definition of rewind, ignores the buffer position, but if you slice the buffer beforehand, it will rewind to that point.
- Parameters:
buf - Has header, content, and trailing checksums if present.
- Throws:
IOException
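The remark about slicing before a rewind follows standard buffer semantics. The sketch below shows it with `java.nio.ByteBuffer` as a stand-in for ByteBuff; `SliceRewindSketch` and `sliceFrom` are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration: rewinding a slice returns to the slice start, not the parent start. */
final class SliceRewindSketch {
    /** Returns a view whose index 0 is index start of whole; whole's cursor is not modified. */
    static ByteBuffer sliceFrom(ByteBuffer whole, int start) {
        ByteBuffer d = whole.duplicate();
        d.position(start);
        return d.slice();
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.wrap(new byte[] {10, 20, 30, 40});
        ByteBuffer block = sliceFrom(file, 2);
        block.get();     // consume one byte of the slice
        block.rewind();  // back to the start of the slice, i.e. index 2 of the parent
        System.out.println(block.get()); // prints 30
    }
}
```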
-
getOnDiskSizeWithHeader
Parse total on disk size including header and checksum.
- Parameters:
headerBuf - Header ByteBuffer. Presumed exact size of header.
checksumSupport - true if checksum verification is in use.
- Returns:
Size of the block with header included.
-
getNextBlockOnDiskSize
int getNextBlockOnDiskSize()
- Returns:
the on-disk size of the next block (including the header size and any checksums if present), read by peeking into the next block's header; use as a hint when doing a read of the next block when scanning or running over a file.
-
getBlockType
Description copied from interface: Cacheable
Returns the block type of this cached HFile block.
- Specified by:
getBlockType in interface Cacheable
-
refCnt
Description copied from interface: Cacheable
Reference count of this Cacheable. -
retain
Description copied from interface: Cacheable
Increases the reference count; the object's memory can be freed only when no references remain. -
release
Call ByteBuff.release() to decrease the reference count; if there is no other reference, it returns the ByteBuffer to the ByteBuffAllocator.
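The retain/release contract can be sketched as a counter that hands its buffer back to a pool when it reaches zero. This is a simplified illustration, not the HBase or Netty implementation: `RefCountSketch` and its nested `Pool` are hypothetical, and the pool here is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of reference counting with return-to-pool on the last release. */
final class RefCountSketch {
    /** Hypothetical stand-in for an allocator that recycles buffers (not thread-safe). */
    static final class Pool {
        final ArrayDeque<byte[]> free = new ArrayDeque<>();
    }

    private final AtomicInteger refCnt = new AtomicInteger(1); // the creator holds one reference
    private final byte[] buffer;
    private final Pool pool;

    RefCountSketch(byte[] buffer, Pool pool) {
        this.buffer = buffer;
        this.pool = pool;
    }

    int refCnt() {
        return refCnt.get();
    }

    RefCountSketch retain() {
        if (refCnt.getAndIncrement() <= 0) {
            refCnt.decrementAndGet();
            throw new IllegalStateException("already released");
        }
        return this;
    }

    /** Returns true only when this call dropped the last reference and recycled the buffer. */
    boolean release() {
        int now = refCnt.decrementAndGet();
        if (now < 0) {
            throw new IllegalStateException("released more often than retained");
        }
        if (now == 0) {
            pool.free.push(buffer);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        RefCountSketch block = new RefCountSketch(new byte[16], pool).retain();
        System.out.println(block.release() + " " + block.release() + " pooled=" + pool.free.size());
    }
}
```

Every retain() should be paired with exactly one release(); only the final release makes the buffer available for reuse.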
-
touch
Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks. We pass the block itself as a default hint, but one can use touch(Object) to pass their own hint as well.
- Specified by:
touch in interface HBaseReferenceCounted
- Specified by:
touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
-
touch
- Specified by:
touch in interface HBaseReferenceCounted
- Specified by:
touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
-
getDataBlockEncodingId
short getDataBlockEncodingId()
Returns the data block encoding id that was used to encode this block. -
getOnDiskSizeWithHeader
Returns the on-disk size of header + data part + checksum. -
getOnDiskSizeWithoutHeader
Returns the on-disk size of the data part + checksum (header excluded). -
getUncompressedSizeWithoutHeader
Returns the uncompressed size of data part (header and checksum excluded). -
getPrevBlockOffset
long getPrevBlockOffset()
Returns the offset of the previous block of the same type in the file, or -1 if unknown. -
overwriteHeader
Rewinds buf and writes the first 4 header fields. The position of buf is modified as a side effect. -
getBufferWithoutHeader
Returns a buffer that does not include the header and checksum.
- Returns:
the buffer with header skipped and checksum omitted.
-
getBufferReadOnly
Returns a read-only duplicate of the buffer this block stores internally, ready to be read. Clients must not modify the buffer object, though they may set position and limit on the returned buffer since we pass back a duplicate. This method has to be public because it is used in CompoundBloomFilter to avoid object creation on every Bloom filter lookup, but it has to be used with caution. The buffer holds the header and block content, but not the checksums.
- Returns:
the buffer of this block for read-only operations; the buffer includes the header, but not the checksum.
-
getByteBuffAllocator
-
sanityCheckAssertion
private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) throws IOException
- Throws:
IOException
-
sanityCheckAssertion
private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) throws IOException
- Throws:
IOException
-
sanityCheck
Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields. Assumes a packed block structure. This function is primarily for testing and debugging, and is not thread-safe, because it alters the internal buffer pointer. Used by tests only.
- Throws:
IOException
-
toString
-
unpack
Retrieves the decompressed/decrypted view of this block. An encoded block remains in its encoded structure. Internal structures are shared between instances where applicable.
- Throws:
IOException
-
allocateBufferForUnpacking
Always allocates a new buffer of the correct size. Copies header bytes from the existing buffer. Does not change header fields. Reserve room to keep checksum bytes too. -
isUnpacked
Returns true when this block's buffer has been unpacked, false otherwise. Note that this is a calculated heuristic, not a tracked attribute of the block. -
getOffset
Cannot be UNSET. Must be a legitimate value. Used for re-making the BlockCacheKey when the block is returned to the cache.
- Returns:
the offset of this block in the file it was read from
-
getByteStream
Returns a byte stream reading the data (excluding header and checksum) of this block. -
heapSize
Description copied from interface: HeapSize
Return the approximate 'exclusive deep size' of implementing object. Includes count of payload and hosting object sizings. -
sanityCheckUncompressed
An additional sanity-check in case no compression or encryption is being used.
- Throws:
IOException
-
getSerializedLength
Description copied from interface: Cacheable
Returns the length of the ByteBuffer required to serialize the object. If the object cannot be serialized, it should return 0.
- Specified by:
getSerializedLength in interface Cacheable
- Returns:
int length in bytes of the serialized form, or 0 if the object cannot be cached.
-
serialize
Description copied from interface: Cacheable
Serializes its data into destination. -
getMetaData
For use by bucketcache. This exposes internals. -
addMetaData
Adds metadata at current position (position is moved forward). Does not flip or reset.
- Returns:
The passed destination with metadata added.
-
getDeserializer
Description copied from interface: Cacheable
Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
- Specified by:
getDeserializer in interface Cacheable
- Returns:
CacheableDeserializer instance.
-
hashCode
-
equals
-
getDataBlockEncoding
-
getChecksumType
byte getChecksumType() -
getBytesPerChecksum
int getBytesPerChecksum() -
getOnDiskDataSizeWithHeader
Returns the size of data on disk + header. Excludes checksum. -
totalChecksumBytes
int totalChecksumBytes()
Return the number of bytes required to store all the checksums for this block. Each checksum value is a 4 byte integer.
NOTE: The ByteBuff returned by getBufferWithoutHeader() and getBufferReadOnly(), or the DataInputStream returned by getByteStream(), does not include checksum. -
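The arithmetic behind this count can be sketched as follows. It assumes that the checksummed region is the header plus data (onDiskDataSizeWithHeader) and that a trailing partial chunk still gets its own 4-byte checksum; both are assumptions for illustration, and `ChecksumSizeSketch` is a hypothetical name rather than an HBase class.

```java
/** Hypothetical sketch of checksum size arithmetic for one block. */
final class ChecksumSizeSketch {
    static final int CHECKSUM_SIZE = 4; // each checksum value is a 4-byte integer

    /** Number of chunks of bytesPerChecksum bytes needed to cover dataSize (ceiling division). */
    static long numChunks(long dataSize, int bytesPerChecksum) {
        return (dataSize + bytesPerChecksum - 1) / bytesPerChecksum;
    }

    /** Bytes needed to store one checksum per chunk. */
    static long totalChecksumBytes(long onDiskDataSizeWithHeader, int bytesPerChecksum) {
        return numChunks(onDiskDataSizeWithHeader, bytesPerChecksum) * CHECKSUM_SIZE;
    }

    /** Header + data + checksums, matching the description of getOnDiskSizeWithHeader. */
    static long onDiskSizeWithHeader(long onDiskDataSizeWithHeader, int bytesPerChecksum) {
        return onDiskDataSizeWithHeader + totalChecksumBytes(onDiskDataSizeWithHeader, bytesPerChecksum);
    }

    public static void main(String[] args) {
        // 65569 bytes of header + data with 16384-byte chunks: 4 full chunks plus 1 partial chunk.
        System.out.println(totalChecksumBytes(65569, 16384));   // prints 20
        System.out.println(onDiskSizeWithHeader(65569, 16384)); // prints 65589
    }
}
```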
computeTotalChecksumBytes
-
headerSize
Returns the size of this block header. -
headerSize
Maps a minor version to the size of the header. -
getDummyHeaderForVersion
byte[] getDummyHeaderForVersion()
Return the appropriate DUMMY_HEADER for the minor version. -
getDummyHeaderForVersion
Return the appropriate DUMMY_HEADER for the minor version -
getHFileContext
- Returns:
This HFileBlock's fileContext, which will be a derivative of the fileContext for the file from which this block's data was originally read.
-
toStringHeader
Convert the contents of the block header into a human readable string. This is mostly helpful for debugging. This assumes that the block has minor version > 0.
- Throws:
IOException
-
createBuilder
Creates a new HFileBlockBuilder from the existing block and a new ByteBuff. The builder will be loaded with all of the original fields from blk, except now using the newBuff and setting isSharedMem based on the source of the passed in newBuff. An existing HFileBlock may have been an ExclusiveMemHFileBlock, but the new buffer might call for a SharedMemHFileBlock. Or vice versa.
- Parameters:
blk - the block to clone from
newBuff - the new buffer to use
-
shallowClone
-
deepCloneOnHeap
-