Class HFileBlock
- All Implemented Interfaces:
HeapSize,Cacheable,HBaseReferenceCounted,org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
- Direct Known Subclasses:
ExclusiveMemHFileBlock,SharedMemHFileBlock
HFile version 2 file. Version 2 was introduced in hbase-0.92.0.
Version 1 was the original file block. Version 2 was introduced when we changed the hbase file format to support multi-level block indexes and compound bloom filters (HBASE-3857). Support for Version 1 was removed in hbase-1.3.0.
HFileBlock: Version 2
In version 2, a block is structured as follows:
- Header: See Writer#putHeader() for where header is written; header total size is HFILEBLOCK_HEADER_SIZE
  - 0. blockType: Magic record identifying the BlockType (8 bytes): e.g. DATABLK*
  - 1. onDiskSizeWithoutHeader: Compressed -- a.k.a 'on disk' -- block size, excluding header, but including trailing checksum bytes (4 bytes)
  - 2. uncompressedSizeWithoutHeader: Uncompressed block size, excluding header, and excluding checksum bytes (4 bytes)
  - 3. prevBlockOffset: The offset of the previous block of the same type (8 bytes). This is used to navigate to the previous block without having to go to the block index
  - 4. For minorVersions >=1, the ordinal describing checksum type (1 byte)
  - 5. For minorVersions >=1, the number of data bytes/checksum chunk (4 bytes)
  - 6. onDiskDataSizeWithHeader: For minorVersions >=1, the size of data 'on disk', including header, excluding checksums (4 bytes)
- Raw/Compressed/Encrypted/Encoded data: The compression algorithm is the same for all the blocks in an HFile. If compression is NONE, this is just raw, serialized Cells.
- Tail: For minorVersions >=1, a series of 4 byte checksums, one each for the number of bytes specified by bytesPerChecksum.
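The header layout listed above can be illustrated with a small, standalone parser. This is only a sketch built on plain `java.nio.ByteBuffer`, not the HBase implementation: the class name `BlockHeaderSketch` is hypothetical, the 33-byte size is simply the sum of the field widths in the list, and big-endian byte order is assumed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified reader for a version 2 block header (minorVersion >= 1). */
final class BlockHeaderSketch {
  static final int MAGIC_LENGTH = 8;
  // 8 + 4 + 4 + 8 + 1 + 4 + 4 bytes, summed from the field list above.
  static final int HEADER_SIZE = 33;

  final String magic;
  final int onDiskSizeWithoutHeader;
  final int uncompressedSizeWithoutHeader;
  final long prevBlockOffset;
  final byte checksumType;
  final int bytesPerChecksum;
  final int onDiskDataSizeWithHeader;

  private BlockHeaderSketch(ByteBuffer in) {
    byte[] magicBytes = new byte[MAGIC_LENGTH];
    in.get(magicBytes);                                    // field 0
    magic = new String(magicBytes, StandardCharsets.US_ASCII);
    onDiskSizeWithoutHeader = in.getInt();                 // field 1
    uncompressedSizeWithoutHeader = in.getInt();           // field 2
    prevBlockOffset = in.getLong();                        // field 3
    checksumType = in.get();                               // field 4
    bytesPerChecksum = in.getInt();                        // field 5
    onDiskDataSizeWithHeader = in.getInt();                // field 6
  }

  static BlockHeaderSketch parse(ByteBuffer buf) {
    if (buf.remaining() < HEADER_SIZE) {
      throw new IllegalArgumentException(
          "need " + HEADER_SIZE + " header bytes, got " + buf.remaining());
    }
    // duplicate() shares the bytes but keeps its own position, so the caller's
    // buffer is left untouched. A duplicate is big-endian, which this sketch
    // assumes is the byte order of the header.
    return new BlockHeaderSketch(buf.duplicate());
  }
}
```

Parsing from a duplicate mirrors the advice elsewhere on this page to work on a copy of the buffer rather than moving the shared position.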
Caching
Caches cache whole blocks with trailing checksums, if any. We then tag on some metadata, the content of BLOCK_METADATA_SPACE: a flag that is set if we are doing 'hbase' checksums, and then the offset into the file, which is needed when we re-make a cache key when we return the block to the cache as 'done'. See Cacheable.serialize(ByteBuffer, boolean) and Cacheable.getDeserializer().
TODO: Should we cache the checksums? Down in Writer#getBlockForCaching(CacheConfig) where we make a block to cache-on-write, there is an attempt at turning off checksums. This is not the only place we get blocks to cache. We also will cache the raw return from an hdfs read. In this case, the checksums may be present. If the cache is backed by something that doesn't do ECC, say an SSD, we might want to preserve checksums. For now this is an open question.
TODO: Over in BucketCache, we save a block allocation by doing a custom serialization. Be sure to change it if serialization changes in here. Could we add a method here that takes an IOEngine and that then serializes to it rather than expose our internals over in BucketCache? IOEngine is in the bucket subpackage. Pull it up? Then this class knows about bucketcache. Ugh.
-
Nested Class Summary
Nested ClassesModifier and TypeClassDescriptionstatic final classstatic interfaceIterator for readingHFileBlocks in load-on-open-section, such as root data index block, meta index block, file info block etc.(package private) static interfaceSomething that can be written into a block.static interfaceAn HFile block reader with iteration ability.(package private) static classReads version 2 HFile blocks from the filesystem.(package private) static classprivate static classData-structure to use caching the header of the NEXT block.(package private) static classUnified version 2HFileblock writer. -
Field Summary
FieldsModifier and TypeFieldDescriptionprivate ByteBuffAllocatorstatic final CacheableDeserializer<Cacheable>Used deserializing blocks from Cache.static final intSpace for metadata on a block that gets stored along with the block when we cache it.private BlockTypeType of block.private ByteBuffThe in-memory representation of the hfile block.(package private) static final intEach checksum value is an integer that can be stored in 4 bytes.(package private) static final intOn a checksum failure, do these many succeeding read requests using hdfs checksums before auto-reenabling hbase checksum verification.private static final intstatic final boolean(package private) static final byte[]private final HFileContextMeta data that holds meta information on the hfileblock.static final booleanstatic final longprivate static final org.slf4j.Loggerstatic final intprivate intThe on-disk size of the next block, including the header and checksums if present.private longThe offset of this block in the file.private final intSize on disk of header + data.private intSize on disk excluding header, including checksum.private longThe offset of the previous block on disk.private final intprivate intSize of pure data.private static int -
Constructor Summary
Constructors
HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator): Creates a new HFile block from the given fields.
-
Method Summary
Modifier and TypeMethodDescriptionprivate ByteBufferaddMetaData(ByteBuffer destination, boolean includeNextBlockMetadata) Adds metadata at current position (position is moved forward).private ByteBuffAlways allocates a new buffer of the correct size.private intprivate static HFileBlockBuildercreateBuilder(HFileBlock blk, ByteBuff newBuff) Creates a new HFileBlockBuilder from the existing block and a new ByteBuff.(package private) static HFileBlockcreateFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) Creates a block from an existing buffer starting with a header.(package private) static HFileBlockbooleanReturns the block type of this cached HFile blockReturns a read-only duplicate of the buffer this block stores internally ready to be read.Returns a buffer that does not include the header and checksum.(package private) int(package private) DataInputStreamReturns a byte stream reading the data(excluding header and checksum) of this block(package private) byte(package private) DataBlockEncoding(package private) shortReturns get data block encoding id that was used to encode this blockReturns CacheableDeserializer instance which reconstructs original object from ByteBuffer.(package private) byte[]Return the appropriate DUMMY_HEADER for the minor versionprivate static byte[]getDummyHeaderForVersion(boolean usesHBaseChecksum) Return the appropriate DUMMY_HEADER for the minor versionFor use by bucketcache.intlongCannot beUNSET.(package private) intReturns the size of data on disk + header.intReturns the on-disk size of header + data part + checksum.private static intgetOnDiskSizeWithHeader(ByteBuff headerBuf, boolean checksumSupport) Parse total on disk size including header and checksum.(package private) intReturns the on-disk size of the data part + checksum (header excluded).(package private) longReturns the offset of the previous block of the same type in the file, or -1 if 
unknownprivate static io.opentelemetry.api.common.AttributesgetReadDataBlockInternalAttributes(io.opentelemetry.api.trace.Span span) Returns OpenTelemetry Attributes for a Span that is reading a data block with relevant metadata.intReturns the length of the ByteBuffer required to serialized the object.intReturns the uncompressed size of data part (header and checksum excluded).inthashCode()intReturns the size of this block header.static intheaderSize(boolean usesHBaseChecksum) Maps a minor version to the size of the header.longheapSize()Return the approximate 'exclusive deep size' of implementing object.booleanWill be override bySharedMemHFileBlockorExclusiveMemHFileBlock.booleanReturn true when this block's buffer has been unpacked, false otherwise.private voidRewindsbufand writes first 4 header fields.intrefCnt()Reference count of this Cacheable.booleanrelease()CallByteBuff.release()to decrease the reference count, if no other reference, it will return back theByteBuffertoByteBuffAllocatorretain()Increase its reference count, and only when no reference we can free the object's memory.(package private) voidChecks if the block is internally consistent, i.e.private voidsanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) private voidsanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) (package private) voidAn additional sanity-check in case no compression or encryption is being used.voidserialize(ByteBuffer destination, boolean includeNextBlockMetadata) Serializes its data into destination.private static HFileBlockshallowClone(HFileBlock blk, ByteBuff newBuf) toString()(package private) static StringtoStringHeader(ByteBuff buf) Convert the contents of the block header into a human readable string.(package private) intReturn the number of bytes required to store all the checksums for this block.touch()Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer 
leaks.unpack(HFileContext fileContext, HFileBlock.FSReader reader) Retrieves the decompressed/decrypted view of this block.Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, waitMethods inherited from interface org.apache.hadoop.hbase.nio.HBaseReferenceCounted
release, retain
-
Field Details
-
LOG
-
FIXED_OVERHEAD
-
blockType
Type of block. Header field 0. -
onDiskSizeWithoutHeader
Size on disk excluding header, including checksum. Header field 1. -
uncompressedSizeWithoutHeader
Size of pure data. Does not include header or checksums. Header field 2. -
prevBlockOffset
The offset of the previous block on disk. Header field 3. -
onDiskDataSizeWithHeader
Size on disk of header + data. Excludes checksum. Header field 6, OR calculated from onDiskSizeWithoutHeader when using HDFS checksum. -
bufWithoutChecksum
The in-memory representation of the hfile block. Can be on or offheap. Can be backed by a single ByteBuffer or by many. Make no assumptions.
Be careful reading from this buf. Duplicate and work on the duplicate or, if not, be sure to reset position and limit, else trouble down the road.
TODO: Make this read-only once made.
We are using the ByteBuff type. ByteBuffer is not extensible yet we need to be able to have a ByteBuffer-like API across multiple ByteBuffers reading from a cache such as BucketCache. So, we have this ByteBuff type. Unfortunately, it is spread all about HFileBlock. It would be good if it could be confined to cache-use only, but that is hard to do.
NOTE: this ByteBuff includes the HFileBlock header and data, but excludes the checksum.
-
fileContext
Meta data that holds meta information on the hfileblock. -
offset
The offset of this block in the file. Populated by the reader for convenience of access. This offset is not part of the block header. -
nextBlockOnDiskSize
The on-disk size of the next block, including the header and checksums if present. UNSET if unknown. Blocks try to carry the size of the next block to read in this data member. Usually we get block sizes from the hfile index but sometimes the index is not available: e.g. when we read the indexes themselves (indexes are stored in blocks, we do not have an index for the indexes). Saves seeks especially around file open when there is a flurry of reading in hfile metadata. -
allocator
-
CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD
On a checksum failure, do these many succeeding read requests using hdfs checksums before auto-reenabling hbase checksum verification.- See Also:
-
UNSET
-
FILL_HEADER
- See Also:
-
DONT_FILL_HEADER
- See Also:
-
MULTI_BYTE_BUFFER_HEAP_SIZE
-
BLOCK_METADATA_SPACE
Space for metadata on a block that gets stored along with the block when we cache it. There are a few bytes stuck on the end of the HFileBlock that we pull in from HDFS. 8 bytes are for the offset of this block (long) in the file. Offset is important because it is used when we remake the CacheKey when we return the block to the cache when done. There is also a flag on whether checksumming is being done by hbase or not. See class comment for note on uncertain state of checksumming of blocks that come out of cache (should we or should we not?). Finally there are 4 bytes to hold the length of the next block, which can save a seek on occasion if available. (This EXTRA info came in with the original commit of the bucketcache, HBASE-7404. It was formerly known as EXTRA_SERIALIZATION_SPACE).
- See Also:
-
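The BLOCK_METADATA_SPACE description can be made concrete with a sketch of writing and reading such a trailer. The class name `BlockMetadataSketch`, the one-byte width of the checksum flag, and the field order (flag, offset, next block size) are assumptions made for illustration; the text above only fixes the 8-byte offset and the 4-byte next-block length.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of the metadata appended to a cached block. */
final class BlockMetadataSketch {
  // Assumed layout: 1-byte checksum flag + 8-byte offset + 4-byte next-block size.
  static final int METADATA_SPACE = Byte.BYTES + Long.BYTES + Integer.BYTES;

  /** Writes the metadata at the current position; does not flip or reset. */
  static ByteBuffer addMetaData(ByteBuffer destination, boolean usesHBaseChecksum,
      long offset, int nextBlockOnDiskSize) {
    destination.put((byte) (usesHBaseChecksum ? 1 : 0));
    destination.putLong(offset);
    destination.putInt(nextBlockOnDiskSize);
    return destination;
  }

  /** Reads the file offset back from a trailer that starts at trailerStart. */
  static long readOffset(ByteBuffer serialized, int trailerStart) {
    // Absolute read: skips the assumed 1-byte flag without moving the position.
    return serialized.getLong(trailerStart + Byte.BYTES);
  }
}
```

Keeping the offset in the trailer is what lets a reader rebuild the cache key from the serialized bytes alone.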
CHECKSUM_SIZE
Each checksum value is an integer that can be stored in 4 bytes.- See Also:
-
DUMMY_HEADER_NO_CHECKSUM
-
BLOCK_DESERIALIZER
Used when deserializing blocks from Cache.
++++++++++++++
+ HFileBlock +
++++++++++++++
+ Checksums  + <= Optional
++++++++++++++
+ Metadata!  + <= See note on BLOCK_METADATA_SPACE above.
++++++++++++++
- See Also:
-
DESERIALIZER_IDENTIFIER
-
totalChecksumBytes
-
-
Constructor Details
-
HFileBlock
public HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator)
Creates a new HFile block from the given fields. This constructor is used only while writing blocks and caching, that is, when the block is sitting in a byte buffer and we want to stuff the block into cache.
TODO: The caller presumes no checksumming required of this block instance since going into cache; checksum already verified on underlying block data pulled in from filesystem. Is that correct? What if cache is SSD?
TODO: HFile block writer can also off-heap ?
- Parameters:
blockType - the type of this block, see BlockType
onDiskSizeWithoutHeader - see onDiskSizeWithoutHeader
uncompressedSizeWithoutHeader - see uncompressedSizeWithoutHeader
prevBlockOffset - see prevBlockOffset
buf - block buffer with header (HConstants.HFILEBLOCK_HEADER_SIZE bytes)
fillHeader - when true, write the first 4 header fields into passed buffer.
offset - the file offset the block was read from
onDiskDataSizeWithHeader - see onDiskDataSizeWithHeader
fileContext - HFile meta data
-
-
Method Details
-
createFromBuff
static HFileBlock createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) throws IOException
Creates a block from an existing buffer starting with a header. Rewinds and takes ownership of the buffer. By definition of rewind, ignores the buffer position, but if you slice the buffer beforehand, it will rewind to that point.
- Parameters:
buf - Has header, content, and trailing checksums if present.
- Throws:
IOException
-
getOnDiskSizeWithHeader
Parse total on disk size including header and checksum.
- Parameters:
headerBuf - Header ByteBuffer. Presumed exact size of header.
checksumSupport - true if checksum verification is in use.
- Returns:
- Size of the block with header included.
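One way to read getOnDiskSizeWithHeader is as header field 1 plus the header size for the checksum mode in use. The sketch below shows that arithmetic on a plain `ByteBuffer`. The class name and constant names are hypothetical, and the two header sizes are derived only from the field widths in the class description (fields 0 to 3 for the no-checksum case, fields 0 to 6 otherwise).

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: total on-disk block size computed from the header bytes. */
final class OnDiskSizeSketch {
  static final int HEADER_SIZE_NO_CHECKSUM = 8 + 4 + 4 + 8;                         // fields 0-3
  static final int HEADER_SIZE_WITH_CHECKSUM = HEADER_SIZE_NO_CHECKSUM + 1 + 4 + 4; // plus fields 4-6
  // onDiskSizeWithoutHeader sits right after the 8-byte magic record.
  static final int ON_DISK_SIZE_WITHOUT_HEADER_INDEX = 8;

  static int headerSize(boolean usesHBaseChecksum) {
    return usesHBaseChecksum ? HEADER_SIZE_WITH_CHECKSUM : HEADER_SIZE_NO_CHECKSUM;
  }

  static int onDiskSizeWithHeader(ByteBuffer headerBuf, boolean checksumSupport) {
    // Absolute getInt: reads field 1 without moving headerBuf's position.
    return headerBuf.getInt(ON_DISK_SIZE_WITHOUT_HEADER_INDEX) + headerSize(checksumSupport);
  }
}
```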
-
getNextBlockOnDiskSize
- Returns:
- the on-disk size of the next block (including the header size and any checksums if present) read by peeking into the next block's header; use as a hint when doing a read of the next block when scanning or running over a file.
-
getBlockType
Description copied from interface: Cacheable
Returns the block type of this cached HFile block
- Specified by:
getBlockType in interface Cacheable
-
refCnt
Description copied from interface: Cacheable
Reference count of this Cacheable. -
retain
Description copied from interface: Cacheable
Increases the reference count; the object's memory can be freed only when no references remain. -
release
Calls ByteBuff.release() to decrease the reference count; if no other reference remains, it returns the ByteBuffer to the ByteBuffAllocator. -
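The retain/release contract described above can be sketched with a toy reference-counted holder. This is not HBase's or Netty's implementation: `RefCountedSketch` and `isFreed()` are hypothetical names, and the sketch only shows the counting rule that the last release() is the one that frees the resource.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical toy object following a retain/release reference-counting contract. */
final class RefCountedSketch {
  // A new object starts with one reference, held by its creator.
  private final AtomicInteger refCnt = new AtomicInteger(1);
  private volatile boolean freed;

  int refCnt() {
    return refCnt.get();
  }

  RefCountedSketch retain() {
    if (refCnt.getAndIncrement() <= 0) {
      refCnt.decrementAndGet();
      throw new IllegalStateException("already released");
    }
    return this;
  }

  /** Returns true only when this call dropped the last reference. */
  boolean release() {
    int remaining = refCnt.decrementAndGet();
    if (remaining < 0) {
      refCnt.incrementAndGet();
      throw new IllegalStateException("released more often than retained");
    }
    if (remaining == 0) {
      freed = true; // a real block would hand its buffer back to the allocator here
      return true;
    }
    return false;
  }

  boolean isFreed() {
    return freed;
  }
}
```

A typical caller pairs each retain() with exactly one release(), usually in a try/finally block, so the buffer goes back to its allocator even when a read fails.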
touch
Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks. We pass the block itself as a default hint, but one can use touch(Object) to pass their own hint as well.
- Specified by:
touch in interface HBaseReferenceCounted
- Specified by:
touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
-
touch
- Specified by:
touch in interface HBaseReferenceCounted
- Specified by:
touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
-
getDataBlockEncodingId
short getDataBlockEncodingId()
Returns the data block encoding id that was used to encode this block. -
getOnDiskSizeWithHeader
Returns the on-disk size of header + data part + checksum. -
getOnDiskSizeWithoutHeader
Returns the on-disk size of the data part + checksum (header excluded). -
getUncompressedSizeWithoutHeader
Returns the uncompressed size of data part (header and checksum excluded). -
getPrevBlockOffset
long getPrevBlockOffset()
Returns the offset of the previous block of the same type in the file, or -1 if unknown. -
overwriteHeader
Rewinds buf and writes first 4 header fields. buf position is modified as side-effect. -
getBufferWithoutHeader
Returns a buffer that does not include the header and checksum.
- Returns:
- the buffer with header skipped and checksum omitted.
-
getBufferReadOnly
Returns a read-only duplicate of the buffer this block stores internally, ready to be read. Clients must not modify the buffer object, though they may set position and limit on the returned buffer since we pass back a duplicate. This method has to be public because it is used in CompoundBloomFilter to avoid object creation on every Bloom filter lookup, but it has to be used with caution. Buffer holds header and block content, but not checksums.
- Returns:
- the buffer of this block for read-only operations; the buffer includes the header, but not the checksum.
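The duplicate-based access pattern behind getBufferReadOnly() and getBufferWithoutHeader() can be illustrated with plain `java.nio.ByteBuffer`. This is a sketch, not the ByteBuff-based HBase code: the class and method names are hypothetical, and `asReadOnlyBuffer()` is used here to enforce read-only access, whereas the description above relies on callers not writing to the duplicate.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of handing out independent views of a block buffer. */
final class BufferViewSketch {
  /** A view with its own position/limit that rejects writes. */
  static ByteBuffer readOnlyView(ByteBuffer block) {
    return block.asReadOnlyBuffer();
  }

  /** A view that starts just past the header; the source buffer is not moved. */
  static ByteBuffer withoutHeader(ByteBuffer block, int headerSize) {
    ByteBuffer dup = block.duplicate(); // shares bytes, separate position and limit
    dup.position(headerSize);
    return dup.slice();                 // index 0 of the slice is the first data byte
  }
}
```

Because each caller works on its own view, moving position or limit during a read never disturbs other readers of the same block.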
-
getByteBuffAllocator
-
sanityCheckAssertion
private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) throws IOException
- Throws:
IOException
-
sanityCheckAssertion
private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) throws IOException
- Throws:
IOException
-
sanityCheck
Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields. Assumes a packed block structure. This function is primarily for testing and debugging, and is not thread-safe, because it alters the internal buffer pointer. Used by tests only.
- Throws:
IOException
-
toString
-
unpack
@LimitedPrivate("Unittest") public HFileBlock unpack(HFileContext fileContext, HFileBlock.FSReader reader) throws IOException
Retrieves the decompressed/decrypted view of this block. An encoded block remains in its encoded structure. Internal structures are shared between instances where applicable.
- Throws:
IOException
-
allocateBufferForUnpacking
Always allocates a new buffer of the correct size. Copies header bytes from the existing buffer. Does not change header fields. Reserve room to keep checksum bytes too. -
isUnpacked
Returns true when this block's buffer has been unpacked, false otherwise. Note this is a calculated heuristic, not a tracked attribute of the block. -
getOffset
Cannot be UNSET. Must be a legitimate value. Used when re-making the BlockCacheKey when the block is returned to the cache.
- Returns:
- the offset of this block in the file it was read from
-
getByteStream
Returns a byte stream reading the data (excluding header and checksum) of this block -
heapSize
Description copied from interface: HeapSize
Return the approximate 'exclusive deep size' of implementing object. Includes count of payload and hosting object sizings. -
sanityCheckUncompressed
An additional sanity-check in case no compression or encryption is being used.
- Throws:
IOException
-
getSerializedLength
Description copied from interface: Cacheable
Returns the length of the ByteBuffer required to serialize the object. If the object cannot be serialized, it should return 0.
- Specified by:
getSerializedLength in interface Cacheable
- Returns:
- int length in bytes of the serialized form or 0 if the object cannot be cached.
-
serialize
Description copied from interface: Cacheable
Serializes its data into destination. -
getMetaData
For use by bucketcache. This exposes internals. -
addMetaData
Adds metadata at current position (position is moved forward). Does not flip or reset.
- Returns:
- The passed destination with metadata added.
-
getDeserializer
Description copied from interface: Cacheable
Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
- Specified by:
getDeserializer in interface Cacheable
- Returns:
- CacheableDeserializer instance.
-
hashCode
-
equals
-
getDataBlockEncoding
-
getChecksumType
byte getChecksumType() -
getBytesPerChecksum
int getBytesPerChecksum() -
getOnDiskDataSizeWithHeader
Returns the size of data on disk + header. Excludes checksum. -
totalChecksumBytes
int totalChecksumBytes()
Return the number of bytes required to store all the checksums for this block. Each checksum value is a 4 byte integer.
NOTE: ByteBuff returned by getBufferWithoutHeader() and getBufferReadOnly() or DataInputStream returned by getByteStream() does not include checksum. -
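The size of the checksum tail follows from the class description: one 4-byte checksum per bytesPerChecksum bytes of data. A minimal sketch of that arithmetic is given below. The class name is hypothetical, and two points are assumptions rather than statements from this page: a trailing partial chunk gets its own checksum, and a non-positive bytesPerChecksum means no checksums at all.

```java
/** Hypothetical sketch of the checksum-tail size calculation. */
final class ChecksumSizeSketch {
  static final int CHECKSUM_SIZE = 4; // each checksum value is a 4 byte integer

  static long totalChecksumBytes(long dataSize, int bytesPerChecksum) {
    if (bytesPerChecksum <= 0) {
      return 0; // assumption: treated as "checksums not in use"
    }
    // Ceiling division: a partial final chunk is assumed to need its own checksum.
    long chunks = (dataSize + bytesPerChecksum - 1) / bytesPerChecksum;
    return chunks * CHECKSUM_SIZE;
  }
}
```

Under these assumptions, 16385 bytes of data with 16384 bytes per checksum span two chunks and so need 8 checksum bytes.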
computeTotalChecksumBytes
-
headerSize
Returns the size of this block header. -
headerSize
Maps a minor version to the size of the header. -
getDummyHeaderForVersion
byte[] getDummyHeaderForVersion()
Return the appropriate DUMMY_HEADER for the minor version -
getDummyHeaderForVersion
Return the appropriate DUMMY_HEADER for the minor version -
getHFileContext
- Returns:
- This HFileBlock's fileContext, which will be a derivative of the fileContext for the file from which this block's data was originally read.
-
toStringHeader
Convert the contents of the block header into a human readable string. This is mostly helpful for debugging. This assumes that the block has minor version > 0.
- Throws:
IOException
-
createBuilder
Creates a new HFileBlockBuilder from the existing block and a new ByteBuff. The builder will be loaded with all of the original fields from blk, except now using the newBuff and setting isSharedMem based on the source of the passed in newBuff. An existing HFileBlock may have been an ExclusiveMemHFileBlock, but the new buffer might call for a SharedMemHFileBlock. Or vice versa.
- Parameters:
blk - the block to clone from
newBuff - the new buffer to use
-
shallowClone
-
deepCloneOnHeap
-
getReadDataBlockInternalAttributes
private static io.opentelemetry.api.common.Attributes getReadDataBlockInternalAttributes(io.opentelemetry.api.trace.Span span)
Returns OpenTelemetry Attributes for a Span that is reading a data block with relevant metadata. Will short-circuit if the span isn't going to be captured/OTEL isn't enabled.
-