@InterfaceAudience.Private public class HFileBlock extends Object implements Cacheable
Cacheable blocks of an HFile version 2 file. Version 2 was introduced in hbase-0.92.0.
Version 1 was the original file block. Version 2 was introduced when we changed the hbase file format to support multi-level block indexes and compound bloom filters (HBASE-3857). Support for Version 1 was removed in hbase-1.3.0.
Each block starts with a header whose magic record identifies the BlockType (8 bytes): e.g. DATABLK*. The data that follows is compressed with the same algorithm for every block in the HFile. If compression is NONE, this is just raw, serialized Cells. Caches cache whole blocks, with trailing checksums if any; see Cacheable.serialize(ByteBuffer, boolean) and Cacheable.getDeserializer().
TODO: Should we cache the checksums? Down in Writer#getBlockForCaching(CacheConfig), where we make a block to cache-on-write, there is an attempt at turning off checksums. That is not the only place we get blocks to cache; we also cache the raw return from an HDFS read, in which case the checksums may be present. If the cache is backed by something that doesn't do ECC, say an SSD, we might want to preserve checksums. For now this is an open question.
TODO: Over in BucketCache, we save a block allocation by doing a custom serialization. Be sure to change it if serialization changes in here. Could we add a method here that takes an IOEngine and that then serializes to it rather than expose our internals over in BucketCache? IOEngine is in the bucket subpackage. Pull it up? Then this class knows about bucketcache. Ugh.
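The block structure and caching hooks described above surface through the public accessors documented below. A minimal sketch, assuming block is an HFileBlock already obtained from a reader or a cache; the wrapper class and method are hypothetical:

```java
import org.apache.hadoop.hbase.io.hfile.BlockType;
import org.apache.hadoop.hbase.io.hfile.HFileBlock;

final class HFileBlockInspectionSketch {
  // Hedged sketch: how the block structure described above shows up in the public API.
  static String describe(HFileBlock block) {
    BlockType type = block.getBlockType();            // identified by the 8-byte magic, e.g. DATABLK*
    int onDiskSize = block.getOnDiskSizeWithHeader(); // header + data part + checksum, as stored on disk
    int headerSize = block.headerSize();              // header size for this block's minor version
    boolean unpacked = block.isUnpacked();            // true once decompressed/decrypted
    return type + ": onDisk=" + onDiskSize + ", header=" + headerSize + ", unpacked=" + unpacked;
  }
}
```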
Modifier and Type | Class and Description |
---|---|
static class | HFileBlock.BlockDeserializer |
(package private) static interface | HFileBlock.BlockIterator - Iterator for reading HFileBlocks in the load-on-open section, such as the root data index block, meta index block, file info block, etc. |
(package private) static interface | HFileBlock.BlockWritable - Something that can be written into a block. |
(package private) static interface | HFileBlock.FSReader - An HFile block reader with iteration ability. |
(package private) static class | HFileBlock.FSReaderImpl - Reads version 2 HFile blocks from the filesystem. |
(package private) static class | HFileBlock.Header |
private static class | HFileBlock.PrefetchedHeader - Data structure used to cache the header of the NEXT block. |
(package private) static class | HFileBlock.Writer - Unified version 2 HFile block writer. |
Modifier and Type | Field and Description |
---|---|
private ByteBuffAllocator | allocator |
static CacheableDeserializer<Cacheable> | BLOCK_DESERIALIZER - Used deserializing blocks from Cache. |
static int | BLOCK_METADATA_SPACE - Space for metadata on a block that gets stored along with the block when we cache it. |
private BlockType | blockType - Type of block. |
private ByteBuff | buf - The in-memory representation of the hfile block. |
(package private) static int | CHECKSUM_SIZE - Each checksum value is an integer that can be stored in 4 bytes. |
(package private) static int | CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD - On a checksum failure, do this many succeeding read requests using HDFS checksums before auto-reenabling HBase checksum verification. |
private static int | DESERIALIZER_IDENTIFIER |
static boolean | DONT_FILL_HEADER |
(package private) static byte[] | DUMMY_HEADER_NO_CHECKSUM |
private HFileContext | fileContext - Metadata that holds meta information on the hfile block. |
static boolean | FILL_HEADER |
static long | FIXED_OVERHEAD |
private static org.slf4j.Logger | LOG |
static int | MULTI_BYTE_BUFFER_HEAP_SIZE |
private int | nextBlockOnDiskSize - The on-disk size of the next block, including the header and checksums if present. |
private long | offset - The offset of this block in the file. |
private int | onDiskDataSizeWithHeader - Size on disk of header + data. |
private int | onDiskSizeWithoutHeader - Size on disk excluding header, including checksum. |
private long | prevBlockOffset - The offset of the previous block on disk. |
private int | uncompressedSizeWithoutHeader - Size of pure data. |
private static int | UNSET |
Constructor and Description |
---|
HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator) - Creates a new HFile block from the given fields. |
Modifier and Type | Method and Description |
---|---|
private ByteBuffer | addMetaData(ByteBuffer destination, boolean includeNextBlockMetadata) - Adds metadata at the current position (position is moved forward). |
private ByteBuff | allocateBufferForUnpacking() - Always allocates a new buffer of the correct size. |
private static HFileBlockBuilder | createBuilder(HFileBlock blk, ByteBuff newBuff) - Creates a new HFileBlockBuilder from the existing block and a new ByteBuff. |
(package private) static HFileBlock | createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) - Creates a block from an existing buffer starting with a header. |
(package private) static HFileBlock | deepCloneOnHeap(HFileBlock blk) |
boolean | equals(Object comparison) |
BlockType | getBlockType() - Returns the block type of this cached HFile block. |
ByteBuff | getBufferReadOnly() - Returns a read-only duplicate of the buffer this block stores internally, ready to be read. |
ByteBuff | getBufferWithoutHeader() - Returns a buffer that does not include the header and checksum. |
ByteBuffAllocator | getByteBuffAllocator() |
(package private) int | getBytesPerChecksum() |
(package private) DataInputStream | getByteStream() - Returns a byte stream reading the data + checksum of this block. |
(package private) byte | getChecksumType() |
(package private) DataBlockEncoding | getDataBlockEncoding() |
(package private) short | getDataBlockEncodingId() - Returns the data block encoding id that was used to encode this block. |
CacheableDeserializer<Cacheable> | getDeserializer() - Returns the CacheableDeserializer instance which reconstructs the original object from a ByteBuffer. |
(package private) byte[] | getDummyHeaderForVersion() - Returns the appropriate DUMMY_HEADER for the minor version. |
private static byte[] | getDummyHeaderForVersion(boolean usesHBaseChecksum) - Returns the appropriate DUMMY_HEADER for the minor version. |
HFileContext | getHFileContext() |
ByteBuffer | getMetaData(ByteBuffer bb) - For use by bucketcache. |
(package private) int | getNextBlockOnDiskSize() |
(package private) long | getOffset() - Cannot be UNSET. |
(package private) int | getOnDiskDataSizeWithHeader() - Returns the size of data on disk + header. |
int | getOnDiskSizeWithHeader() - Returns the on-disk size of header + data part + checksum. |
private static int | getOnDiskSizeWithHeader(ByteBuff headerBuf, boolean verifyChecksum) - Parses the total on-disk size including header and checksum. |
(package private) int | getOnDiskSizeWithoutHeader() - Returns the on-disk size of the data part + checksum (header excluded). |
(package private) long | getPrevBlockOffset() - Returns the offset of the previous block of the same type in the file, or -1 if unknown. |
int | getSerializedLength() - Returns the length of the ByteBuffer required to serialize the object. |
(package private) int | getUncompressedSizeWithoutHeader() - Returns the uncompressed size of the data part (header and checksum excluded). |
int | hashCode() |
int | headerSize() - Returns the size of this block's header. |
static int | headerSize(boolean usesHBaseChecksum) - Maps a minor version to the size of the header. |
long | heapSize() - Returns the approximate 'exclusive deep size' of the implementing object. |
boolean | isSharedMem() - Will be overridden by SharedMemHFileBlock or ExclusiveMemHFileBlock. |
boolean | isUnpacked() - Returns true when this block's buffer has been unpacked, false otherwise. |
private void | overwriteHeader() - Rewinds buf and writes the first 4 header fields. |
int | refCnt() - Reference count of this Cacheable. |
boolean | release() - Calls ByteBuff.release() to decrease the reference count; if there is no other reference, the ByteBuffer is returned to the ByteBuffAllocator. |
HFileBlock | retain() - Increases the reference count; the object's memory can only be freed when there are no references. |
(package private) void | sanityCheck() - Checks if the block is internally consistent. |
private void | sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) |
private void | sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) |
(package private) void | sanityCheckUncompressed() - An additional sanity check in case no compression or encryption is being used. |
void | serialize(ByteBuffer destination, boolean includeNextBlockMetadata) - Serializes its data into destination. |
private static HFileBlock | shallowClone(HFileBlock blk, ByteBuff newBuf) |
String | toString() |
(package private) static String | toStringHeader(ByteBuff buf) - Converts the contents of the block header into a human-readable string. |
(package private) int | totalChecksumBytes() - Calculates the number of bytes required to store all the checksums for this block. |
HFileBlock | touch() - Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks. |
HFileBlock | touch(Object hint) |
(package private) HFileBlock | unpack(HFileContext fileContext, HFileBlock.FSReader reader) - Retrieves the decompressed/decrypted view of this block. |
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface HBaseReferenceCounted: release, retain
private static final org.slf4j.Logger LOG
public static final long FIXED_OVERHEAD
private int onDiskSizeWithoutHeader
private int uncompressedSizeWithoutHeader
private long prevBlockOffset
private int onDiskDataSizeWithHeader
Size on disk of header + data; calculated from onDiskSizeWithoutHeader when using HDFS checksum.
private ByteBuff buf
Be careful reading from this buf. Duplicate it and work on the duplicate, or else be sure to reset position and limit afterwards; otherwise there will be trouble down the road. TODO: Make this read-only once made.
We are using the ByteBuff type because ByteBuffer is not extensible, yet we need a ByteBuffer-like API across multiple ByteBuffers when reading from a cache such as BucketCache. Unfortunately, ByteBuff is spread all about HFileBlock. It would be good if it could be confined to cache use only, but that is hard to do.
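A minimal sketch of the "duplicate and work on the duplicate" advice; readMagicSafely is a hypothetical helper that would have to live inside HFileBlock, where the private buf is visible:

```java
// Hedged sketch: never read the shared buf directly; duplicate it first so that
// position/limit changes stay local to this reader.
private long readMagicSafely() {
  ByteBuff dup = this.buf.duplicate(); // same bytes, independent position and limit
  dup.rewind();
  // Read the 8-byte block-type magic record as an example access; this.buf's own
  // position and limit are untouched, so other readers see a consistent view.
  return dup.getLong();
}
```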
private HFileContext fileContext
private long offset
private int nextBlockOnDiskSize
private ByteBuffAllocator allocator
static final int CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD
private static int UNSET
public static final boolean FILL_HEADER
public static final boolean DONT_FILL_HEADER
public static final int MULTI_BYTE_BUFFER_HEAP_SIZE
public static final int BLOCK_METADATA_SPACE
static final int CHECKSUM_SIZE
static final byte[] DUMMY_HEADER_NO_CHECKSUM
public static final CacheableDeserializer<Cacheable> BLOCK_DESERIALIZER
Used deserializing blocks from Cache. The serialized layout is:
++++++++++++++
+ HFileBlock +
++++++++++++++
+ Checksums + <= Optional
++++++++++++++
+ Metadata! + <= See note on BLOCK_METADATA_SPACE above.
++++++++++++++
See also: serialize(ByteBuffer, boolean)
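A minimal sketch of producing that layout through the public Cacheable methods. The heap ByteBuffer stands in for whatever destination a cache's IOEngine would actually supply; the wrapper class and method name are hypothetical:

```java
import java.nio.ByteBuffer;
import org.apache.hadoop.hbase.io.hfile.Cacheable;
import org.apache.hadoop.hbase.io.hfile.CacheableDeserializer;
import org.apache.hadoop.hbase.io.hfile.HFileBlock;

final class BlockCacheSerializationSketch {
  // Hedged sketch: serialize a block for caching, then note the matching deserializer.
  static ByteBuffer serializeForCache(HFileBlock block) {
    ByteBuffer dest = ByteBuffer.allocate(block.getSerializedLength()); // block (+ checksums) + metadata
    block.serialize(dest, true); // true: include the next-block metadata as well
    dest.flip();
    // A cache rebuilds the block later with the deserializer the block advertises.
    CacheableDeserializer<Cacheable> deserializer = block.getDeserializer();
    return dest;
  }
}
```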
private static final int DESERIALIZER_IDENTIFIER
public HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator)
Creates a new HFile block from the given fields. This constructor is used only while writing blocks and caching, when the block is sitting in a byte buffer and we want to stuff it into the cache. See HFileBlock.Writer.getBlockForCaching(CacheConfig).
TODO: The caller presumes no checksumming is required of this block instance since it is going into the cache; the checksum has already been verified on the underlying block data pulled in from the filesystem. Is that correct? What if the cache is SSD?
TODO: HFile block writer can also be off-heap?
blockType - the type of this block, see BlockType
onDiskSizeWithoutHeader - see onDiskSizeWithoutHeader
uncompressedSizeWithoutHeader - see uncompressedSizeWithoutHeader
prevBlockOffset - see prevBlockOffset
buf - block buffer with header (HConstants.HFILEBLOCK_HEADER_SIZE bytes)
fillHeader - when true, write the first 4 header fields into the passed buffer
offset - the file offset the block was read from
onDiskDataSizeWithHeader - see onDiskDataSizeWithHeader
fileContext - HFile meta data
static HFileBlock createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) throws IOException
Creates a block from an existing buffer starting with a header.
buf - Has header, content, and trailing checksums if present.
IOException
private static int getOnDiskSizeWithHeader(ByteBuff headerBuf, boolean verifyChecksum)
headerBuf - Header ByteBuffer. Presumed exact size of header.
verifyChecksum - true if checksum verification is in use.
int getNextBlockOnDiskSize()
public BlockType getBlockType()
Description copied from interface: Cacheable
Returns the block type of this cached HFile block.
Specified by: getBlockType in interface Cacheable
public int refCnt()
Description copied from interface: Cacheable
Reference count of this Cacheable.
public HFileBlock retain()
Description copied from interface: Cacheable
Increases the reference count; the object's memory can only be freed when there are no references.
public boolean release()
Calls ByteBuff.release() to decrease the reference count; if there is no other reference, the ByteBuffer is returned to the ByteBuffAllocator.
public HFileBlock touch()
Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks; use touch(Object) to pass your own hint as well.
Specified by: touch in interface HBaseReferenceCounted
Specified by: touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
public HFileBlock touch(Object hint)
Specified by: touch in interface HBaseReferenceCounted
Specified by: touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
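A minimal usage sketch of the retain/touch/release contract spelled out above; useBlock and readCells are hypothetical caller-side methods:

```java
// Hedged sketch of the reference-counting life cycle.
void useBlock(HFileBlock block) {
  block.touch("example-caller");   // optional hint that helps diagnose buffer leaks
  try {
    readCells(block);              // hypothetical work done against the block's buffer
  } finally {
    // Decrease the reference count; once it reaches zero the backing ByteBuffer
    // is returned to the ByteBuffAllocator. A component that keeps a longer-lived
    // reference should call block.retain() before this caller releases.
    block.release();
  }
}
```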
short getDataBlockEncodingId()
public int getOnDiskSizeWithHeader()
int getOnDiskSizeWithoutHeader()
int getUncompressedSizeWithoutHeader()
long getPrevBlockOffset()
private void overwriteHeader()
Rewinds buf and writes the first 4 header fields. The buf position is modified as a side effect.
public ByteBuff getBufferWithoutHeader()
public ByteBuff getBufferReadOnly()
Returns a read-only duplicate of the buffer this block stores internally, ready to be read. It is used in CompoundBloomFilter to avoid object creation on every Bloom filter lookup, but has to be used with caution. The buffer holds the header, block content, and any follow-on checksums if present.
public ByteBuffAllocator getByteBuffAllocator()
private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) throws IOException
IOException
private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) throws IOException
IOException
void sanityCheck() throws IOException
Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields. Assumes a packed block structure. This function is primarily for testing and debugging, and is not thread-safe because it alters the internal buffer pointer. Used by tests only.
IOException
HFileBlock unpack(HFileContext fileContext, HFileBlock.FSReader reader) throws IOException
IOException
private ByteBuff allocateBufferForUnpacking()
public boolean isUnpacked()
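A minimal sketch of the unpack/isUnpacked pairing; because unpack(HFileContext, HFileBlock.FSReader) and HFileBlock.FSReader are package-private, this assumes the hypothetical helper lives in the org.apache.hadoop.hbase.io.hfile package:

```java
// Hedged sketch: make sure a block read from disk is decompressed/decrypted
// before handing it to readers.
static HFileBlock ensureUnpacked(HFileBlock block, HFileContext fileContext,
    HFileBlock.FSReader reader) throws IOException {
  // Blocks written without compression or encryption are already "unpacked".
  return block.isUnpacked() ? block : block.unpack(fileContext, reader);
}
```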
long getOffset()
Cannot be UNSET; must be a legitimate value. Used when re-making the BlockCacheKey when the block is returned to the cache.
DataInputStream getByteStream()
public long heapSize()
Description copied from interface: HeapSize
Returns the approximate 'exclusive deep size' of the implementing object.
public boolean isSharedMem()
Will be overridden by SharedMemHFileBlock or ExclusiveMemHFileBlock. Returns true by default.
void sanityCheckUncompressed() throws IOException
IOException
public int getSerializedLength()
Description copied from interface: Cacheable
Returns the length of the ByteBuffer required to serialize the object.
Specified by: getSerializedLength in interface Cacheable
public void serialize(ByteBuffer destination, boolean includeNextBlockMetadata)
Description copied from interface: Cacheable
Serializes its data into destination.
public ByteBuffer getMetaData(ByteBuffer bb)
private ByteBuffer addMetaData(ByteBuffer destination, boolean includeNextBlockMetadata)
Returns destination with metadata added.
public CacheableDeserializer<Cacheable> getDeserializer()
Description copied from interface: Cacheable
Returns the CacheableDeserializer instance which reconstructs the original object from a ByteBuffer.
Specified by: getDeserializer in interface Cacheable
DataBlockEncoding getDataBlockEncoding()
byte getChecksumType()
int getBytesPerChecksum()
int getOnDiskDataSizeWithHeader()
int totalChecksumBytes()
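The checksum sizing follows from the fields above: one CHECKSUM_SIZE (4-byte) value per bytesPerChecksum chunk of the header + data bytes. A hedged arithmetic sketch, assuming HBase checksums are enabled and that the code sits in the same package as these package-private members:

```java
// Hedged sketch of the checksum-size arithmetic.
int bytesPerChecksum = block.getBytesPerChecksum();        // data bytes covered by each checksum value
int dataWithHeader = block.getOnDiskDataSizeWithHeader();  // header + data, checksums excluded
int chunks = (dataWithHeader + bytesPerChecksum - 1) / bytesPerChecksum; // ceiling division
int checksumBytes = chunks * HFileBlock.CHECKSUM_SIZE;     // 4 bytes per chunk
// Under these assumptions checksumBytes should agree with block.totalChecksumBytes().
```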
public int headerSize()
public static int headerSize(boolean usesHBaseChecksum)
byte[] getDummyHeaderForVersion()
private static byte[] getDummyHeaderForVersion(boolean usesHBaseChecksum)
public HFileContext getHFileContext()
static String toStringHeader(ByteBuff buf) throws IOException
IOException
private static HFileBlockBuilder createBuilder(HFileBlock blk, ByteBuff newBuff)
Creates a new HFileBlockBuilder from the existing block and a new ByteBuff. The existing block might be an ExclusiveMemHFileBlock, but the new buffer might call for a SharedMemHFileBlock, or vice versa.
blk - the block to clone from
newBuff - the new buffer to use
private static HFileBlock shallowClone(HFileBlock blk, ByteBuff newBuf)
static HFileBlock deepCloneOnHeap(HFileBlock blk)