Package org.apache.hadoop.hbase.io.hfile
Class HFileReaderImpl
java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl
- All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.conf.Configurable, HFile.CachingBlockReader, HFile.Reader
- Direct Known Subclasses:
HFilePreadReader, HFileStreamReader
@Private
public abstract class HFileReaderImpl
extends Object
implements HFile.Reader, org.apache.hadoop.conf.Configurable
Implementation of HFile.Reader that can handle all HFile versions.
-
Nested Class Summary
Modifier and Type, Class, Description
static class
protected static class
- Scanner that operates on encoded data blocks.
static class
static class
- An exception thrown when an operation requiring a scanner to be seeked is invoked on a scanner that is not seeked.
-
Field Summary
Modifier and Type, Field, Description
protected final CacheConfig cacheConf
- Block cache configuration.
private org.apache.hadoop.conf.Configuration conf
protected ReaderContext context
protected HFileDataBlockEncoder dataBlockEncoder
- What kind of data block encoding should be used while reading, writing, and handling cache.
dataBlockIndexReader
- Data block index reader keeping the root data index in memory
protected final HFileInfo fileInfo
protected HFileBlock.FSReader fsBlockReader
- Filesystem-level block reader.
protected HFileContext hfileContext
static final int KEY_VALUE_LEN_SIZE
- The size of a (key length, value length) tuple that prefixes each entry in a data block.
private static final org.slf4j.Logger LOG
(package private) static final int MAX_MINOR_VERSION
- Maximum minor version supported by this HFile format
metaBlockIndexReader
- Meta block index reader -- always single level
(package private) static final int MIN_MINOR_VERSION
- Minimum minor version supported by this HFile format
static final int MINOR_VERSION_NO_CHECKSUM
- HFile minor version that does not support checksums
static final int MINOR_VERSION_WITH_CHECKSUM
- Minor versions in HFile starting with this number have HBase checksums
(package private) static final int MINOR_VERSION_WITH_FAKED_KEY
- Minor versions starting with this number have a faked index key
protected final String name
- File name to be used for block names
private IdLock offsetLock
- A "sparse lock" implementation that allows locking on a particular block identified by its offset.
protected final org.apache.hadoop.fs.Path path
- Path of file
static final int PBUF_TRAILER_MINOR_VERSION
- HFile minor version that introduced the pbuf file trailer
private final boolean
protected FixedFileTrailer trailer
-
Constructor Summary
Constructor, Description
HFileReaderImpl(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf)
- Opens an HFile.
-
Method Summary
Modifier and Type, Method, Description
void close()
private DataInput getBloomFilterMetadata(BlockType blockType)
private HFileBlock getCachedBlock(BlockCacheKey cacheKey, boolean cacheBlock, boolean useLock, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding)
- Retrieve block from cache.
getComparator
- Returns comparator
org.apache.hadoop.conf.Configuration getConf()
getDeleteBloomFilterMetadata
- Retrieves delete family Bloom filter metadata as appropriate for each HFile version.
getEffectiveEncodingInCache(boolean isCompaction)
long getEntries
- Returns number of KV entries in this HFile
getFileContext
- Return the file context of the HFile this reader belongs to
Optional<byte[]> getFirstRowKey
getGeneralBloomFilterMetadata
- Returns a buffer with the Bloom filter metadata.
Optional<byte[]> getLastRowKey
int getMajorVersion
getMetaBlock(String metaBlockName, boolean cacheBlock)
getName()
- Returns this reader's "name".
org.apache.hadoop.fs.Path getPath()
getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread)
- Create a Scanner on this file.
getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread, boolean isCompaction)
- Create a Scanner on this file.
getUncachedBlockReader
- For testing
boolean hasMVCCInfo
long indexSize
boolean isFileInfoLoaded
boolean isPrimaryReplicaReader
long length()
midKey()
boolean prefetchComplete
- Returns false if block prefetching was requested for this file and has not completed, true otherwise
boolean prefetchStarted
- Returns true if block prefetching was started after waiting for specified delay, false otherwise
readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding)
- Read in a file block.
readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding, boolean cacheOnly)
private void returnAndEvictBlock(BlockCache cache, BlockCacheKey cacheKey, Cacheable block)
void setConf(org.apache.hadoop.conf.Configuration conf)
void setDataBlockEncoder(HFileDataBlockEncoder dataBlockEncoder)
void setDataBlockIndexReader
void setMetaBlockIndexReader
private boolean shouldUseHeap(BlockType expectedBlockType, boolean cacheBlock)
- Whether we use heap or not depends on our intent to cache the block.
toString()
void unbufferStream
- To close the stream's socket.
private void validateBlockType(HFileBlock block, BlockType expectedBlockType)
- Compares the actual type of a block retrieved from cache or disk with its expected type and throws an exception in case of a mismatch.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.hadoop.hbase.io.hfile.HFile.Reader
close
-
Field Details
-
LOG
-
dataBlockIndexReader
Data block index reader keeping the root data index in memory
-
metaBlockIndexReader
Meta block index reader -- always single level
-
trailer
-
dataBlockEncoder
What kind of data block encoding should be used while reading, writing, and handling cache.
-
cacheConf
Block cache configuration.
-
context
-
fileInfo
-
path
Path of file
-
name
File name to be used for block names
-
conf
-
hfileContext
-
fsBlockReader
Filesystem-level block reader.
-
offsetLock
A "sparse lock" implementation that allows locking on a particular block identified by its offset. Its purpose is to prevent two clients from loading the same block at once: all but one client wait and then get the block from the cache.
-
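The locking idea described for offsetLock can be sketched in plain Java. This is a minimal illustration only, not the HBase IdLock or block cache API: the class OffsetLockedLoader, its maps, and the diskReader callback are hypothetical names introduced here, and the cache in this sketch never evicts.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

// Hypothetical sketch: one lock object per block offset, held only while loading.
final class OffsetLockedLoader {
  private final ConcurrentHashMap<Long, Object> locks = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
  final AtomicInteger diskReads = new AtomicInteger();

  byte[] readBlock(long offset, LongFunction<byte[]> diskReader) {
    byte[] cached = cache.get(offset);
    if (cached != null) {
      return cached; // fast path: no lock needed on a cache hit
    }
    Object lock = locks.computeIfAbsent(offset, k -> new Object());
    synchronized (lock) {
      try {
        // Re-check: another client may have loaded the block while we waited.
        cached = cache.get(offset);
        if (cached != null) {
          return cached;
        }
        byte[] loaded = diskReader.apply(offset);
        diskReads.incrementAndGet();
        cache.put(offset, loaded);
        return loaded;
      } finally {
        // Drop the entry so the map only holds offsets that are being loaded.
        locks.remove(offset, lock);
      }
    }
  }

  public static void main(String[] args) {
    OffsetLockedLoader loader = new OffsetLockedLoader();
    loader.readBlock(4096L, off -> new byte[] {1, 2, 3});
    loader.readBlock(4096L, off -> new byte[] {1, 2, 3});
    System.out.println("disk reads: " + loader.diskReads.get());
  }
}
```

Locking per offset rather than on the whole reader means that loads of different blocks do not serialize behind each other; only clients that want the same block wait.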
MIN_MINOR_VERSION
Minimum minor version supported by this HFile format
-
MAX_MINOR_VERSION
Maximum minor version supported by this HFile format
-
MINOR_VERSION_WITH_FAKED_KEY
Minor versions starting with this number have a faked index key
-
MINOR_VERSION_WITH_CHECKSUM
Minor versions in HFile starting with this number have HBase checksums
-
MINOR_VERSION_NO_CHECKSUM
HFile minor version that does not support checksums
-
PBUF_TRAILER_MINOR_VERSION
HFile minor version that introduced the pbuf file trailer
-
KEY_VALUE_LEN_SIZE
The size of a (key length, value length) tuple that prefixes each entry in a data block.
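To make the (key length, value length) prefix concrete, here is a small sketch that writes and reads one entry with such a prefix. It assumes both lengths are 4-byte ints, so the prefix is 8 bytes; that width, the class KeyValuePrefixSketch, and its helper methods are assumptions made for illustration and are not taken from this page.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of an entry laid out as [keyLen][valueLen][key][value].
final class KeyValuePrefixSketch {
  // Assumption: two 4-byte ints form the prefix.
  static final int KEY_VALUE_LEN_SIZE = 2 * Integer.BYTES;

  static byte[] encode(byte[] key, byte[] value) {
    ByteBuffer buf = ByteBuffer.allocate(KEY_VALUE_LEN_SIZE + key.length + value.length);
    buf.putInt(key.length).putInt(value.length).put(key).put(value);
    return buf.array();
  }

  // Returns {key, value} decoded as UTF-8 strings.
  static String[] decode(byte[] entry) {
    ByteBuffer buf = ByteBuffer.wrap(entry);
    int keyLen = buf.getInt();
    int valueLen = buf.getInt();
    byte[] key = new byte[keyLen];
    buf.get(key);
    byte[] value = new byte[valueLen];
    buf.get(value);
    return new String[] {
      new String(key, StandardCharsets.UTF_8), new String(value, StandardCharsets.UTF_8)
    };
  }

  public static void main(String[] args) {
    byte[] entry = encode("row1".getBytes(StandardCharsets.UTF_8),
        "v".getBytes(StandardCharsets.UTF_8));
    String[] kv = decode(entry);
    System.out.println(kv[0] + " -> " + kv[1] + " (" + entry.length + " bytes)");
  }
}
```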
-
-
Constructor Details
-
HFileReaderImpl
public HFileReaderImpl(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf) throws IOException
Opens an HFile.
- Parameters:
context - Reader context info
fileInfo - HFile info
cacheConf - Cache configuration.
conf - Configuration
- Throws:
IOException
-
-
Method Details
-
getCacheConf
-
toStringFirstKey
-
toStringLastKey
-
toString
-
length
- Specified by:
length in interface HFile.Reader
-
getFirstKey
- Specified by:
getFirstKey in interface HFile.Reader
- Returns:
- the first key in the file. May be null if file has no entries. Note that this is not the first row key, but rather the byte form of the first KeyValue.
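The difference between the first key and the first row key can be illustrated with a sketch that pulls the row out of a serialized key. The layout used below (2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type) is an assumption about the KeyValue key format, and RowFromKeySketch with its methods is a hypothetical helper, not part of this class.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: the row key is only one component of a full key.
final class RowFromKeySketch {
  // Assumed layout: [rowLen:2][row][famLen:1][family][qualifier][timestamp:8][type:1]
  static byte[] buildKey(byte[] row, byte[] family, byte[] qualifier, long ts, byte type) {
    ByteBuffer buf =
        ByteBuffer.allocate(2 + row.length + 1 + family.length + qualifier.length + 8 + 1);
    buf.putShort((short) row.length).put(row);
    buf.put((byte) family.length).put(family);
    buf.put(qualifier).putLong(ts).put(type);
    return buf.array();
  }

  // Extracts just the row bytes from a key in the assumed layout.
  static byte[] rowOf(byte[] key) {
    ByteBuffer buf = ByteBuffer.wrap(key);
    int rowLen = buf.getShort();
    byte[] row = new byte[rowLen];
    buf.get(row);
    return row;
  }

  public static void main(String[] args) {
    byte[] key = buildKey("row1".getBytes(StandardCharsets.UTF_8),
        "cf".getBytes(StandardCharsets.UTF_8), "q".getBytes(StandardCharsets.UTF_8), 1L, (byte) 4);
    System.out.println("key bytes: " + key.length
        + ", row: " + new String(rowOf(key), StandardCharsets.UTF_8));
  }
}
```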
-
getFirstRowKey
TODO left from HFile version 1: move this to StoreFile after Ryan's patch goes in to eliminate KeyValue here.
- Specified by:
getFirstRowKey in interface HFile.Reader
- Returns:
- the first row key, or null if the file is empty.
-
getLastRowKey
TODO left from HFile version 1: move this to StoreFile after Ryan's patch goes in to eliminate KeyValue here.
- Specified by:
getLastRowKey in interface HFile.Reader
- Returns:
- the last row key, or null if the file is empty.
-
getEntries
Returns number of KV entries in this HFile
- Specified by:
getEntries in interface HFile.Reader
-
getComparator
Returns comparator
- Specified by:
getComparator in interface HFile.Reader
-
getCompressionAlgorithm
-
indexSize
- Specified by:
indexSize in interface HFile.Reader
- Returns:
- the total heap size of data and meta block indexes in bytes. Does not take into account non-root blocks of a multilevel data index.
-
getName
Description copied from interface: HFile.Reader
Returns this reader's "name". Usually the last component of the path. The name must stay constant while the file is being moved, to support caching on write.
- Specified by:
getName in interface HFile.Reader
-
setDataBlockEncoder
- Specified by:
setDataBlockEncoder in interface HFile.Reader
-
setDataBlockIndexReader
- Specified by:
setDataBlockIndexReader in interface HFile.Reader
-
getDataBlockIndexReader
- Specified by:
getDataBlockIndexReader in interface HFile.Reader
-
setMetaBlockIndexReader
- Specified by:
setMetaBlockIndexReader in interface HFile.Reader
-
getMetaBlockIndexReader
- Specified by:
getMetaBlockIndexReader in interface HFile.Reader
-
getTrailer
- Specified by:
getTrailer in interface HFile.Reader
-
getContext
- Specified by:
getContext in interface HFile.Reader
-
getHFileInfo
- Specified by:
getHFileInfo in interface HFile.Reader
-
isPrimaryReplicaReader
- Specified by:
isPrimaryReplicaReader in interface HFile.Reader
-
getPath
- Specified by:
getPath in interface HFile.Reader
-
getDataBlockEncoding
- Specified by:
getDataBlockEncoding in interface HFile.Reader
-
getConf
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
setConf
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
getCachedBlock
private HFileBlock getCachedBlock(BlockCacheKey cacheKey, boolean cacheBlock, boolean useLock, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) throws IOException
Retrieve block from cache. Validates the retrieved block's type vs. expectedBlockType and its encoding vs. expectedDataBlockEncoding. Unpacks the block as necessary.
- Throws:
IOException
-
returnAndEvictBlock
-
getMetaBlock
- Specified by:
getMetaBlock in interface HFile.Reader
- Parameters:
cacheBlock - Add block to cache, if found
- Returns:
- block wrapped in a ByteBuffer, with header skipped
- Throws:
IOException
-
shouldUseHeap
Whether we use heap or not depends on our intent to cache the block. We want to avoid allocating to off-heap if we intend to cache into the on-heap L1 cache. Otherwise, it's more efficient to allocate to off-heap since we can control GC ourselves for those. So our decision here breaks down as follows:
- If block cache is disabled, don't use heap.
- If we're not using the CombinedBlockCache, use heap unless caching is disabled for the request.
- Otherwise, only use heap if caching is enabled and the expected block type is not DATA (which goes to off-heap L2 in combined cache).
-
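The three-way decision described for shouldUseHeap can be written as a small standalone function. This is a sketch under stated assumptions: the cache state is reduced to plain booleans, and HeapDecisionSketch, BlockKind, and the parameter names are hypothetical stand-ins rather than the real CacheConfig or BlockType API.

```java
// Hypothetical sketch of the heap vs. off-heap allocation decision.
final class HeapDecisionSketch {
  enum BlockKind { DATA, INDEX, BLOOM, META }

  static boolean shouldUseHeap(boolean blockCacheEnabled, boolean combinedCache,
      boolean cacheBlock, BlockKind expected) {
    if (!blockCacheEnabled) {
      return false; // nothing will be cached on heap, so allocate off-heap
    }
    if (!combinedCache) {
      return cacheBlock; // on-heap cache only: use heap unless caching is off for this request
    }
    // Combined cache: DATA blocks go to the off-heap L2, everything else to on-heap L1.
    return cacheBlock && expected != BlockKind.DATA;
  }

  public static void main(String[] args) {
    System.out.println(shouldUseHeap(true, true, true, BlockKind.DATA));
    System.out.println(shouldUseHeap(true, true, true, BlockKind.INDEX));
  }
}
```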
readBlock
public HFileBlock readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) throws IOException
Description copied from interface: HFile.CachingBlockReader
Read in a file block.
- Specified by:
readBlock in interface HFile.CachingBlockReader
- Parameters:
dataBlockOffset - offset to read.
onDiskBlockSize - size of the block
isCompaction - is this block being read as part of a compaction
expectedBlockType - the block type we are expecting to read with this read operation, or null to read whatever block type is available and avoid checking (that might reduce caching efficiency of encoded data blocks)
expectedDataBlockEncoding - the data block encoding the caller is expecting data blocks to be in, or null to not perform this check and return the block irrespective of the encoding. This check only applies to data blocks and can be set to null when the caller is expecting to read a non-data block and has set expectedBlockType accordingly.
- Returns:
- Block wrapped in a ByteBuffer.
- Throws:
IOException
-
readBlock
public HFileBlock readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding, boolean cacheOnly) throws IOException
- Specified by:
readBlock in interface HFile.CachingBlockReader
- Throws:
IOException
-
hasMVCCInfo
- Specified by:
hasMVCCInfo in interface HFile.Reader
-
validateBlockType
Compares the actual type of a block retrieved from cache or disk with its expected type and throws an exception in case of a mismatch. An expected block type of BlockType.DATA is considered to match the actual block type BlockType.ENCODED_DATA as well.
- Parameters:
block - a block retrieved from cache or disk
expectedBlockType - the expected block type, or null to skip the check
- Throws:
IOException
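The matching rule described for validateBlockType can be sketched as follows. The enum is a reduced, hypothetical stand-in for BlockType, and BlockTypeCheckSketch works on bare enum values instead of HFileBlock instances, so treat it as an illustration of the rule and not as the actual implementation.

```java
import java.io.IOException;

// Hypothetical sketch of the expected-vs-actual block type check.
final class BlockTypeCheckSketch {
  enum BlockType { DATA, ENCODED_DATA, LEAF_INDEX, META }

  static boolean matches(BlockType actual, BlockType expected) {
    if (expected == null) {
      return true; // null means the caller asked to skip the check
    }
    if (expected == BlockType.DATA && actual == BlockType.ENCODED_DATA) {
      return true; // an encoded data block satisfies a request for a data block
    }
    return actual == expected;
  }

  static void validate(BlockType actual, BlockType expected) throws IOException {
    if (!matches(actual, expected)) {
      throw new IOException("Expected block type " + expected + ", but got " + actual);
    }
  }

  public static void main(String[] args) throws IOException {
    validate(BlockType.ENCODED_DATA, BlockType.DATA); // passes
    System.out.println(matches(BlockType.META, BlockType.DATA));
  }
}
```

Note that the rule is one-directional in this sketch: an expected ENCODED_DATA type is not satisfied by a plain DATA block.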
-
getLastKey
- Specified by:
getLastKey in interface HFile.Reader
- Returns:
- Last key as cell in the file. May be null if file has no entries. Note that this is not the last row key, but it is the Cell representation of the last key
-
midKey
- Specified by:
midKey in interface HFile.Reader
- Returns:
- Midkey for this file. We work with block boundaries only so returned midkey is an approximation only.
- Throws:
IOException
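One simple way to see why a mid key based on block boundaries is only approximate: if only the first key of each block is known, the best available "middle" is the first key of the middle block, regardless of how many entries each block holds. The sketch below assumes exactly that simplified model; MidKeySketch and approximateMidKey are hypothetical and do not reproduce how the block index actually computes the mid key.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: pick the first key of the middle block as the mid key.
final class MidKeySketch {
  static Optional<String> approximateMidKey(List<String> blockFirstKeys) {
    if (blockFirstKeys.isEmpty()) {
      return Optional.empty(); // no blocks, so no mid key
    }
    // Only block boundaries are visible, so the result can be off by up to a block.
    return Optional.of(blockFirstKeys.get(blockFirstKeys.size() / 2));
  }

  public static void main(String[] args) {
    System.out.println(approximateMidKey(List.of("a", "f", "m", "t")));
  }
}
```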
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
getEffectiveEncodingInCache
- Specified by:
getEffectiveEncodingInCache in interface HFile.Reader
-
getUncachedBlockReader
For testing
- Specified by:
getUncachedBlockReader in interface HFile.Reader
-
getGeneralBloomFilterMetadata
Returns a buffer with the Bloom filter metadata. The caller takes ownership of the buffer.
- Specified by:
getGeneralBloomFilterMetadata in interface HFile.Reader
- Throws:
IOException
-
getDeleteBloomFilterMetadata
Description copied from interface: HFile.Reader
Retrieves delete family Bloom filter metadata as appropriate for each HFile version. Knows nothing about how that metadata is structured.
- Specified by:
getDeleteBloomFilterMetadata in interface HFile.Reader
- Throws:
IOException
-
getBloomFilterMetadata
- Throws:
IOException
-
isFileInfoLoaded
-
getFileContext
Description copied from interface: HFile.Reader
Return the file context of the HFile this reader belongs to
- Specified by:
getFileContext in interface HFile.Reader
-
prefetchComplete
Returns false if block prefetching was requested for this file and has not completed, true otherwise
- Specified by:
prefetchComplete in interface HFile.Reader
-
prefetchStarted
Returns true if block prefetching was started after waiting for specified delay, false otherwise
- Specified by:
prefetchStarted in interface HFile.Reader
-
getScanner
public HFileScanner getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread)
Create a Scanner on this file. No seeks or reads are done on creation. Call HFileScanner.seekTo(ExtendedCell) to position and start the read. There is nothing to clean up in a Scanner. Letting go of your references to the scanner is sufficient. NOTE: Do not use this overload of getScanner for compactions. See getScanner(Configuration, boolean, boolean, boolean).
- Specified by:
getScanner in interface HFile.Reader
- Parameters:
conf - Store configuration.
cacheBlocks - True if we should cache blocks read in by this scanner.
pread - Use positional read rather than seek+read if true (pread is better for random reads, seek+read is better for scanning).
- Returns:
- Scanner on this file.
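The pread parameter chooses between two read styles. The sketch below shows the general distinction using only the JDK on a local temporary file; it does not use the Hadoop or HBase stream classes. PreadVsSeekSketch and its methods are hypothetical names. A positional read takes the offset as an argument and leaves the channel position alone, while seek+read moves a shared file pointer first.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch contrasting positional read with seek+read.
final class PreadVsSeekSketch {
  // Positional read: the channel's own position is not modified.
  static byte[] pread(FileChannel channel, long offset, int length) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(length);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = channel.read(buf, pos);
      if (n < 0) {
        break; // end of file
      }
      pos += n;
    }
    return Arrays.copyOf(buf.array(), buf.position());
  }

  // Seek+read: moves the file pointer, then reads sequentially from it.
  static byte[] seekThenRead(RandomAccessFile file, long offset, int length) throws IOException {
    file.seek(offset);
    byte[] out = new byte[length];
    file.readFully(out);
    return out;
  }

  // Writes a small temp file and checks that both styles return the same bytes.
  static boolean selfCheck() {
    try {
      Path tmp = Files.createTempFile("pread-sketch", ".bin");
      try {
        Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ);
            RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
          byte[] a = pread(ch, 4, 3);
          byte[] b = seekThenRead(raf, 4, 3);
          return Arrays.equals(a, new byte[] {4, 5, 6}) && Arrays.equals(a, b)
              && ch.position() == 0;
        }
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("both styles agree: " + selfCheck());
  }
}
```

Because a positional read does not touch shared state, several threads can issue random reads against one handle, which fits the random-read case named above; a long sequential scan benefits from keeping one moving file pointer.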
-
getScanner
public HFileScanner getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread, boolean isCompaction)
Create a Scanner on this file. No seeks or reads are done on creation. Call HFileScanner.seekTo(ExtendedCell) to position and start the read. There is nothing to clean up in a Scanner. Letting go of your references to the scanner is sufficient.
- Specified by:
getScanner in interface HFile.Reader
- Parameters:
conf - Store configuration.
cacheBlocks - True if we should cache blocks read in by this scanner.
pread - Use positional read rather than seek+read if true (pread is better for random reads, seek+read is better for scanning).
isCompaction - is scanner being used for a compaction?
- Returns:
- Scanner on this file.
-
getMajorVersion
-
unbufferStream
Description copied from interface: HFile.Reader
Closes the stream's socket. Note: This can be called concurrently from multiple threads, and the implementation should take care of thread safety.
- Specified by:
unbufferStream in interface HFile.Reader
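One possible way to meet the thread-safety note for unbufferStream is to guard the stream with a lock and skip the call when another thread holds it. This is a sketch of that general technique only; UnbufferSketch, streamLock, and the counter standing in for the real release of buffers and sockets are assumptions, and this page does not say how the actual implementation synchronizes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a non-blocking, thread-safe unbuffer call.
final class UnbufferSketch {
  private final ReentrantLock streamLock = new ReentrantLock();
  final AtomicInteger unbufferCalls = new AtomicInteger();

  // Returns true if this call released the stream resources, false if it was skipped.
  boolean unbufferStream() {
    if (!streamLock.tryLock()) {
      return false; // another thread is using the stream; do not block or interfere
    }
    try {
      unbufferCalls.incrementAndGet(); // placeholder for releasing buffers and sockets
      return true;
    } finally {
      streamLock.unlock();
    }
  }

  public static void main(String[] args) {
    UnbufferSketch reader = new UnbufferSketch();
    System.out.println("released: " + reader.unbufferStream());
  }
}
```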
-