Package org.apache.hadoop.hbase.io.hfile
Class HFileReaderImpl
java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl
- All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.hadoop.conf.Configurable, HFile.CachingBlockReader, HFile.Reader
- Direct Known Subclasses:
HFilePreadReader, HFileStreamReader
@Private
public abstract class HFileReaderImpl
extends Object
implements HFile.Reader, org.apache.hadoop.conf.Configurable
Implementation of HFile.Reader that can handle all HFile versions.
Nested Class Summary
Nested Classes
Modifier and Type, Class - Description
static class
protected static class - Scanner that operates on encoded data blocks.
static class
static class - An exception thrown when an operation requiring a scanner to be seeked is invoked on a scanner that is not seeked.
Field Summary
Fields
Modifier and Type, Field - Description
protected final CacheConfig cacheConf - Block cache configuration.
private org.apache.hadoop.conf.Configuration conf
protected ReaderContext context
protected HFileDataBlockEncoder dataBlockEncoder - What kind of data block encoding should be used while reading, writing, and handling cache.
dataBlockIndexReader - Data block index reader keeping the root data index in memory.
protected final HFileInfo fileInfo
protected HFileBlock.FSReader fsBlockReader - Filesystem-level block reader.
protected HFileContext hfileContext
static final int KEY_VALUE_LEN_SIZE - The size of a (key length, value length) tuple that prefixes each entry in a data block.
private static final org.slf4j.Logger LOG
(package private) static final int MAX_MINOR_VERSION - Maximum minor version supported by this HFile format.
metaBlockIndexReader - Meta block index reader; always single level.
(package private) static final int MIN_MINOR_VERSION - Minimum minor version supported by this HFile format.
static final int MINOR_VERSION_NO_CHECKSUM - HFile minor version that does not support checksums.
static final int MINOR_VERSION_WITH_CHECKSUM - HFile minor versions starting with this number have HBase checksums.
(package private) static final int MINOR_VERSION_WITH_FAKED_KEY - Minor versions starting with this number have a faked index key.
protected final String name - File name to be used for block names.
private IdLock offsetLock - A "sparse lock" implementation that allows locking on a particular block identified by offset.
protected final org.apache.hadoop.fs.Path path - Path of file.
static final int PBUF_TRAILER_MINOR_VERSION - HFile minor version that introduced the protobuf (pbuf) file trailer.
private final boolean
protected FixedFileTrailer trailer
Constructor Summary
Constructors
Constructor - Description
HFileReaderImpl(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf) - Opens an HFile.
Method Summary
Modifier and Type, Method - Description
void close()
private DataInput getBloomFilterMetadata(BlockType blockType)
getCacheConf
getCachedBlock(BlockCacheKey cacheKey, boolean cacheBlock, boolean useLock, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) - Retrieve block from cache.
getComparator - Returns comparator
getCompressionAlgorithm
org.apache.hadoop.conf.Configuration getConf()
getContext
getDataBlockEncoding
getDataBlockIndexReader
getDeleteBloomFilterMetadata - Retrieves delete family Bloom filter metadata as appropriate for each HFile version.
getEffectiveEncodingInCache(boolean isCompaction)
long getEntries - Returns number of KV entries in this HFile
getFileContext - Return the file context of the HFile this reader belongs to
getFirstKey
Optional<byte[]> getFirstRowKey
getGeneralBloomFilterMetadata - Returns a buffer with the Bloom filter metadata.
getHFileInfo
getLastKey
Optional<byte[]> getLastRowKey
int getMajorVersion
getMetaBlock(String metaBlockName, boolean cacheBlock)
getName() - Returns this reader's "name".
org.apache.hadoop.fs.Path getPath()
getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread) - Create a Scanner on this file.
getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread, boolean isCompaction) - Create a Scanner on this file.
getTrailer
getUncachedBlockReader - For testing
boolean hasMVCCInfo
long indexSize
boolean isFileInfoLoaded
boolean isPrimaryReplicaReader
long length()
midKey()
boolean prefetchComplete - Returns false if block prefetching was requested for this file and has not completed, true otherwise
boolean prefetchStarted - Returns true if block prefetching was started after waiting for specified delay, false otherwise
readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) - Read in a file block.
readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding, boolean cacheOnly)
private void returnAndEvictBlock(BlockCache cache, BlockCacheKey cacheKey, Cacheable block)
void setConf(org.apache.hadoop.conf.Configuration conf)
void setDataBlockEncoder(HFileDataBlockEncoder dataBlockEncoder)
void setDataBlockIndexReader
void setMetaBlockIndexReader
private boolean shouldUseHeap(BlockType expectedBlockType, boolean cacheBlock) - Whether we use heap or not depends on our intent to cache the block.
toString()
toStringFirstKey
toStringLastKey
void unbufferStream - To close the stream's socket.
private void validateBlockType(HFileBlock block, BlockType expectedBlockType) - Compares the actual type of a block retrieved from cache or disk with its expected type and throws an exception in case of a mismatch.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.hadoop.hbase.io.hfile.HFile.Reader
close
-
Field Details
-
LOG
-
dataBlockIndexReader
Data block index reader keeping the root data index in memory -
metaBlockIndexReader
Meta block index reader -- always single level -
trailer
-
-
dataBlockEncoder
What kind of data block encoding should be used while reading, writing, and handling cache. -
cacheConf
Block cache configuration. -
context
-
fileInfo
-
path
Path of file -
name
File name to be used for block names -
conf
-
hfileContext
-
fsBlockReader
Filesystem-level block reader. -
offsetLock
A "sparse lock" implementation that allows locking on a particular block identified by offset. Its purpose is to prevent two clients from loading the same block at the same time: all but one client wait and then get the block from the cache. -
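The idea of a per-offset lock can be illustrated with a minimal sketch. It is not the HBase IdLock implementation: the class OffsetLockSketch, the loadFromDisk helper, and the diskReads counter are hypothetical names introduced only for this example, and unlike a real sparse lock this sketch never removes lock entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a per-offset lock plus a block cache; not the HBase IdLock. */
final class OffsetLockSketch {
  // One lock per block offset. Assumption: entries are kept forever to keep the sketch short.
  private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
  final AtomicInteger diskReads = new AtomicInteger();

  byte[] readBlock(long offset) {
    byte[] cached = cache.get(offset);
    if (cached != null) {
      return cached; // fast path: no locking when the block is already cached
    }
    ReentrantLock lock = locks.computeIfAbsent(offset, k -> new ReentrantLock());
    lock.lock();
    try {
      // Re-check under the lock: another thread may have loaded the block while we waited.
      cached = cache.get(offset);
      if (cached != null) {
        return cached;
      }
      byte[] block = loadFromDisk(offset);
      cache.put(offset, block);
      return block;
    } finally {
      lock.unlock();
    }
  }

  // Hypothetical disk read; counts calls so duplicate loads would be visible.
  private byte[] loadFromDisk(long offset) {
    diskReads.incrementAndGet();
    return new byte[] { (byte) offset };
  }

  public static void main(String[] args) throws InterruptedException {
    OffsetLockSketch reader = new OffsetLockSketch();
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(() -> reader.readBlock(4096L));
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println("disk reads: " + reader.diskReads.get()); // prints "disk reads: 1"
  }
}
```

Readers of different offsets never contend with each other, which is the point of locking per block rather than per file.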
MIN_MINOR_VERSION
Minimum minor version supported by this HFile format.
-
MAX_MINOR_VERSION
Maximum minor version supported by this HFile format.
-
MINOR_VERSION_WITH_FAKED_KEY
Minor versions starting with this number have a faked index key.
-
MINOR_VERSION_WITH_CHECKSUM
HFile minor versions starting with this number have HBase checksums.
-
MINOR_VERSION_NO_CHECKSUM
HFile minor version that does not support checksums.
-
PBUF_TRAILER_MINOR_VERSION
HFile minor version that introduced the protobuf (pbuf) file trailer.
-
KEY_VALUE_LEN_SIZE
The size of a (key length, value length) tuple that prefixes each entry in a data block.
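To make the length prefix concrete, here is a small sketch of writing and reading one entry. It assumes both lengths are stored as 4-byte ints, so the prefix is 8 bytes; the class KeyValueLenSketch and its encode and decode helpers are hypothetical, and a real HFile entry carries more than a bare key and value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a (key length, value length) prefix; not HBase code. */
final class KeyValueLenSketch {
  // Assumption: key length and value length are each a 4-byte int.
  static final int KEY_VALUE_LEN_SIZE = 2 * Integer.BYTES;

  /** Writes [keyLen][valueLen][key bytes][value bytes] into a new buffer. */
  static ByteBuffer encode(String key, String value) {
    byte[] k = key.getBytes(StandardCharsets.UTF_8);
    byte[] v = value.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = ByteBuffer.allocate(KEY_VALUE_LEN_SIZE + k.length + v.length);
    buf.putInt(k.length).putInt(v.length).put(k).put(v);
    buf.flip(); // switch from writing to reading
    return buf;
  }

  /** Reads the two lengths first, then exactly that many key and value bytes. */
  static String[] decode(ByteBuffer buf) {
    int keyLen = buf.getInt();
    int valueLen = buf.getInt();
    byte[] k = new byte[keyLen];
    byte[] v = new byte[valueLen];
    buf.get(k);
    buf.get(v);
    return new String[] {
      new String(k, StandardCharsets.UTF_8), new String(v, StandardCharsets.UTF_8)
    };
  }

  public static void main(String[] args) {
    ByteBuffer entry = encode("row1", "v");
    String[] kv = decode(entry);
    System.out.println("key=" + kv[0] + " value=" + kv[1]); // prints "key=row1 value=v"
  }
}
```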
-
-
Constructor Details
-
HFileReaderImpl
public HFileReaderImpl(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf) throws IOException
Opens an HFile.
- Parameters:
context - Reader context info
fileInfo - HFile info
cacheConf - Cache configuration.
conf - Configuration
- Throws:
IOException
-
-
Method Details
-
getCacheConf
-
toStringFirstKey
-
toStringLastKey
-
toString
-
length
- Specified by:
length in interface HFile.Reader
-
getFirstKey
- Specified by:
getFirstKey in interface HFile.Reader
- Returns:
the first key in the file. May be null if file has no entries. Note that this is not the first row key, but rather the byte form of the first KeyValue.
-
getFirstRowKey
TODO left from HFile version 1: move this to StoreFile after Ryan's patch goes in to eliminate KeyValue here.
- Specified by:
getFirstRowKey in interface HFile.Reader
- Returns:
the first row key, or null if the file is empty.
-
getLastRowKey
TODO left from HFile version 1: move this to StoreFile after Ryan's patch goes in to eliminate KeyValue here.
- Specified by:
getLastRowKey in interface HFile.Reader
- Returns:
the last row key, or null if the file is empty.
-
getEntries
Returns number of KV entries in this HFile
- Specified by:
getEntries in interface HFile.Reader
-
getComparator
Returns comparator
- Specified by:
getComparator in interface HFile.Reader
-
getCompressionAlgorithm
-
indexSize
- Specified by:
indexSize in interface HFile.Reader
- Returns:
the total heap size of data and meta block indexes in bytes. Does not take into account non-root blocks of a multilevel data index.
-
getName
Description copied from interface: HFile.Reader
Returns this reader's "name". Usually the last component of the path. Needs to be constant as the file is being moved to support caching on write.
- Specified by:
getName in interface HFile.Reader
-
setDataBlockEncoder
- Specified by:
setDataBlockEncoder in interface HFile.Reader
-
setDataBlockIndexReader
- Specified by:
setDataBlockIndexReader in interface HFile.Reader
-
getDataBlockIndexReader
- Specified by:
getDataBlockIndexReader in interface HFile.Reader
-
setMetaBlockIndexReader
- Specified by:
setMetaBlockIndexReader in interface HFile.Reader
-
getMetaBlockIndexReader
- Specified by:
getMetaBlockIndexReader in interface HFile.Reader
-
getTrailer
- Specified by:
getTrailer in interface HFile.Reader
-
getContext
- Specified by:
getContext in interface HFile.Reader
-
getHFileInfo
- Specified by:
getHFileInfo in interface HFile.Reader
-
isPrimaryReplicaReader
- Specified by:
isPrimaryReplicaReader in interface HFile.Reader
-
getPath
- Specified by:
getPath in interface HFile.Reader
-
getDataBlockEncoding
- Specified by:
getDataBlockEncoding in interface HFile.Reader
-
getConf
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
setConf
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
getCachedBlock
@LimitedPrivate("Unittest") public HFileBlock getCachedBlock(BlockCacheKey cacheKey, boolean cacheBlock, boolean useLock, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) throws IOException
Retrieve block from cache. Validates the retrieved block's type vs. expectedBlockType and its encoding vs. expectedDataBlockEncoding. Unpacks the block as necessary.
- Throws:
IOException
-
returnAndEvictBlock
-
getMetaBlock
- Specified by:
getMetaBlock in interface HFile.Reader
- Parameters:
cacheBlock - Add block to cache, if found
- Returns:
block wrapped in a ByteBuffer, with header skipped
- Throws:
IOException
-
shouldUseHeap
Whether we use heap or not depends on our intent to cache the block. We want to avoid allocating to off-heap if we intend to cache into the on-heap L1 cache. Otherwise, it's more efficient to allocate to off-heap since we can control GC ourselves for those. So our decision here breaks down as follows:
- If block cache is disabled, don't use heap.
- If we're not using the CombinedBlockCache, use heap unless caching is disabled for the request.
- Otherwise, only use heap if caching is enabled and the expected block type is not DATA (which goes to off-heap L2 in combined cache).
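The three-way decision described above can be sketched as a pure function. This is an illustration of the documented rules only, not the actual method body: the class HeapDecisionSketch, the BlockKind enum, and the boolean parameters are hypothetical stand-ins for state that the real reader takes from its CacheConfig, and a null expected block type is not modeled.

```java
/** Hypothetical restatement of the documented heap/off-heap decision; not HBase code. */
final class HeapDecisionSketch {
  // Simplified block categories; only the DATA vs. non-DATA distinction matters here.
  enum BlockKind { DATA, INDEX, BLOOM }

  static boolean shouldUseHeap(
      boolean blockCacheEnabled, boolean combinedBlockCache, boolean cacheBlock, BlockKind expected) {
    if (!blockCacheEnabled) {
      return false; // nothing will be cached, so allocate off-heap
    }
    if (!combinedBlockCache) {
      return cacheBlock; // on-heap cache only: use heap exactly when this request caches
    }
    // Combined cache: DATA blocks go to the off-heap L2, everything else to on-heap L1.
    return cacheBlock && expected != BlockKind.DATA;
  }

  public static void main(String[] args) {
    System.out.println(shouldUseHeap(false, true, true, BlockKind.INDEX)); // prints "false"
    System.out.println(shouldUseHeap(true, false, true, BlockKind.DATA));  // prints "true"
    System.out.println(shouldUseHeap(true, true, true, BlockKind.DATA));   // prints "false"
    System.out.println(shouldUseHeap(true, true, true, BlockKind.INDEX));  // prints "true"
  }
}
```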
readBlock
public HFileBlock readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) throws IOException
Description copied from interface: HFile.CachingBlockReader
Read in a file block.
- Specified by:
readBlock in interface HFile.CachingBlockReader
- Parameters:
dataBlockOffset - offset to read.
onDiskBlockSize - size of the block
isCompaction - is this block being read as part of a compaction
expectedBlockType - the block type we are expecting to read with this read operation, or null to read whatever block type is available and avoid checking (that might reduce caching efficiency of encoded data blocks)
expectedDataBlockEncoding - the data block encoding the caller is expecting data blocks to be in, or null to not perform this check and return the block irrespective of the encoding. This check only applies to data blocks and can be set to null when the caller is expecting to read a non-data block and has set expectedBlockType accordingly.
- Returns:
Block wrapped in a ByteBuffer.
- Throws:
IOException
-
readBlock
public HFileBlock readBlock(long dataBlockOffset, long onDiskBlockSize, boolean cacheBlock, boolean pread, boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding, boolean cacheOnly) throws IOException
- Specified by:
readBlock in interface HFile.CachingBlockReader
- Throws:
IOException
-
hasMVCCInfo
- Specified by:
hasMVCCInfo in interface HFile.Reader
-
validateBlockType
Compares the actual type of a block retrieved from cache or disk with its expected type and throws an exception in case of a mismatch. An expected block type of BlockType.DATA is considered to match the actual block type BlockType.ENCODED_DATA as well.
- Parameters:
block - a block retrieved from cache or disk
expectedBlockType - the expected block type, or null to skip the check
- Throws:
IOException
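The matching rule described above can be sketched in a few lines. This is a minimal illustration rather than the real method: the class BlockTypeCheckSketch and its reduced BlockType enum are hypothetical, and the sketch compares two enum values directly instead of inspecting an HFileBlock.

```java
import java.io.IOException;

/** Hypothetical restatement of the documented block type check; not HBase code. */
final class BlockTypeCheckSketch {
  // Reduced set of block types, enough to show the DATA / ENCODED_DATA rule.
  enum BlockType { DATA, ENCODED_DATA, LEAF_INDEX, META }

  static void validateBlockType(BlockType actual, BlockType expected) throws IOException {
    if (expected == null) {
      return; // null means the caller asked to skip the check
    }
    if (expected == BlockType.DATA && actual == BlockType.ENCODED_DATA) {
      return; // an encoded data block satisfies a request for a data block
    }
    if (actual != expected) {
      throw new IOException("Expected block type " + expected + ", but got " + actual);
    }
  }

  public static void main(String[] args) throws IOException {
    validateBlockType(BlockType.ENCODED_DATA, BlockType.DATA); // accepted
    validateBlockType(BlockType.META, null);                   // check skipped
    try {
      validateBlockType(BlockType.META, BlockType.DATA);
    } catch (IOException e) {
      System.out.println(e.getMessage()); // prints "Expected block type DATA, but got META"
    }
  }
}
```

Note that the rule is one-directional in this sketch: expecting ENCODED_DATA and receiving DATA is still reported as a mismatch.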
-
getLastKey
- Specified by:
getLastKey in interface HFile.Reader
- Returns:
Last key as cell in the file. May be null if file has no entries. Note that this is not the last row key, but it is the Cell representation of the last key.
-
midKey
- Specified by:
midKey in interface HFile.Reader
- Returns:
Midkey for this file. We work with block boundaries only, so the returned midkey is only an approximation.
- Throws:
IOException
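Why a block-boundary midkey is only approximate can be shown with a toy index. The sketch below is not how HFileReaderImpl computes its midkey; it assumes a hypothetical flat list holding the first key of each data block and simply returns the first key of the middle block.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical block-boundary midkey over a flat list of block first keys; not HBase code. */
final class MidKeySketch {
  /** Returns the first key of the middle block, or empty when there are no blocks. */
  static Optional<String> midKey(List<String> blockFirstKeys) {
    if (blockFirstKeys.isEmpty()) {
      return Optional.empty();
    }
    // Blocks may hold different numbers of entries, so this key splits the
    // file by block count, not by exact entry count.
    return Optional.of(blockFirstKeys.get(blockFirstKeys.size() / 2));
  }

  public static void main(String[] args) {
    List<String> firstKeys = Arrays.asList("a", "f", "m", "t");
    System.out.println(midKey(firstKeys).get()); // prints "m"
  }
}
```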
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
getEffectiveEncodingInCache
- Specified by:
getEffectiveEncodingInCache in interface HFile.Reader
-
getUncachedBlockReader
For testing
- Specified by:
getUncachedBlockReader in interface HFile.Reader
-
getGeneralBloomFilterMetadata
Returns a buffer with the Bloom filter metadata. The caller takes ownership of the buffer.
- Specified by:
getGeneralBloomFilterMetadata in interface HFile.Reader
- Throws:
IOException
-
getDeleteBloomFilterMetadata
Description copied from interface: HFile.Reader
Retrieves delete family Bloom filter metadata as appropriate for each HFile version. Knows nothing about how that metadata is structured.
- Specified by:
getDeleteBloomFilterMetadata in interface HFile.Reader
- Throws:
IOException
-
getBloomFilterMetadata
- Throws:
IOException
-
isFileInfoLoaded
-
getFileContext
Description copied from interface: HFile.Reader
Return the file context of the HFile this reader belongs to
- Specified by:
getFileContext in interface HFile.Reader
-
prefetchComplete
Returns false if block prefetching was requested for this file and has not completed, true otherwise
- Specified by:
prefetchComplete in interface HFile.Reader
-
prefetchStarted
Returns true if block prefetching was started after waiting for specified delay, false otherwise
- Specified by:
prefetchStarted in interface HFile.Reader
-
getScanner
public HFileScanner getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread)
Create a Scanner on this file. No seeks or reads are done on creation. Call HFileScanner.seekTo(ExtendedCell) to position and start the read. There is nothing to clean up in a Scanner. Letting go of your references to the scanner is sufficient. NOTE: Do not use this overload of getScanner for compactions. See getScanner(Configuration, boolean, boolean, boolean).
- Specified by:
getScanner in interface HFile.Reader
- Parameters:
conf - Store configuration.
cacheBlocks - True if we should cache blocks read in by this scanner.
pread - Use positional read rather than seek+read if true (pread is better for random reads, seek+read is better for scanning).
- Returns:
Scanner on this file.
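The difference between a positional read and seek+read, which the pread parameter selects, can be illustrated with plain java.nio on a local file. This is an analogy only: HBase reads through Hadoop input streams rather than FileChannel, and the class PreadVsSeekSketch with its pread and seekRead helpers is hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical java.nio analogy for pread vs. seek+read; not HBase code. */
final class PreadVsSeekSketch {
  /** Positional read: leaves the channel position untouched, so readers can share a channel. */
  static byte[] pread(FileChannel ch, long offset, int len) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(len);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = ch.read(buf, pos);
      if (n < 0) {
        break; // end of file
      }
      pos += n;
    }
    return Arrays.copyOf(buf.array(), buf.position());
  }

  /** Seek+read: moves the channel position, which suits one reader scanning forward. */
  static byte[] seekRead(FileChannel ch, long offset, int len) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(len);
    ch.position(offset);
    while (buf.hasRemaining()) {
      if (ch.read(buf) < 0) {
        break; // end of file
      }
    }
    return Arrays.copyOf(buf.array(), buf.position());
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("pread-sketch", ".bin");
    Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      String a = new String(pread(ch, 3, 4), StandardCharsets.US_ASCII);
      System.out.println(a + " position=" + ch.position()); // prints "3456 position=0"
      String b = new String(seekRead(ch, 3, 4), StandardCharsets.US_ASCII);
      System.out.println(b + " position=" + ch.position()); // prints "3456 position=7"
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

Both calls return the same bytes; what differs is the shared position state, which is why positional reads fit concurrent random access and seek+read fits a sequential scan.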
-
getScanner
public HFileScanner getScanner(org.apache.hadoop.conf.Configuration conf, boolean cacheBlocks, boolean pread, boolean isCompaction)
Create a Scanner on this file. No seeks or reads are done on creation. Call HFileScanner.seekTo(ExtendedCell) to position and start the read. There is nothing to clean up in a Scanner. Letting go of your references to the scanner is sufficient.
- Specified by:
getScanner in interface HFile.Reader
- Parameters:
conf - Store configuration.
cacheBlocks - True if we should cache blocks read in by this scanner.
pread - Use positional read rather than seek+read if true (pread is better for random reads, seek+read is better for scanning).
isCompaction - is scanner being used for a compaction?
- Returns:
Scanner on this file.
-
getMajorVersion
-
unbufferStream
Description copied from interface: HFile.Reader
Closes the stream's socket. Note: this can be called concurrently from multiple threads, so implementations must take care of thread safety.
- Specified by:
unbufferStream in interface HFile.Reader
-