Package org.apache.hadoop.hbase.io.hfile
Class HFileBlock.FSReaderImpl
java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFileBlock.FSReaderImpl
- All Implemented Interfaces:
HFileBlock.FSReader
- Enclosing class:
- HFileBlock
Reads version 2 HFile blocks from the filesystem.
-
Field Summary
Modifier and Type, Field, Description
private final ByteBuffAllocator allocator
private final HFileBlockDefaultDecodingContext defaultDecodingCtx
- Default context used when BlockType != BlockType.ENCODED_DATA.
private HFileBlockDecodingContext encodedBlockDecodingCtx
private HFileContext fileContext
private long fileSize
- The size of the file we are reading from, or -1 if unknown.
static final String FS_READER_WARN_TIME_MS
- If reading a block takes longer than this threshold, in milliseconds, a warning is logged.
protected final int hdrSize
- The size of the header.
private HFileSystem hfs
- The filesystem used to access data.
private final boolean isPreadAllBytes
private String pathName
prefetchedHeader
- Cache of the NEXT header after this.
private final long readWarnTime
private final Lock streamLock
private FSDataInputStreamWrapper streamWrapper
- The file system stream of the underlying HFile that does or doesn't do checksum validations in the filesystem.
-
Constructor Summary
Constructor, Description
FSReaderImpl(ReaderContext readerContext, HFileContext fileContext, ByteBuffAllocator allocator, org.apache.hadoop.conf.Configuration conf)
-
Method Summary
Modifier and Type, Method, Description
private ByteBuff allocate(int size, boolean intoHeap)
blockRange(long startOffset, long endOffset)
- Creates a block iterator over the given portion of the HFile.
private void cacheNextBlockHeader(long offset, ByteBuff onDiskBlock, int onDiskSizeWithHeader, int headerLength)
- Save away the next block's header in atomic reference.
private boolean checkCallerProvidedOnDiskSizeWithHeader(long value)
- Check that value provided by the calling context seems reasonable, within a large margin of error.
private boolean checkOnDiskSizeWithHeader(int value)
- Check that value read from a block header seems reasonable, within a large margin of error.
void closeStreams()
- Closes the backing streams.
getBlockDecodingContext()
- Get a decoder for BlockType.ENCODED_DATA blocks from this file.
private ByteBuff getCachedHeader(long offset)
- Check atomic reference cache for this block's header.
getDefaultBlockDecodingContext()
- Get the default decoder for blocks from this file.
private int getNextBlockOnDiskSize(ByteBuff onDiskBlock, int onDiskSizeWithHeader)
private void invalidateNextBlockHeader()
- Clear the cached value when its integrity is suspect.
protected boolean readAtOffset(org.apache.hadoop.fs.FSDataInputStream istream, ByteBuff dest, int size, boolean peekIntoNextBlock, long fileOffset, boolean pread)
- Does a positional read or a seek and read into the given byte buffer.
readBlockData(long offset, long onDiskSizeWithHeaderL, boolean pread, boolean updateMetrics, boolean intoHeap)
- Reads a version 2 block (version 1 blocks not supported and not expected).
protected HFileBlock readBlockDataInternal(org.apache.hadoop.fs.FSDataInputStream is, long offset, long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics, boolean intoHeap)
- Reads a version 2 block.
void setDataBlockEncoder(HFileDataBlockEncoder encoder, org.apache.hadoop.conf.Configuration conf)
void setIncludesMemStoreTS(boolean includesMemstoreTS)
toString()
void unbufferStream()
- To close the stream's socket.
private boolean validateChecksum(long offset, ByteBuff data, int hdrSize)
- Generates the checksum for the header as well as the data and then validates it.
-
Field Details
-
streamWrapper
The file system stream of the underlying HFile that does or doesn't do checksum validations in the filesystem.
-
encodedBlockDecodingCtx
-
defaultDecodingCtx
Default context used when BlockType != BlockType.ENCODED_DATA.
-
prefetchedHeader
Cache of the NEXT header after this. Check that it is indeed the next block's header before using it. TODO: Review. This over-read into the next block to fetch the next block's header seems unnecessary given we usually get the block size from the hfile index. Review!
-
fileSize
The size of the file we are reading from, or -1 if unknown.
-
hdrSize
The size of the header.
-
hfs
The filesystem used to access data.
-
fileContext
-
pathName
-
allocator
-
streamLock
-
isPreadAllBytes
-
readWarnTime
-
FS_READER_WARN_TIME_MS
If reading a block takes longer than this threshold, in milliseconds, a warning is logged.
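The threshold check described for FS_READER_WARN_TIME_MS can be illustrated with a minimal sketch. The class name, the 1000 ms value, and the helper method are assumptions made for illustration; they are not taken from the HBase source.

```java
final class SlowReadWarningSketch {
  /** Hypothetical threshold; a real reader would take it from configuration. */
  static final long WARN_TIME_MS = 1000L;

  /** True when the elapsed time is strictly more than the threshold. */
  static boolean exceedsThreshold(long startNanos, long endNanos, long warnTimeMs) {
    long elapsedMs = (endNanos - startNanos) / 1_000_000L;
    return elapsedMs > warnTimeMs;
  }

  /** Times an action and reports whether a warning would be logged for it. */
  static boolean timeAndCheck(Runnable read, long warnTimeMs) {
    long start = System.nanoTime();
    read.run(); // stands in for the block read
    return exceedsThreshold(start, System.nanoTime(), warnTimeMs);
  }
}
```

A caller would wrap the block read in timeAndCheck and log a warning when it returns true.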
-
Constructor Details
-
FSReaderImpl
FSReaderImpl(ReaderContext readerContext, HFileContext fileContext, ByteBuffAllocator allocator, org.apache.hadoop.conf.Configuration conf) throws IOException
- Throws:
IOException
-
-
Method Details
-
blockRange
Description copied from interface: HFileBlock.FSReader
Creates a block iterator over the given portion of the HFile. The iterator returns blocks starting with offset such that offset <= startOffset < endOffset. Returned blocks are always unpacked. Used when no hfile index is available; e.g. reading in the hfile index blocks themselves on file open.
- Specified by:
blockRange in interface HFileBlock.FSReader
- Parameters:
startOffset - the offset of the block to start iteration with
endOffset - the offset to end iteration at (exclusive)
- Returns:
an iterator of blocks between the two given offsets
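As a rough illustration of iterating blocks between two offsets, the sketch below walks a simulated file layout. The Block class and the offset-to-size map are hypothetical stand-ins; the real iterator reads block headers from the file rather than from a map.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

final class BlockRangeSketch {
  /** Hypothetical stand-in for a block: its file offset and on-disk size. */
  static final class Block {
    final long offset;
    final int onDiskSize;

    Block(long offset, int onDiskSize) {
      this.offset = offset;
      this.onDiskSize = onDiskSize;
    }
  }

  /** Iterates blocks whose offsets fall in [startOffset, endOffset). */
  static Iterator<Block> blockRange(final Map<Long, Integer> sizeByOffset,
      final long startOffset, final long endOffset) {
    return new Iterator<Block>() {
      private long offset = startOffset;

      @Override
      public boolean hasNext() {
        return offset < endOffset && sizeByOffset.containsKey(offset);
      }

      @Override
      public Block next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        Block block = new Block(offset, sizeByOffset.get(offset));
        offset += block.onDiskSize; // the next block starts right after this one
        return block;
      }
    };
  }
}
```

Because endOffset is exclusive, a block that starts exactly at endOffset is not returned.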
-
readAtOffset
protected boolean readAtOffset(org.apache.hadoop.fs.FSDataInputStream istream, ByteBuff dest, int size, boolean peekIntoNextBlock, long fileOffset, boolean pread) throws IOException
Does a positional read or a seek and read into the given byte buffer. Take care to call ByteBuff.release() on every exit path to deallocate the ByteBuffers; otherwise a memory leak may occur.
- Parameters:
dest - destination buffer
size - size of read
peekIntoNextBlock - whether to read the next block's on-disk size
fileOffset - position in the stream to read at
pread - whether we should do a positional read
istream - the input source of data
- Returns:
true if the destination buffer includes the next block's header; false if it holds only the current block's data without the next block's header.
- Throws:
IOException - if any IO error happens.
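The difference between a positional read and a seek-then-read, and the meaning of the boolean return value, can be sketched with the JDK's FileChannel. This is an analogy under stated assumptions: it uses java.nio instead of FSDataInputStream and ByteBuff, and HEADER_SIZE is a hypothetical constant.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class ReadAtOffsetSketch {
  /** Hypothetical fixed header size, in bytes. */
  static final int HEADER_SIZE = 33;

  /** Returns true only if the next block's header was read in full as well. */
  static boolean readAtOffset(FileChannel channel, ByteBuffer dest, int size,
      boolean peekIntoNextBlock, long fileOffset, boolean pread) throws IOException {
    int wanted = peekIntoNextBlock ? size + HEADER_SIZE : size;
    dest.clear();
    dest.limit(Math.min(wanted, dest.capacity()));
    if (!pread) {
      channel.position(fileOffset); // seek once, then read sequentially
    }
    int total = 0;
    while (dest.hasRemaining()) {
      // A positional read leaves the channel's own position untouched.
      int n = pread ? channel.read(dest, fileOffset + total) : channel.read(dest);
      if (n < 0) {
        break; // end of file
      }
      total += n;
    }
    if (total < size) {
      throw new IOException("Premature EOF: read " + total + " of " + size + " bytes");
    }
    dest.flip();
    return peekIntoNextBlock && total == size + HEADER_SIZE;
  }
}
```

The sketch treats a short read of the extra header bytes as acceptable and simply reports false, while a short read of the block itself is an error.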
-
readBlockData
public HFileBlock readBlockData(long offset, long onDiskSizeWithHeaderL, boolean pread, boolean updateMetrics, boolean intoHeap) throws IOException
Reads a version 2 block (version 1 blocks not supported and not expected). Tries to do as little memory allocation as possible, using the provided on-disk size.
- Specified by:
readBlockData in interface HFileBlock.FSReader
- Parameters:
offset - the offset in the stream to read at
onDiskSizeWithHeaderL - the on-disk size of the block, including the header, or -1 if unknown; i.e. when iterating over blocks reading in the file metadata info.
pread - whether to use a positional read
updateMetrics - whether to update the metrics
intoHeap - whether to allocate the block's ByteBuff on heap rather than off-heap
- Returns:
the newly read block
- Throws:
IOException
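The intoHeap flag chooses between heap and off-heap memory for the block's buffer. A minimal sketch of that choice using plain java.nio.ByteBuffer follows; the real class goes through ByteBuffAllocator, so the direct-buffer call here is only an assumed stand-in for off-heap allocation.

```java
import java.nio.ByteBuffer;

final class AllocateSketch {
  /** Allocates on the Java heap when intoHeap is true, otherwise off-heap. */
  static ByteBuffer allocate(int size, boolean intoHeap) {
    return intoHeap ? ByteBuffer.allocate(size) : ByteBuffer.allocateDirect(size);
  }
}
```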
checkOnDiskSizeWithHeader
Check that value read from a block header seems reasonable, within a large margin of error.
- Returns:
true if the value is safe to proceed, false otherwise.
-
checkCallerProvidedOnDiskSizeWithHeader
Check that value provided by the calling context seems reasonable, within a large margin of error.
- Returns:
true if the value is safe to proceed, false otherwise.
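The two checks above might look roughly like the sketch below. The exact rules are assumptions: this version rejects sizes smaller than a header, rejects values that do not fit in an int, and accepts -1 from the caller as "size unknown".

```java
final class SizeCheckSketch {
  /** A size read from a header must cover at least the header itself. */
  static boolean checkHeaderValue(int value, int headerSize) {
    return value >= 0 && value >= headerSize;
  }

  /** A caller-provided size may be -1 (unknown); otherwise it must fit in an int. */
  static boolean checkCallerValue(long value, int headerSize) {
    if (value == -1L) {
      return true; // the caller does not know the size
    }
    if (value != (int) value) {
      return false; // does not fit in an int
    }
    return checkHeaderValue((int) value, headerSize);
  }
}
```

Both methods only filter out clearly implausible values; they do not prove that a size is correct.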
-
getCachedHeader
Check atomic reference cache for this block's header. The cache is only good if the next read coming through is for the next block in sequence. We read the next block's header on the tail of reading the previous block to save a seek. Otherwise, we have to do a seek to read the header before we can pull in the block, OR we have to back up the stream because we over-read (the next block's header).
- Returns:
The cached block header, or null if not found.
cacheNextBlockHeader
private void cacheNextBlockHeader(long offset, ByteBuff onDiskBlock, int onDiskSizeWithHeader, int headerLength)
Save away the next block's header in atomic reference.
-
invalidateNextBlockHeader
Clear the cached value when its integrity is suspect.
-
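The header cache described by getCachedHeader, cacheNextBlockHeader, and invalidateNextBlockHeader can be sketched with an AtomicReference that pairs a header with the offset it belongs to. The Cached holder and the use of byte[] instead of ByteBuff are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

final class PrefetchedHeaderSketch {
  /** Hypothetical holder pairing a header with the file offset it belongs to. */
  private static final class Cached {
    final long offset;
    final byte[] header;

    Cached(long offset, byte[] header) {
      this.offset = offset;
      this.header = header;
    }
  }

  private final AtomicReference<Cached> cache = new AtomicReference<>();

  /** Saves the header bytes that were over-read past the end of the current block. */
  void cacheNextBlockHeader(long nextOffset, byte[] onDiskBlock,
      int onDiskSizeWithHeader, int headerLength) {
    byte[] header = Arrays.copyOfRange(onDiskBlock, onDiskSizeWithHeader,
        onDiskSizeWithHeader + headerLength);
    cache.set(new Cached(nextOffset, header));
  }

  /** Returns the cached header only if it was saved for exactly this offset. */
  byte[] getCachedHeader(long offset) {
    Cached cached = cache.get();
    return (cached != null && cached.offset == offset) ? cached.header : null;
  }

  /** Drops the cached header, for example when its integrity is suspect. */
  void invalidateNextBlockHeader() {
    cache.set(null);
  }
}
```

Keying the cached header by offset is what lets a reader confirm that it really is the next block's header before using it.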
getNextBlockOnDiskSize
-
allocate
-
readBlockDataInternal
protected HFileBlock readBlockDataInternal(org.apache.hadoop.fs.FSDataInputStream is, long offset, long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics, boolean intoHeap) throws IOException
Reads a version 2 block.
- Parameters:
offset - the offset in the stream to read at.
onDiskSizeWithHeaderL - the on-disk size of the block, including the header and checksums if present, or -1 if unknown (as a long). Can be -1 if we are doing raw iteration of blocks, as when loading up file metadata; i.e. the first read of a new file. Usually known, having been obtained from the file index.
pread - whether to use a positional read
verifyChecksum - whether to use HBase checksums. If HBase checksum is switched off, then use HDFS checksum. Can also be flipped on/off while reading the same file if we hit a troublesome patch in an hfile.
updateMetrics - whether to update the metrics.
intoHeap - whether to allocate the block's ByteBuff on heap rather than off-heap.
- Returns:
the HFileBlock, or null if there is an HBase checksum mismatch
- Throws:
IOException
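Since readBlockDataInternal returns null on an HBase checksum mismatch, a caller can retry with verifyChecksum switched off. The sketch below shows one way such a fallback could be structured; the InternalReader interface and the single retry are assumptions, not a description of what readBlockData actually does.

```java
import java.io.IOException;

final class ChecksumFallbackSketch {
  /** Hypothetical stand-in for the internal read; returns null on checksum mismatch. */
  interface InternalReader {
    byte[] read(long offset, boolean verifyChecksum) throws IOException;
  }

  /** Tries with HBase checksums first, then once more relying on filesystem checksums. */
  static byte[] readBlock(InternalReader reader, long offset) throws IOException {
    byte[] block = reader.read(offset, true);
    if (block == null) {
      // Mismatch: retry with HBase checksum verification turned off.
      block = reader.read(offset, false);
    }
    if (block == null) {
      throw new IOException("Unable to read block at offset " + offset);
    }
    return block;
  }
}
```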
-
setIncludesMemStoreTS
- Specified by:
setIncludesMemStoreTS in interface HFileBlock.FSReader
-
setDataBlockEncoder
public void setDataBlockEncoder(HFileDataBlockEncoder encoder, org.apache.hadoop.conf.Configuration conf)
- Specified by:
setDataBlockEncoder in interface HFileBlock.FSReader
-
getBlockDecodingContext
Description copied from interface: HFileBlock.FSReader
Get a decoder for BlockType.ENCODED_DATA blocks from this file.
- Specified by:
getBlockDecodingContext in interface HFileBlock.FSReader
-
getDefaultBlockDecodingContext
Description copied from interface: HFileBlock.FSReader
Get the default decoder for blocks from this file.
- Specified by:
getDefaultBlockDecodingContext in interface HFileBlock.FSReader
-
validateChecksum
Generates the checksum for the header as well as the data and then validates it. If the block doesn't use checksums, returns false.
- Returns:
true if the checksum matches, else false.
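To illustrate validating a checksum over header plus data, here is a simplified sketch using a single CRC32 stored in the 4 bytes after the payload. The layout and the choice of CRC32 are assumptions; the real block format may use a different checksum type and chunking.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

final class ChecksumSketch {
  /** Returns false when checksums are not in use, mirroring the description above. */
  static boolean validate(byte[] block, int payloadLength, boolean usesChecksum) {
    if (!usesChecksum || block.length < payloadLength + 4) {
      return false;
    }
    CRC32 crc = new CRC32();
    crc.update(block, 0, payloadLength); // header and data together
    int stored = ByteBuffer.wrap(block, payloadLength, 4).getInt();
    return (int) crc.getValue() == stored;
  }

  /** Hypothetical helper that appends the CRC32 of the payload to it. */
  static byte[] withChecksum(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + 4)
        .put(payload)
        .putInt((int) crc.getValue())
        .array();
  }
}
```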
-
closeStreams
Description copied from interface: HFileBlock.FSReader
Closes the backing streams.
- Specified by:
closeStreams in interface HFileBlock.FSReader
- Throws:
IOException
-
unbufferStream
Description copied from interface: HFileBlock.FSReader
To close the stream's socket. Note: This can be concurrently called from multiple threads and implementation should take care of thread safety.
- Specified by:
unbufferStream in interface HFileBlock.FSReader
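One way to meet the thread-safety note above is to guard the stream with a lock and skip the unbuffer call when another thread holds it. The sketch below shows that pattern with a ReentrantLock; the counter stands in for the actual unbuffer call, and whether the real implementation works this way is an assumption.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class UnbufferSketch {
  private final Lock streamLock = new ReentrantLock();
  private int unbufferCount; // guarded by streamLock

  /** Returns true if the stream was unbuffered, false if it was busy and skipped. */
  boolean unbufferStream() {
    if (!streamLock.tryLock()) {
      return false; // another thread is using the stream; do not wait
    }
    try {
      unbufferCount++; // stands in for releasing the stream's socket
      return true;
    } finally {
      streamLock.unlock();
    }
  }

  /** Reads the counter under the same lock. */
  int unbufferCount() {
    streamLock.lock();
    try {
      return unbufferCount;
    } finally {
      streamLock.unlock();
    }
  }
}
```

Using tryLock instead of lock means a concurrent caller never blocks behind an in-progress read; it simply skips the cleanup.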
-
toString
-