java.lang.Object

org.apache.hadoop.hbase.io.hfile.HFileBlock.FSReaderImpl

All Implemented Interfaces: HFileBlock.FSReader

Enclosing class: HFileBlock

static class HFileBlock.FSReaderImpl extends Object implements HFileBlock.FSReader

Reads version 2 HFile blocks from the filesystem.

Field Summary

Fields

Modifier and Type

Field

Description

private final ByteBuffAllocator

allocator

private final HFileBlockDefaultDecodingContext

defaultDecodingCtx

Default context used when BlockType != BlockType.ENCODED_DATA.

private HFileBlockDecodingContext

encodedBlockDecodingCtx

private HFileContext

fileContext

private long

fileSize

The size of the file we are reading from, or -1 if unknown.

static final String

FS_READER_WARN_TIME_MS

If reading a block takes more than this threshold, in milliseconds, a warning is logged.

protected final int

hdrSize

The size of the header

private HFileSystem

hfs

The filesystem used to access data

private final boolean

isPreadAllBytes

private String

pathName

private AtomicReference<HFileBlock.PrefetchedHeader>

prefetchedHeader

Cache of the NEXT header after this.

private final long

readWarnTime

private final Lock

streamLock

private FSDataInputStreamWrapper

streamWrapper

The file system stream of the underlying HFile, which may or may not perform checksum validation in the filesystem.

Constructor Summary

Constructors

Constructor

Description

FSReaderImpl(ReaderContext readerContext, HFileContext fileContext, ByteBuffAllocator allocator, org.apache.hadoop.conf.Configuration conf)

Method Summary

Modifier and Type

Method

Description

private ByteBuff

allocate(int size, boolean intoHeap)

HFileBlock.BlockIterator

blockRange(long startOffset, long endOffset)

Creates a block iterator over the given portion of the HFile.

private void

cacheNextBlockHeader(long offset, ByteBuff onDiskBlock, int onDiskSizeWithHeader, int headerLength)

Saves the next block's header in the atomic reference.

private boolean

checkCallerProvidedOnDiskSizeWithHeader(long value)

Check that value provided by the calling context seems reasonable, within a large margin of error.

private boolean

checkCheckSumTypeOnHeaderBuf(ByteBuff headerBuf)

Check that checksumType on headerBuf read from a block header seems reasonable, within the known value range.

private boolean

checkOnDiskSizeWithHeader(int value)

Check that value read from a block header seems reasonable, within a large margin of error.

void

closeStreams()

Closes the backing streams.

HFileBlockDecodingContext

getBlockDecodingContext()

Get a decoder for BlockType.ENCODED_DATA blocks from this file.

private ByteBuff

getCachedHeader(long offset)

Check atomic reference cache for this block's header.

HFileBlockDecodingContext

getDefaultBlockDecodingContext()

Get the default decoder for blocks from this file.

private int

getNextBlockOnDiskSize(ByteBuff onDiskBlock, int onDiskSizeWithHeader)

private void

invalidateNextBlockHeader()

Clear the cached value when its integrity is suspect.

protected boolean

readAtOffset(org.apache.hadoop.fs.FSDataInputStream istream, ByteBuff dest, int size, boolean peekIntoNextBlock, long fileOffset, boolean pread)

Does a positional read or a seek and read into the given byte buffer.

HFileBlock

readBlockData(long offset, long onDiskSizeWithHeaderL, boolean pread, boolean updateMetrics, boolean intoHeap)

Reads a version 2 block (version 1 blocks not supported and not expected).

protected HFileBlock

readBlockDataInternal(org.apache.hadoop.fs.FSDataInputStream is, long offset, long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics, boolean intoHeap)

Reads a version 2 block.

void

setDataBlockEncoder(HFileDataBlockEncoder encoder, org.apache.hadoop.conf.Configuration conf)

void

setIncludesMemStoreTS(boolean includesMemstoreTS)

String

toString()

void

unbufferStream()

To close the stream's socket.

private boolean

validateChecksum(long offset, ByteBuff data, int hdrSize)

Generates the checksum for the header as well as the data and then validates it.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- streamWrapper
  
  private FSDataInputStreamWrapper streamWrapper
  
  The file system stream of the underlying HFile, which may or may not perform checksum validation in the filesystem.
- encodedBlockDecodingCtx
  
  private HFileBlockDecodingContext encodedBlockDecodingCtx
- defaultDecodingCtx
  
  private final HFileBlockDefaultDecodingContext defaultDecodingCtx
  
  Default context used when BlockType != BlockType.ENCODED_DATA.
- prefetchedHeader
  
  private AtomicReference<HFileBlock.PrefetchedHeader> prefetchedHeader
  
  Cache of the NEXT header after this. Check that it is indeed the next block's header before using it. TODO: Review. This over-read into the next block to fetch its header seems unnecessary, given that we usually get the block size from the hfile index.
- fileSize
  
  private long fileSize
  
  The size of the file we are reading from, or -1 if unknown.
- hdrSize
  
  protected final int hdrSize
  
  The size of the header
- hfs
  
  private HFileSystem hfs
  
  The filesystem used to access data
- fileContext
  
  private HFileContext fileContext
- pathName
  
  private String pathName
- allocator
  
  private final ByteBuffAllocator allocator
- streamLock
  
  private final Lock streamLock
- isPreadAllBytes
  
  private final boolean isPreadAllBytes
- readWarnTime
  
  private final long readWarnTime
- FS_READER_WARN_TIME_MS
  
  public static final String FS_READER_WARN_TIME_MS
  
  If reading a block takes more than this threshold, in milliseconds, a warning is logged.
  
  See Also:
  
  Constant Field Values
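The warning threshold described above can be pictured with a small sketch. This is not the HBase implementation: `SlowReadCheck`, `isSlow`, and the 100 ms value are hypothetical names and numbers chosen for illustration, and the comparison is assumed to be strict because the description says "more than the threshold".

```java
/** Hypothetical helper, not part of HBase: flags reads slower than a threshold. */
final class SlowReadCheck {
  private final long warnTimeMs;

  SlowReadCheck(long warnTimeMs) {
    this.warnTimeMs = warnTimeMs;
  }

  /** Returns true when the elapsed time strictly exceeds the threshold. */
  boolean isSlow(long startNanos, long endNanos) {
    long elapsedMs = (endNanos - startNanos) / 1_000_000L;
    return elapsedMs > warnTimeMs;
  }

  public static void main(String[] args) {
    SlowReadCheck check = new SlowReadCheck(100L); // assumed threshold, in ms
    long start = System.nanoTime();
    // A block read would happen here.
    long end = System.nanoTime();
    if (check.isSlow(start, end)) {
      System.err.println("Block read exceeded the warning threshold");
    }
  }
}
```

In the real class the threshold would presumably be read from the configuration under the key named by FS_READER_WARN_TIME_MS and kept in readWarnTime.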
Constructor Details
- FSReaderImpl
  
  FSReaderImpl(ReaderContext readerContext, HFileContext fileContext, ByteBuffAllocator allocator, org.apache.hadoop.conf.Configuration conf) throws IOException
  
  Throws:
  
  IOException
Method Details
- blockRange
  
  public HFileBlock.BlockIterator blockRange(long startOffset, long endOffset)
  
  Description copied from interface: HFileBlock.FSReader
  
  Creates a block iterator over the given portion of the HFile. The iterator returns blocks starting with offset such that offset <= startOffset < endOffset. Returned blocks are always unpacked. Used when no hfile index is available, e.g. when reading in the hfile index blocks themselves on file open.
  
  Specified by:
  
  blockRange in interface HFileBlock.FSReader
  
  Parameters:
  
  startOffset - the offset of the block to start iteration with
  
  endOffset - the offset to end iteration at (exclusive)
  
  Returns:
  
  an iterator of blocks between the two given offsets
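To make the iteration over a range of offsets concrete, here is a minimal sketch. It models a file as a map from block start offset to on-disk size; `BlockRangeSketch` and `blockOffsets` are hypothetical names, and the real iterator returns unpacked HFileBlock instances rather than offsets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model: a block is only a start offset plus an on-disk size. */
final class BlockRangeSketch {
  /** Walks blocks from startOffset up to, but not including, endOffset. */
  static List<Long> blockOffsets(Map<Long, Integer> onDiskSizeByOffset,
      long startOffset, long endOffset) {
    List<Long> visited = new ArrayList<>();
    long offset = startOffset;
    while (offset < endOffset) {
      Integer size = onDiskSizeByOffset.get(offset);
      if (size == null || size <= 0) {
        throw new IllegalStateException("No valid block starts at offset " + offset);
      }
      visited.add(offset);
      // Each block's size tells us where the next block begins.
      offset += size;
    }
    return visited;
  }

  public static void main(String[] args) {
    Map<Long, Integer> sizes = new HashMap<>();
    sizes.put(0L, 10);
    sizes.put(10L, 20);
    sizes.put(30L, 5);
    System.out.println(blockOffsets(sizes, 0L, 30L));
  }
}
```

The point of the sketch is that no index is needed: the size stored with each block is enough to find the next one.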
- readAtOffset
  
  protected boolean readAtOffset(org.apache.hadoop.fs.FSDataInputStream istream, ByteBuff dest, int size, boolean peekIntoNextBlock, long fileOffset, boolean pread) throws IOException
  
  Does a positional read or a seek and read into the given byte buffer. Callers must make sure ByteBuff.release() is called on every exit path to deallocate the ByteBuffers; otherwise a memory leak may occur.
  
  Parameters:
  
  dest - destination buffer
  
  size - size of read
  
  peekIntoNextBlock - whether to read the next block's on-disk size
  
  fileOffset - position in the stream to read at
  
  pread - whether we should do a positional read
  
  istream - The input source of data
  
  Returns:
  
  true if the destination buffer includes the next block's header; false if it includes only the current block's data.
  
  Throws:
  
  IOException - if any I/O error occurs.
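The difference between a positional read and a seek-and-read, and the meaning of the boolean result, can be sketched against an in-memory stand-in for the stream. Everything here is an assumption for illustration: `SeekableBytes`, `ReadAtOffsetSketch`, and the extra `headerSize` parameter are hypothetical, and the real method works on an FSDataInputStream and a ByteBuff.

```java
import java.nio.ByteBuffer;

/** Hypothetical in-memory stand-in for a seekable input stream. */
final class SeekableBytes {
  private final byte[] data;
  private int position; // cursor, moved only by seekAndRead

  SeekableBytes(byte[] data) {
    this.data = data;
  }

  int position() {
    return position;
  }

  /** Positional read: copies up to len bytes from offset and leaves the cursor alone. */
  int pread(int offset, ByteBuffer dest, int len) {
    int n = Math.max(0, Math.min(len, data.length - offset));
    if (n > 0) {
      dest.put(data, offset, n);
    }
    return n;
  }

  /** Seek-and-read: moves the cursor to offset, then past the bytes read. */
  int seekAndRead(int offset, ByteBuffer dest, int len) {
    position = offset;
    int n = pread(offset, dest, len);
    position += n;
    return n;
  }
}

final class ReadAtOffsetSketch {
  /** Returns true only if dest also holds the next block's header. dest must be large enough. */
  static boolean readAtOffset(SeekableBytes in, ByteBuffer dest, int size, int headerSize,
      boolean peekIntoNextBlock, int fileOffset, boolean pread) {
    dest.clear();
    int wanted = peekIntoNextBlock ? size + headerSize : size;
    int n = pread ? in.pread(fileOffset, dest, wanted) : in.seekAndRead(fileOffset, dest, wanted);
    if (n < size) {
      throw new IllegalStateException("Short read: wanted " + size + " bytes, got " + n);
    }
    boolean gotNextHeader = peekIntoNextBlock && n == wanted;
    if (!gotNextHeader) {
      dest.position(size); // drop any partial header bytes
    }
    dest.flip();
    return gotNextHeader;
  }

  public static void main(String[] args) {
    SeekableBytes in = new SeekableBytes(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
    ByteBuffer buf = ByteBuffer.allocate(16);
    System.out.println(readAtOffset(in, buf, 4, 2, true, 2, true));
  }
}
```

The current block's bytes are mandatory, while the next header is a best-effort extra; that is why a short peek yields false rather than an error in this sketch.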
- readBlockData
  
  public HFileBlock readBlockData(long offset, long onDiskSizeWithHeaderL, boolean pread, boolean updateMetrics, boolean intoHeap) throws IOException
  
  Reads a version 2 block (version 1 blocks are not supported and not expected). Tries to do as little memory allocation as possible, using the provided on-disk size.
  
  Specified by:
  
  readBlockData in interface HFileBlock.FSReader
  
  Parameters:
  
  offset - the offset in the stream to read at
  
  onDiskSizeWithHeaderL - the on-disk size of the block, including the header, or -1 if unknown; i.e. when iterating over blocks reading in the file metadata info.
  
  pread - whether to use a positional read
  
  updateMetrics - whether to update the metrics
  
  intoHeap - whether to allocate the block's ByteBuff from the heap rather than off-heap.
  
  Returns:
  
  the newly read block
  
  Throws:
  
  IOException
  
  See Also:
  
  HFileBlock.FSReader.readBlockData, for more details about the intoHeap parameter.
- checkCheckSumTypeOnHeaderBuf
  
  private boolean checkCheckSumTypeOnHeaderBuf(ByteBuff headerBuf)
  
  Check that checksumType on headerBuf read from a block header seems reasonable, within the known value range.
  
  Returns:
  
  true if the headerBuf is safe to proceed, false otherwise.
- checkOnDiskSizeWithHeader
  
  private boolean checkOnDiskSizeWithHeader(int value)
  
  Check that value read from a block header seems reasonable, within a large margin of error.
  
  Returns:
  
  true if the value is safe to proceed, false otherwise.
- checkCallerProvidedOnDiskSizeWithHeader
  
  private boolean checkCallerProvidedOnDiskSizeWithHeader(long value)
  
  Check that value provided by the calling context seems reasonable, within a large margin of error.
  
  Returns:
  
  true if the value is safe to proceed, false otherwise.
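The two size checks above can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual logic: it assumes a header-derived size is reasonable when it is at least the header size, and that a caller-provided size may also be -1 (unknown) but must otherwise fit in an int. `SizeSanitySketch` and its method names are hypothetical.

```java
/** Hypothetical sanity checks, loosely modelled on the descriptions above. */
final class SizeSanitySketch {
  /** A size read from a header must cover at least the header itself. */
  static boolean headerValueLooksReasonable(int value, int headerSize) {
    return value >= headerSize; // also rejects negatives when headerSize >= 0
  }

  /** A caller-provided size may be -1 (unknown); otherwise it must fit in an int. */
  static boolean callerValueLooksReasonable(long value, int headerSize) {
    if (value == -1L) {
      return true;
    }
    if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
      return false;
    }
    return headerValueLooksReasonable((int) value, headerSize);
  }

  public static void main(String[] args) {
    int assumedHeaderSize = 33; // illustrative value only
    System.out.println(callerValueLooksReasonable(4096L, assumedHeaderSize));
  }
}
```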
- getCachedHeader
  
  private ByteBuff getCachedHeader(long offset)
  
  Checks the atomic reference cache for this block's header. The cache is only useful if the next read coming through is for the next block in sequence. We read the next block's header on the tail of reading the previous block to save a seek. Otherwise, we have to seek to read the header before we can pull in the block, or we have to back up the stream because we over-read (into the next block's header).
  
  Returns:
  
  The cached block header or null if not found.
  
  See Also:
  
  HFileBlock.PrefetchedHeader
  
  cacheNextBlockHeader(long, ByteBuff, int, int)
- cacheNextBlockHeader
  
  private void cacheNextBlockHeader(long offset, ByteBuff onDiskBlock, int onDiskSizeWithHeader, int headerLength)
  
  Saves the next block's header in the atomic reference.
  
  See Also:
  
  getCachedHeader(long)
  
  HFileBlock.PrefetchedHeader
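The interplay of cacheNextBlockHeader, getCachedHeader, and invalidateNextBlockHeader can be sketched with a plain AtomicReference. This is an illustration only: `HeaderCacheSketch` and its nested `Prefetched` class are hypothetical, and byte arrays stand in for ByteBuff.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical single-entry cache for the header of the next block. */
final class HeaderCacheSketch {
  /** Immutable pair, so a reader never sees an offset with another block's bytes. */
  private static final class Prefetched {
    final long offset;
    final byte[] header;

    Prefetched(long offset, byte[] header) {
      this.offset = offset;
      this.header = header;
    }
  }

  private final AtomicReference<Prefetched> cache = new AtomicReference<>();

  /** Remembers the header that was over-read at the tail of the previous block. */
  void cacheNextBlockHeader(long nextBlockOffset, byte[] header) {
    cache.set(new Prefetched(nextBlockOffset, header.clone()));
  }

  /** Returns the cached header only if it belongs to the block at this offset. */
  byte[] getCachedHeader(long offset) {
    Prefetched p = cache.get();
    return (p != null && p.offset == offset) ? p.header.clone() : null;
  }

  /** Drops the cached value, for example when its integrity is suspect. */
  void invalidate() {
    cache.set(null);
  }

  public static void main(String[] args) {
    HeaderCacheSketch c = new HeaderCacheSketch();
    c.cacheNextBlockHeader(128L, new byte[] {9, 8});
    System.out.println(c.getCachedHeader(128L) != null);
  }
}
```

Swapping one immutable object keeps offset and bytes consistent without a lock, which is the usual reason to reach for an AtomicReference here.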
- invalidateNextBlockHeader
  
  private void invalidateNextBlockHeader()
  
  Clear the cached value when its integrity is suspect.
- getNextBlockOnDiskSize
  
  private int getNextBlockOnDiskSize(ByteBuff onDiskBlock, int onDiskSizeWithHeader)
- allocate
  
  private ByteBuff allocate(int size, boolean intoHeap)
- readBlockDataInternal
  
  protected HFileBlock readBlockDataInternal(org.apache.hadoop.fs.FSDataInputStream is, long offset, long onDiskSizeWithHeaderL, boolean pread, boolean verifyChecksum, boolean updateMetrics, boolean intoHeap) throws IOException
  
  Reads a version 2 block.
  
  Parameters:
  
  offset - the offset in the stream to read at.
  
  onDiskSizeWithHeaderL - the on-disk size of the block, including the header and checksums if present, or -1 if unknown (as a long). Can be -1 when doing raw iteration of blocks, as when loading file metadata, i.e. the first read of a new file. Usually a known value obtained from the file index.
  
  pread - whether to use a positional read
  
  verifyChecksum - whether to use HBase checksums. If HBase checksums are switched off, HDFS checksums are used instead. This can also be flipped on and off while reading the same file if we hit a troublesome patch in an hfile.
  
  updateMetrics - whether to update the metrics.
  
  intoHeap - whether to allocate the block's ByteBuff from the heap rather than off-heap.
  
  Returns:
  
  the HFileBlock, or null if there is an HBase checksum mismatch
  
  Throws:
  
  IOException
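Because a null result signals an HBase checksum mismatch, a caller can fall back to a read that relies on filesystem checksums instead. The sketch below shows that pattern in isolation. It is an assumption about how a caller might use the return value, not a description of readBlockData; `ChecksumFallbackSketch` and `readWithFallback` are hypothetical.

```java
import java.util.function.Function;

/** Hypothetical caller-side fallback around a read that may return null. */
final class ChecksumFallbackSketch {
  /**
   * The reader maps the verifyChecksum flag to a block, or to null when the
   * HBase-level checksum does not match.
   */
  static <B> B readWithFallback(Function<Boolean, B> reader) {
    B block = reader.apply(true);
    if (block == null) {
      // Retry without HBase checksum verification, leaving it to the filesystem.
      block = reader.apply(false);
    }
    return block;
  }

  public static void main(String[] args) {
    Function<Boolean, String> flaky = verify -> verify ? null : "block";
    System.out.println(readWithFallback(flaky));
  }
}
```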
- setIncludesMemStoreTS
  
  public void setIncludesMemStoreTS(boolean includesMemstoreTS)
  
  Specified by:
  
  setIncludesMemStoreTS in interface HFileBlock.FSReader
- setDataBlockEncoder
  
  public void setDataBlockEncoder(HFileDataBlockEncoder encoder, org.apache.hadoop.conf.Configuration conf)
  
  Specified by:
  
  setDataBlockEncoder in interface HFileBlock.FSReader
- getBlockDecodingContext
  
  public HFileBlockDecodingContext getBlockDecodingContext()
  
  Description copied from interface: HFileBlock.FSReader
  
  Get a decoder for BlockType.ENCODED_DATA blocks from this file.
  
  Specified by:
  
  getBlockDecodingContext in interface HFileBlock.FSReader
- getDefaultBlockDecodingContext
  
  public HFileBlockDecodingContext getDefaultBlockDecodingContext()
  
  Description copied from interface: HFileBlock.FSReader
  
  Get the default decoder for blocks from this file.
  
  Specified by:
  
  getDefaultBlockDecodingContext in interface HFileBlock.FSReader
- validateChecksum
  
  private boolean validateChecksum(long offset, ByteBuff data, int hdrSize)
  
  Generates the checksum for the header as well as the data and then validates it. If the block doesn't use checksums, returns false.
  
  Returns:
  
  True if checksum matches, else false.
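The contract above (checksum over header plus data, false when the block carries no checksum) can be sketched with the JDK's CRC32. This is a simplification: the checksum algorithm and layout actually used are determined by the block's checksum type, which this page does not detail. `ChecksumSketch` and its parameters are hypothetical.

```java
import java.util.zip.CRC32;

/** Hypothetical whole-block checksum check using the JDK's CRC32. */
final class ChecksumSketch {
  /** Computes a CRC32 over the header and data bytes together. */
  static long checksumOf(byte[] headerAndData) {
    CRC32 crc = new CRC32();
    crc.update(headerAndData, 0, headerAndData.length);
    return crc.getValue();
  }

  /** Returns false if the block has no checksum, or if the checksum does not match. */
  static boolean validate(byte[] headerAndData, long storedChecksum, boolean blockHasChecksum) {
    if (!blockHasChecksum) {
      return false;
    }
    return checksumOf(headerAndData) == storedChecksum;
  }

  public static void main(String[] args) {
    byte[] block = {1, 2, 3};
    System.out.println(validate(block, checksumOf(block), true));
  }
}
```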
- closeStreams
  
  public void closeStreams() throws IOException
  
  Description copied from interface: HFileBlock.FSReader
  
  Closes the backing streams.
  
  Specified by:
  
  closeStreams in interface HFileBlock.FSReader
  
  Throws:
  
  IOException
- unbufferStream
  
  public void unbufferStream()
  
  Description copied from interface: HFileBlock.FSReader
  
  Closes the stream's socket. Note: this can be called concurrently from multiple threads, so implementations must take care of thread safety.
  
  Specified by:
  
  unbufferStream in interface HFileBlock.FSReader
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
