java.lang.Object

org.apache.hadoop.hbase.io.FSDataInputStreamWrapper

All Implemented Interfaces: Closeable, AutoCloseable

@Private public class FSDataInputStreamWrapper extends Object implements Closeable

Wrapper for input stream(s) that takes care of the interaction of FS and HBase checksums, as well as closing streams. Initialization is not thread-safe, but normal operation is; see method comments.

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

private static class

FSDataInputStreamWrapper.ReadStatistics
Field Summary

Fields

Modifier and Type

Field

Description

private final boolean

doCloseStreams

private final boolean

dropBehind

private AtomicInteger

hbaseChecksumOffCount

private final HFileSystem

hfs

private final FileLink

link

private final org.apache.hadoop.fs.Path

path

private final long

readahead

protected org.apache.hadoop.fs.Path

readerPath

private static final FSDataInputStreamWrapper.ReadStatistics

readStatistics

private org.apache.hadoop.fs.FSDataInputStream

stream

Two stream handles, one with and one without FS-level checksum.

private org.apache.hadoop.fs.FSDataInputStream

streamNoFsChecksum

private final Object

streamNoFsChecksumFirstCreateLock

private boolean

useHBaseChecksum

private boolean

useHBaseChecksumConfigured
Constructor Summary

Constructors

Modifier

Constructor

Description

FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)

FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean dropBehind, long readahead)

FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, FileLink link, boolean dropBehind, long readahead)

private

FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, FileLink link, org.apache.hadoop.fs.Path path, boolean dropBehind, long readahead)

FSDataInputStreamWrapper(org.apache.hadoop.fs.FSDataInputStream fsdis)

For use in tests.

FSDataInputStreamWrapper(org.apache.hadoop.fs.FSDataInputStream fsdis, org.apache.hadoop.fs.FSDataInputStream noChecksum)

For use in tests.
Method Summary

Modifier and Type

Method

Description

void

checksumOk()

Reports that a checksum was OK, so the wrapper may consider going back to HBase checksum.

void

close()

Closes stream(s) if necessary.

org.apache.hadoop.fs.FSDataInputStream

fallbackToFsChecksum(int offCount)

Read from non-checksum stream failed, fall back to FS checksum.

HFileSystem

getHfs()

static long

getLocalBytesRead()

org.apache.hadoop.fs.Path

getReaderPath()

static long

getShortCircuitBytesRead()

org.apache.hadoop.fs.FSDataInputStream

getStream(boolean useHBaseChecksum)

Get the stream to use.

static long

getTotalBytesRead()

static long

getZeroCopyBytesRead()

void

prepareForBlockReader(boolean forceNoHBaseChecksum)

Prepares the streams for block reader.

(package private) void

setShouldUseHBaseChecksum()

private void

setStreamOptions(org.apache.hadoop.fs.FSDataInputStream in)

boolean

shouldUseHBaseChecksum()

Returns whether we are presently using HBase checksum.

void

unbuffer()

This will free sockets and file descriptors held by the stream only when the stream implements org.apache.hadoop.fs.CanUnbuffer.

private void

updateInputStreamStatistics(org.apache.hadoop.fs.FSDataInputStream stream)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- hfs
  
  private final HFileSystem hfs
- path
  
  private final org.apache.hadoop.fs.Path path
- link
  
  private final FileLink link
- doCloseStreams
  
  private final boolean doCloseStreams
- dropBehind
  
  private final boolean dropBehind
- readahead
  
  private final long readahead
- stream
  
  private volatile org.apache.hadoop.fs.FSDataInputStream stream
  
  Two stream handles, one with and one without FS-level checksum. HDFS checksum setting is on FS level, not single read level, so you have to keep two FS objects and two handles open to interleave different reads freely, which is very sad. This is what we do:
  
  1) First, we need to read the trailer of HFile to determine checksum parameters. We always use FS checksum to do that, so ctor opens stream.
  2.1) After that, if HBase checksum is not used, we'd just always use stream;
  2.2) If HBase checksum can be used, we'll open streamNoFsChecksum, and close stream. User MUST call prepareForBlockReader for that to happen; if they don't, (2.1) will be the default.
  3) The users can call shouldUseHBaseChecksum(), and pass its result to getStream(boolean) to get stream (if Java had out/pointer params we could return both in one call). This stream is guaranteed to be set.
  4) The first time HBase checksum fails, one would call fallbackToFsChecksum(int). That will take lock, and open stream. While this is going on, others will continue to use the old stream; if they also want to fall back, they'll also call fallbackToFsChecksum(int), and block until stream is set.
  5) After some number of checksumOk() calls, we will go back to using HBase checksum. We will have 2 handles; however we presume checksums fail so rarely that we don't care.
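  
  The five steps can be pictured as a small state machine. The sketch below is a simplified model, not the HBase source: the class name ChecksumModeModel is hypothetical, plain strings stand in for the two FSDataInputStream handles, and the exact point at which checksumOk() switches back is an assumption made for illustration.
  
  ```java
  import java.util.concurrent.atomic.AtomicInteger;

  /** Simplified, hypothetical model of the checksum-mode switching described above. */
  class ChecksumModeModel {
      // Strings stand in for the two stream handles.
      private volatile String stream = "fs-checksum";   // step 1: opened by the constructor
      private volatile String streamNoFsChecksum;       // step 2.2: opened on demand
      private volatile boolean useHBaseChecksum;
      private boolean useHBaseChecksumConfigured;
      private final AtomicInteger offCount = new AtomicInteger();
      private final Object lock = new Object();

      /** Step 2: call once, from one thread, after the trailer has been read. */
      void prepareForBlockReader(boolean forceNoHBaseChecksum) {
          if (forceNoHBaseChecksum) {
              return;                                   // step 2.1: keep using stream
          }
          streamNoFsChecksum = "no-fs-checksum";
          useHBaseChecksumConfigured = true;
          useHBaseChecksum = true;
          stream = null;                                // step 2.2: FS-checksum handle closed
      }

      boolean shouldUseHBaseChecksum() {
          return useHBaseChecksum;
      }

      /** Step 3: pass a value previously returned by shouldUseHBaseChecksum(). */
      String getStream(boolean useHBaseChecksum) {
          return useHBaseChecksum ? streamNoFsChecksum : stream;
      }

      /** Step 4: reopen the FS-checksum handle under a lock, then switch modes. */
      String fallbackToFsChecksum(int count) {
          synchronized (lock) {
              if (stream == null) {
                  stream = "fs-checksum";
              }
          }
          useHBaseChecksum = false;
          offCount.set(count);
          return stream;
      }

      /** Step 5: after enough successful reads, return to HBase checksum (threshold assumed). */
      void checksumOk() {
          if (useHBaseChecksumConfigured && !useHBaseChecksum
                  && offCount.decrementAndGet() <= 0) {
              useHBaseChecksum = true;
          }
      }
  }
  ```
  
  In this model both handles stay open after a fallback, which mirrors the remark in step 5 that two handles are tolerated because checksum failures are presumed rare.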
- streamNoFsChecksum
  
  private volatile org.apache.hadoop.fs.FSDataInputStream streamNoFsChecksum
- streamNoFsChecksumFirstCreateLock
  
  private final Object streamNoFsChecksumFirstCreateLock
- useHBaseChecksumConfigured
  
  private boolean useHBaseChecksumConfigured
- useHBaseChecksum
  
  private volatile boolean useHBaseChecksum
- hbaseChecksumOffCount
  
  private AtomicInteger hbaseChecksumOffCount
- readStatistics
  
  private static final FSDataInputStreamWrapper.ReadStatistics readStatistics
- readerPath
  
  protected org.apache.hadoop.fs.Path readerPath
Constructor Details
- FSDataInputStreamWrapper
  
  public FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
  
  Throws:
  
  IOException
- FSDataInputStreamWrapper
  
  public FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean dropBehind, long readahead) throws IOException
  
  Throws:
  
  IOException
- FSDataInputStreamWrapper
  
  public FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, FileLink link, boolean dropBehind, long readahead) throws IOException
  
  Throws:
  
  IOException
- FSDataInputStreamWrapper
  
  private FSDataInputStreamWrapper(org.apache.hadoop.fs.FileSystem fs, FileLink link, org.apache.hadoop.fs.Path path, boolean dropBehind, long readahead) throws IOException
  
  Throws:
  
  IOException
- FSDataInputStreamWrapper
  
  public FSDataInputStreamWrapper(org.apache.hadoop.fs.FSDataInputStream fsdis)
  
  For use in tests.
- FSDataInputStreamWrapper
  
  public FSDataInputStreamWrapper(org.apache.hadoop.fs.FSDataInputStream fsdis, org.apache.hadoop.fs.FSDataInputStream noChecksum)
  
  For use in tests.
Method Details
- setStreamOptions
  
  private void setStreamOptions(org.apache.hadoop.fs.FSDataInputStream in)
- prepareForBlockReader
  
  public void prepareForBlockReader(boolean forceNoHBaseChecksum) throws IOException
  
  Prepares the streams for block reader. NOT THREAD SAFE. Must be called once, after any reads finish and before any other reads start (what happens in reality is we read the tail, then call this based on what's in the tail, then read blocks).
  
  Parameters:
  
  forceNoHBaseChecksum - Force not using HBase checksum.
  
  Throws:
  
  IOException
- shouldUseHBaseChecksum
  
  public boolean shouldUseHBaseChecksum()
  
  Returns whether we are presently using HBase checksum.
- getStream
  
  public org.apache.hadoop.fs.FSDataInputStream getStream(boolean useHBaseChecksum)
  
  Get the stream to use. Thread-safe.
  
  Parameters:
  
  useHBaseChecksum - must be the value that shouldUseHBaseChecksum has returned at some point in the past, otherwise the result is undefined.
- fallbackToFsChecksum
  
  public org.apache.hadoop.fs.FSDataInputStream fallbackToFsChecksum(int offCount) throws IOException
  
  Read from non-checksum stream failed, fall back to FS checksum. Thread-safe.
  
  Parameters:
  
  offCount - The number of checksumOk calls for which HBase checksum stays turned off.
  
  Throws:
  
  IOException
- checksumOk
  
  public void checksumOk()
  
  Reports that a checksum was OK, so the wrapper may consider going back to HBase checksum.
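  
  A caller typically combines shouldUseHBaseChecksum, getStream, fallbackToFsChecksum and checksumOk in one read path. The sketch below shows one plausible shape for that path. Everything in it is hypothetical: the Wrapper and BlockReader interfaces are local stand-ins rather than HBase types, the stream type is left generic, and the checked IOException declared by the real fallbackToFsChecksum is omitted to keep the example short.
  
  ```java
  /** Hypothetical caller-side read path; the nested interfaces are stand-ins, not HBase types. */
  class ReadWithFallbackSketch {
      /** The subset of wrapper operations the read path needs. */
      interface Wrapper<S> {
          boolean shouldUseHBaseChecksum();
          S getStream(boolean useHBaseChecksum);
          S fallbackToFsChecksum(int offCount);
          void checksumOk();
      }

      /** Reads a block and reports whether its checksum verified. */
      interface BlockReader<S> {
          boolean readAndVerify(S stream, boolean verifyHBaseChecksum);
      }

      /** Returns true if the block was read with a valid checksum. */
      static <S> boolean readBlock(Wrapper<S> w, BlockReader<S> reader, int offCount) {
          // Capture the flag once so getStream receives a value that
          // shouldUseHBaseChecksum actually returned.
          boolean useHBase = w.shouldUseHBaseChecksum();
          S in = w.getStream(useHBase);
          boolean ok = reader.readAndVerify(in, useHBase);
          if (!ok && useHBase) {
              // HBase checksum failed: retry on the FS-checksum stream.
              in = w.fallbackToFsChecksum(offCount);
              ok = reader.readAndVerify(in, false);
          }
          if (ok) {
              w.checksumOk();
          }
          return ok;
      }
  }
  ```
  
  Reading the flag into a local variable matters because another thread may change the mode between the two calls; the local copy keeps the flag and the chosen stream consistent for the duration of one read.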
- updateInputStreamStatistics
  
  private void updateInputStreamStatistics(org.apache.hadoop.fs.FSDataInputStream stream)
- getTotalBytesRead
  
  public static long getTotalBytesRead()
- getLocalBytesRead
  
  public static long getLocalBytesRead()
- getShortCircuitBytesRead
  
  public static long getShortCircuitBytesRead()
- getZeroCopyBytesRead
  
  public static long getZeroCopyBytesRead()
- close
  
  public void close()
  
  Closes stream(s) if necessary.
  
  Specified by:
  
  close in interface AutoCloseable
  
  Specified by:
  
  close in interface Closeable
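  
  The phrase "if necessary" together with the doCloseStreams field suggests that the wrapper closes only streams it is responsible for. The sketch below illustrates that idea with a hypothetical stand-in class, CloseIfOwnedSketch. It is an assumption about the pattern, not the HBase implementation, and it swallows close failures only to match the signature above, which declares no checked exception.
  
  ```java
  import java.io.Closeable;
  import java.io.IOException;

  /** Hypothetical stand-in showing a close() that acts only when the wrapper owns its streams. */
  class CloseIfOwnedSketch implements Closeable {
      private final boolean doCloseStreams;
      private final Closeable[] streams;

      CloseIfOwnedSketch(boolean doCloseStreams, Closeable... streams) {
          this.doCloseStreams = doCloseStreams;
          this.streams = streams;
      }

      @Override
      public void close() {
          if (!doCloseStreams) {
              return;                       // streams belong to someone else
          }
          for (Closeable s : streams) {
              if (s == null) {
                  continue;                 // a handle may never have been opened
              }
              try {
                  s.close();
              } catch (IOException e) {
                  // Ignored in this sketch: the streams are read-only.
              }
          }
      }
  }
  ```
  
  Because close() declares no checked exception, an instance of such a class can be used in a try-with-resources statement without a catch clause.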
- getHfs
  
  public HFileSystem getHfs()
- unbuffer
  
  public void unbuffer()
  
  Frees sockets and file descriptors held by the stream, but only when the stream implements org.apache.hadoop.fs.CanUnbuffer. NOT THREAD SAFE. Must be called only when all the clients using this stream to read blocks have finished reading. If the stream is unbuffered while clients still hold it for reading, the next client read request opens a new socket to the DataNode, without the client knowing about it, and that socket serves the read. Note: if this socket is idle for some time, the DataNode closes it and the socket moves into the CLOSE_WAIT state; on the next client request on this stream, the current socket is closed and a new socket is opened to serve the request.
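  
  The "only when the stream implements CanUnbuffer" condition is a capability check. The sketch below shows that pattern in isolation. CanUnbufferLike is a local stand-in for org.apache.hadoop.fs.CanUnbuffer, declared here so the example compiles without Hadoop, and UnbufferSketch is a hypothetical helper, not part of HBase.
  
  ```java
  /** Local stand-in for org.apache.hadoop.fs.CanUnbuffer, so the sketch compiles on its own. */
  interface CanUnbufferLike {
      void unbuffer();
  }

  /** Hypothetical helper showing the capability check. */
  class UnbufferSketch {
      /** Unbuffers the stream if it supports it; returns whether it did. */
      static boolean unbufferIfSupported(Object stream) {
          if (stream instanceof CanUnbufferLike) {
              ((CanUnbufferLike) stream).unbuffer();
              return true;
          }
          // Streams without the capability are left untouched.
          return false;
      }
  }
  ```
  
  The check makes the call a no-op for streams that cannot release their resources, so the caller does not need to know the concrete stream type.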
- getReaderPath
  
  public org.apache.hadoop.fs.Path getReaderPath()
- setShouldUseHBaseChecksum
  
  void setShouldUseHBaseChecksum()
