Class InputStreamBlockDistribution
java.lang.Object
org.apache.hadoop.hbase.regionserver.InputStreamBlockDistribution
Computes the HDFSBlocksDistribution for a file based on the underlying located blocks for an
HdfsDataInputStream reading that file. The backing DFSInputStream.getAllBlocks involves
allocating an array of numBlocks size per call. It may also involve calling the namenode, if the
DFSInputStream has not fetched all the blocks yet. In order to avoid allocation pressure, we
cache the computed distribution for a configurable period of time.
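The caching idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the class's actual implementation: `PeriodicCache`, its injected clock, and the generic value type are hypothetical names introduced here, and the expensive computation is represented by a plain `Supplier`.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in: caches an expensive value for a fixed period. */
final class PeriodicCache<T> {
  private final Supplier<T> compute;  // expensive work, e.g. walking the located blocks
  private final long cachePeriodMs;   // how long a computed value stays valid
  private final LongSupplier clock;   // injected so the expiry logic can be tested
  private T cached;
  private long lastCachedAt;
  private boolean populated;

  PeriodicCache(Supplier<T> compute, long cachePeriodMs, LongSupplier clock) {
    this.compute = compute;
    this.cachePeriodMs = cachePeriodMs;
    this.clock = clock;
  }

  /** Returns the cached value, recomputing only when the cache period has elapsed. */
  synchronized T get() {
    long now = clock.getAsLong();
    if (!populated || now - lastCachedAt > cachePeriodMs) {
      cached = compute.get();
      lastCachedAt = now;
      populated = true;
    }
    return cached;
  }
}
```

With a real clock, `System::currentTimeMillis` could be passed as the third constructor argument. Repeated calls inside the period return the same object without new allocations.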
This class only gets instantiated for the first FSDataInputStream of each StoreFile (i.e.
the one backing HStoreFile.initialReader). It's then used to dynamically update the value
returned by HStoreFile.getHDFSBlockDistribution().
Once the backing FSDataInputStream is closed, we should not expect the distribution result to
change anymore. This is ok because the initialReader's InputStream is only closed when the
StoreFile itself is closed, at which point nothing will be querying getHDFSBlockDistribution
anymore. If/when the StoreFile is reopened, a new InputStreamBlockDistribution will be created
for the new initialReader.
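The lifecycle described above can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in (`SketchStoreFile`, `LiveDistribution`, and a string in place of a real distribution); it is not the HStoreFile code. The fallback to a static value while the file is closed is an assumption made for the sake of the example.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a distribution derived from one open input stream. */
final class LiveDistribution {
  private final Supplier<String> source; // stands in for reading located blocks from the stream

  LiveDistribution(Supplier<String> source) {
    this.source = source;
  }

  String get() {
    return source.get();
  }
}

/** Hypothetical stand-in for a store file; not the real HStoreFile. */
final class SketchStoreFile {
  private final String staticDistribution;   // assumed fallback computed from file metadata
  private LiveDistribution liveDistribution; // non-null only while the file is open

  SketchStoreFile(String staticDistribution) {
    this.staticDistribution = staticDistribution;
  }

  /** Opening creates a fresh live distribution bound to the new stream. */
  void open(Supplier<String> streamView) {
    liveDistribution = new LiveDistribution(streamView);
  }

  /** Closing drops the live distribution; nothing should query it afterwards. */
  void close() {
    liveDistribution = null;
  }

  /** Prefers the stream-derived value while open, else the assumed static fallback. */
  String getBlockDistribution() {
    return liveDistribution != null ? liveDistribution.get() : staticDistribution;
  }
}
```

The point of the sketch is that each open gets its own live object, so a reopened file never reuses a distribution tied to a closed stream.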
-
Field Summary
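Modifier and Type | Field
private final int | cachePeriodMs
private static final int | DEFAULT_HBASE_LOCALITY_INPUTSTREAM_DERIVE_CACHE_PERIOD
private static final boolean | DEFAULT_HBASE_LOCALITY_INPUTSTREAM_DERIVE_ENABLED
private final StoreFileInfo | fileInfo
private static final String | HBASE_LOCALITY_INPUTSTREAM_DERIVE_CACHE_PERIOD
private static final String | HBASE_LOCALITY_INPUTSTREAM_DERIVE_ENABLED
private HDFSBlocksDistribution | hdfsBlocksDistribution
private long | lastCachedAt
private static final org.slf4j.Logger | LOG
private final org.apache.hadoop.fs.FSDataInputStream | stream
private boolean | streamUnsupported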
-
Constructor Summary
Constructor | Description
InputStreamBlockDistribution(org.apache.hadoop.fs.FSDataInputStream stream, StoreFileInfo fileInfo) | This should only be called for the first FSDataInputStream of a StoreFile, in HStoreFile.open().
-
Method Summary
Modifier and Type | Method | Description
private void | computeBlockDistribution() |
(package private) long | getCachePeriodMs() | For tests only, returns the configured cache period
HDFSBlocksDistribution | getHDFSBlockDistribution() | Get the HDFSBlocksDistribution derived from the StoreFile input stream, re-computing if cache is expired.
static boolean | isEnabled(org.apache.hadoop.conf.Configuration conf) | True if we should derive StoreFile HDFSBlocksDistribution from the underlying input stream
(package private) boolean | isStreamUnsupported() | For tests only, returns whether the passed stream is unsupported
(package private) void | setLastCachedAt(long timestamp) | For tests only, sets lastCachedAt so we can force a refresh
-
Field Details
-
LOG
-
HBASE_LOCALITY_INPUTSTREAM_DERIVE_ENABLED
-
DEFAULT_HBASE_LOCALITY_INPUTSTREAM_DERIVE_ENABLED
-
HBASE_LOCALITY_INPUTSTREAM_DERIVE_CACHE_PERIOD
-
DEFAULT_HBASE_LOCALITY_INPUTSTREAM_DERIVE_CACHE_PERIOD
-
stream
-
fileInfo
-
cachePeriodMs
-
hdfsBlocksDistribution
-
lastCachedAt
-
streamUnsupported
-
-
Constructor Details
-
InputStreamBlockDistribution
public InputStreamBlockDistribution(org.apache.hadoop.fs.FSDataInputStream stream, StoreFileInfo fileInfo)
This should only be called for the first FSDataInputStream of a StoreFile, in HStoreFile.open().
- Parameters:
stream - the input stream to derive locality from
fileInfo - the StoreFileInfo for the related store file
-
-
Method Details
-
isEnabled
public static boolean isEnabled(org.apache.hadoop.conf.Configuration conf)
True if we should derive StoreFile HDFSBlocksDistribution from the underlying input stream
-
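As a rough illustration of how a flag check like isEnabled typically works, the sketch below reads a boolean from a configuration object and falls back to a default. It is a stand-in under stated assumptions: `java.util.Properties` replaces the Hadoop Configuration so the example runs on its own, `LocalityDeriveConfig` is a hypothetical class, and both the key string and the default of `false` are assumptions, since the constant values are not shown on this page.

```java
import java.util.Properties;

/** Hypothetical stand-in for the enable-flag lookup; not the HBase implementation. */
final class LocalityDeriveConfig {
  // Assumed key string: the constant's actual value is not shown on this page.
  static final String ENABLED_KEY = "hbase.locality.inputstream.derive.enabled";
  // Assumed default: the actual default is not shown on this page.
  static final boolean DEFAULT_ENABLED = false;

  private LocalityDeriveConfig() {
  }

  /** Returns the configured flag, or the assumed default when the key is absent. */
  static boolean isEnabled(Properties conf) {
    String raw = conf.getProperty(ENABLED_KEY);
    return raw == null ? DEFAULT_ENABLED : Boolean.parseBoolean(raw.trim());
  }
}
```

Against a real Hadoop Configuration, the equivalent lookup would most likely be a single `conf.getBoolean(key, defaultValue)` call.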
getHDFSBlockDistribution
Get the HDFSBlocksDistribution derived from the StoreFile input stream, re-computing if cache is expired.
-
computeBlockDistribution
- Throws:
IOException
-
setLastCachedAt
void setLastCachedAt(long timestamp)
For tests only, sets lastCachedAt so we can force a refresh
-
getCachePeriodMs
long getCachePeriodMs()
For tests only, returns the configured cache period
-
isStreamUnsupported
boolean isStreamUnsupported()
For tests only, returns whether the passed stream is unsupported
-