@InterfaceAudience.Private public class LruAdaptiveBlockCache extends Object implements FirstLevelBlockCache
A block cache implementation that is memory-aware using HeapSize, memory-bound using an LRU eviction algorithm, and concurrent: backed by a ConcurrentHashMap and with a non-blocking eviction thread giving constant-time cacheBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, org.apache.hadoop.hbase.io.hfile.Cacheable, boolean) and getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean, boolean, boolean) operations.
Contains three levels of block priority to allow for scan-resistance and in-memory families (see ColumnFamilyDescriptorBuilder.setInMemory(boolean); an in-memory column family is a column family that should be served from memory if possible): single-access, multiple-access, and in-memory priority. A block is added with an in-memory priority flag if ColumnFamilyDescriptor.isInMemory() returns true; otherwise a block gets single-access priority the first time it is read into this block cache. If a block is accessed again while in cache, it is marked as a multiple-access priority block. This delineation of blocks is used to prevent scans from thrashing the cache, adding a least-frequently-used element to the eviction algorithm.
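The priority transitions described above can be illustrated with a minimal sketch. The class and method names below (PrioritySketch, Block, access) are hypothetical stand-ins chosen for this example, not the HBase classes, and keys are plain strings rather than BlockCacheKey instances.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the three block priorities; not the HBase implementation. */
final class PrioritySketch {
  enum BlockPriority { SINGLE, MULTI, MEMORY }

  /** Minimal stand-in for a cached block that tracks only its priority. */
  static final class Block {
    private volatile BlockPriority priority;

    Block(boolean inMemory) {
      // Blocks of in-memory families start at MEMORY; all others start at SINGLE.
      this.priority = inMemory ? BlockPriority.MEMORY : BlockPriority.SINGLE;
    }

    /** Called on a cache hit: a repeat access promotes SINGLE to MULTI. */
    void access() {
      if (priority == BlockPriority.SINGLE) {
        priority = BlockPriority.MULTI;
      }
    }

    BlockPriority priority() {
      return priority;
    }
  }

  private final ConcurrentHashMap<String, Block> map = new ConcurrentHashMap<>();

  void cacheBlock(String key, boolean inMemory) {
    map.putIfAbsent(key, new Block(inMemory));
  }

  /** Returns the block's priority after this access, or null on a cache miss. */
  BlockPriority getBlock(String key) {
    Block block = map.get(key);
    if (block == null) {
      return null;
    }
    block.access();
    return block.priority();
  }

  public static void main(String[] args) {
    PrioritySketch cache = new PrioritySketch();
    cache.cacheBlock("scanned-block", false);
    cache.cacheBlock("in-memory-family-block", true);
    System.out.println(cache.getBlock("scanned-block"));
    System.out.println(cache.getBlock("in-memory-family-block"));
  }
}
```

A block that a scan reads into the cache and never touches again stays at single-access priority, so it competes only with other single-access blocks at eviction time.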
Each priority is given its own chunk of the total cache to ensure fairness during eviction. Each priority will retain close to its maximum size; however, if any priority is not using its entire chunk, the others are able to grow beyond their chunk size.
Instantiated at a minimum with the total size and average block size. All sizes are in bytes. The
block size is not especially important as this cache is fully dynamic in its sizing of blocks. It
is only used for pre-allocating data structures and in initial heap estimation of the map.
The detailed constructor defines the sizes for the three priorities (they should total to the maximum size defined). It also sets the levels that trigger and control the eviction thread.
The acceptable size is the cache size level which triggers the eviction process to start. It evicts enough blocks to get the size below the minimum size specified.
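As a rough illustration of how the acceptable and minimum sizes relate to the maximum size, the sketch below derives both thresholds from factors of the maximum. ThresholdSketch and its method names are hypothetical, the factor values in main are illustrative only and are not claimed to be this class's defaults, and the trigger condition follows the field description "no evictions if size < acceptable".

```java
/** Hypothetical sketch of the eviction thresholds; factor values are illustrative. */
final class ThresholdSketch {
  private final long maxSize;
  private final double minFactor;
  private final double acceptableFactor;

  ThresholdSketch(long maxSize, double minFactor, double acceptableFactor) {
    this.maxSize = maxSize;
    this.minFactor = minFactor;
    this.acceptableFactor = acceptableFactor;
  }

  /** Cache size at which the eviction process is triggered. */
  long acceptableSize() {
    return (long) Math.floor(maxSize * acceptableFactor);
  }

  /** Cache size that an eviction run tries to get down to. */
  long minSize() {
    return (long) Math.floor(maxSize * minFactor);
  }

  /** No evictions while the current size is below the acceptable size. */
  boolean shouldStartEviction(long currentSize) {
    return currentSize >= acceptableSize();
  }

  /** Bytes an eviction run has to free to reach the minimum size. */
  long bytesToFree(long currentSize) {
    return Math.max(0L, currentSize - minSize());
  }

  public static void main(String[] args) {
    // Assumed example: a 1 GiB cache with illustrative factors.
    ThresholdSketch t = new ThresholdSketch(1L << 30, 0.95, 0.99);
    System.out.println("acceptable size: " + t.acceptableSize());
    System.out.println("minimum size:    " + t.minSize());
  }
}
```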
Eviction happens in a separate thread and involves a single full scan of the map. It determines how many bytes must be freed to reach the minimum size, and then, while scanning, determines the fewest least-recently-used blocks necessary from each of the three priorities (in total, up to 3 times the bytes to free). It then uses the priority chunk sizes to evict fairly according to the relative sizes and usage.
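One plausible way to evict fairly across the three priorities is sketched below: buckets are visited from the least to the most over their chunk, and each overflowing bucket gives up at most its overflow or an equal share of what is still to be freed. This is an assumption-laden simplification, not the actual implementation: FairEvictionSketch and Bucket are hypothetical names, and the sketch tracks byte counts only instead of evicting whole least-recently-used blocks.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch of fair eviction across priority buckets (sizes only). */
final class FairEvictionSketch {
  /** One priority bucket: bytes currently held versus the chunk it is entitled to. */
  static final class Bucket {
    final String name;
    final long chunkSize;
    long totalSize;

    Bucket(String name, long totalSize, long chunkSize) {
      this.name = name;
      this.totalSize = totalSize;
      this.chunkSize = chunkSize;
    }

    /** Positive when the bucket holds more than its chunk. */
    long overflow() {
      return totalSize - chunkSize;
    }

    /** Frees up to toFree bytes and returns how many were actually freed. */
    long free(long toFree) {
      long freed = Math.min(toFree, totalSize);
      totalSize -= freed;
      return freed;
    }
  }

  /** Frees up to bytesToFree bytes, taking only from buckets that exceed their chunk. */
  static long evict(long bytesToFree, Bucket... buckets) {
    // Visit buckets in order of increasing overflow.
    PriorityQueue<Bucket> queue =
        new PriorityQueue<>(Comparator.comparingLong(Bucket::overflow));
    for (Bucket b : buckets) {
      queue.add(b);
    }
    int remaining = buckets.length;
    long freed = 0;
    Bucket bucket;
    while ((bucket = queue.poll()) != null) {
      long overflow = bucket.overflow();
      if (overflow > 0) {
        // Equal share of the outstanding bytes among the buckets not yet visited.
        long share = (bytesToFree - freed) / remaining;
        freed += bucket.free(Math.min(overflow, share));
      }
      remaining--;
    }
    return freed;
  }

  public static void main(String[] args) {
    Bucket single = new Bucket("single", 600, 500);
    Bucket multi = new Bucket("multi", 800, 500);
    Bucket memory = new Bucket("memory", 200, 250);
    System.out.println("freed: " + evict(300, single, multi, memory));
  }
}
```

In this sketch a bucket that stays within its chunk is never touched, which mirrors the statement that each priority retains close to its maximum size.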
The adaptive LRU cache improves performance when we read much more data than can fit into the BlockCache, which causes a high rate of evictions. This in turn leads to heavy garbage collector work: many blocks are put into the BlockCache but never read, yet cleaning them up consumes a lot of CPU resources. This situation can be avoided with the following parameters:
hbase.lru.cache.heavy.eviction.count.limit - sets how many times the eviction process has to run before the cache starts to avoid putting data into the BlockCache. By default it is 0, which means the feature starts at the very beginning. But if we sometimes read the same data for a short time and sometimes read for a long time, we can separate the two cases with this parameter. For example, if we know that our short reads usually take about 1 minute, we can set the parameter to about 10, and the feature will be enabled only for long, massive reads (after ~100 seconds). So when we do short reads and want all of the data in the cache, we will have it (except for evictions, of course). When we do long-term heavy reading, the feature will be enabled after some time and bring better performance.
hbase.lru.cache.heavy.eviction.mb.size.limit - sets how many megabytes it is desirable to put into the BlockCache (and evict from it) per 10 seconds. The feature will try to reach this value and maintain it. Do not set it too small, because that leads to a premature exit from this mode. For powerful CPUs (about 20-40 physical cores) it could be about 400-500 MB; for an average system (~10 cores), 200-300 MB; some weak systems (2-5 cores) may do well with 50-100 MB.
How it works: we set the limit and, after each ~10 seconds, calculate how many bytes were freed:
Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100
For example, if we set the limit to 500 and 2000 MB were evicted, the overhead is 2000 * 100 / 500 - 100 = 300%. The feature then reduces the percentage of data blocks being cached, bringing the evicted bytes closer to 100% (500 MB). It is a kind of auto-scaling. If fewer bytes are freed than the limit, the overhead is negative. For example, if 200 MB were freed: 200 * 100 / 500 - 100 = -60%. The feature then increases the percentage of blocks being cached, which again brings the evicted bytes closer to 100% (500 MB).
The current state can be found in the RegionServer log:
BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100
means there is no eviction and 100% of blocks are being cached.
BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97
means eviction has begun and the caching of blocks is reduced by 3%.
This helps to tune your system and find out which value is better to set. Do not try to reach 0% overhead; it is impossible. An overhead of 50-100% is quite good, because it prevents a premature exit from this mode.
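The overhead formula above can be checked with a few lines of code. OverheadSketch and overheadPercent are hypothetical names used only for this illustration; the arithmetic is taken directly from the formula and the two worked examples in the text, using integer division.

```java
/** Hypothetical helper that evaluates the overhead formula from the text. */
final class OverheadSketch {
  /** Overhead (%) = freed MB * 100 / limit MB - 100. */
  static long overheadPercent(long freedMb, long limitMb) {
    if (limitMb <= 0) {
      throw new IllegalArgumentException("limitMb must be positive");
    }
    return freedMb * 100 / limitMb - 100;
  }

  public static void main(String[] args) {
    System.out.println(overheadPercent(2000, 500)); // 300: evicting 4x the limit
    System.out.println(overheadPercent(200, 500));  // -60: evicting less than the limit
    System.out.println(overheadPercent(0, 500));    // -100: nothing evicted
  }
}
```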
hbase.lru.cache.heavy.eviction.overhead.coefficient - sets how fast we want to get the result. If we know that our reading will be heavy for a long time, we do not want to wait, and we can increase the coefficient to get good performance sooner. If we are not sure, we can proceed slowly, which can prevent a premature exit from this mode. So a higher coefficient gives better performance when heavy reading is stable, while a lower coefficient lets the cache adjust when the reading pattern is changing. For example, suppose we set the coefficient to 0.01. The overhead (see above) is multiplied by 0.01, and the result is the amount by which the percentage of cached blocks is reduced. If the overhead is 300% and the coefficient is 0.01, the percentage of cached blocks is reduced by 3%. The same logic applies when the overhead is negative (overshooting): it may be just a short-term fluctuation, so the cache tries to stay in this mode, which helps avoid a premature exit during short-term fluctuations. Backpressure has simple logic: the more overshooting, the more blocks are cached.
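The interplay of the three parameters can be sketched as a small state machine that is fed the overhead measured over each ~10 second period. This is a simplified illustration under stated assumptions, not the actual implementation: AdaptiveScalingSketch and onPeriodEnd are hypothetical names, the lower bound of 1% on the caching percentage is an assumption of this sketch, and leaving the mode (resetting the counter) is deliberately not modeled.

```java
/** Hypothetical sketch of the auto-scaling of the percentage of cached data blocks. */
final class AdaptiveScalingSketch {
  private final int heavyEvictionCountLimit;
  private final double overheadCoefficient;
  private int heavyEvictionCount = 0;
  private int cacheDataBlockPercent = 100;

  AdaptiveScalingSketch(int heavyEvictionCountLimit, double overheadCoefficient) {
    this.heavyEvictionCountLimit = heavyEvictionCountLimit;
    this.overheadCoefficient = overheadCoefficient;
  }

  /** Feed in the overhead (%) measured over one ~10 second period. */
  void onPeriodEnd(long overheadPercent) {
    if (overheadPercent > 0) {
      heavyEvictionCount++;
      // The feature only kicks in after more than `limit` heavy periods.
      if (heavyEvictionCount > heavyEvictionCountLimit) {
        int reduction = (int) Math.round(overheadPercent * overheadCoefficient);
        // Lower bound of 1 is an assumption of this sketch.
        cacheDataBlockPercent = Math.max(1, cacheDataBlockPercent - reduction);
      }
    } else {
      // Backpressure: the more overshooting, the more blocks are cached again.
      int increase = (int) Math.round(-overheadPercent * overheadCoefficient);
      cacheDataBlockPercent = Math.min(100, cacheDataBlockPercent + increase);
    }
  }

  int cacheDataBlockPercent() {
    return cacheDataBlockPercent;
  }

  int heavyEvictionCount() {
    return heavyEvictionCount;
  }

  public static void main(String[] args) {
    AdaptiveScalingSketch s = new AdaptiveScalingSketch(0, 0.01);
    s.onPeriodEnd(300);
    System.out.println(s.cacheDataBlockPercent()); // 97, as in the log example
  }
}
```

With a count limit of 10 in this sketch, the first ten heavy periods leave the percentage at 100 and the reduction starts on the eleventh, which is in line with the "after ~100 seconds" example.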
Find more information about this improvement: https://issues.apache.org/jira/browse/HBASE-23887
Modifier and Type | Class and Description |
---|---|
private class | LruAdaptiveBlockCache.BlockBucket: Used to group blocks into priority buckets. |
(package private) static class | LruAdaptiveBlockCache.EvictionThread |
(package private) static class | LruAdaptiveBlockCache.StatisticsThread |
Modifier and Type | Field and Description |
---|---|
private float | acceptableFactor: Acceptable size of cache (no evictions if size < acceptable) |
private long | blockSize: Approximate block size |
static long | CACHE_FIXED_OVERHEAD |
private int | cacheDataBlockPercent: Percent of cached data blocks |
private AtomicLong | count: Cache access count (sequential ID) |
private LongAdder | dataBlockElements: Current number of cached data block elements |
private LongAdder | dataBlockSize: Current size of data blocks |
(package private) static float | DEFAULT_ACCEPTABLE_FACTOR |
(package private) static int | DEFAULT_CONCURRENCY_LEVEL |
private static float | DEFAULT_HARD_CAPACITY_LIMIT_FACTOR |
private static boolean | DEFAULT_IN_MEMORY_FORCE_MODE |
(package private) static float | DEFAULT_LOAD_FACTOR |
private static int | DEFAULT_LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT |
private static long | DEFAULT_LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT |
private static float | DEFAULT_LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT |
private static long | DEFAULT_MAX_BLOCK_SIZE |
private static float | DEFAULT_MEMORY_FACTOR |
private static float | DEFAULT_MIN_FACTOR |
private static float | DEFAULT_MULTI_FACTOR |
private static float | DEFAULT_SINGLE_FACTOR |
private AtomicLong | elements: Current number of cached elements |
private boolean | evictionInProgress: Volatile boolean to track if we are in an eviction process or not |
private ReentrantLock | evictionLock: Eviction lock (locked when eviction in process) |
private LruAdaptiveBlockCache.EvictionThread | evictionThread: Eviction thread |
private boolean | forceInMemory: Whether in-memory hfile's data block has higher priority when evicting |
private float | hardCapacityLimitFactor: Hard capacity limit |
private int | heavyEvictionCountLimit: Limit on the number of eviction runs after which the cache starts to avoid caching blocks |
private long | heavyEvictionMbSizeLimit: Limit on the eviction volume after which the cache starts to avoid caching blocks |
private float | heavyEvictionOverheadCoefficient: Adjusts auto-scaling via the overhead of the eviction rate |
private static org.slf4j.Logger | LOG |
private static String | LRU_ACCEPTABLE_FACTOR_CONFIG_NAME: Acceptable size of cache (no evictions if size < acceptable) |
private static String | LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT |
private static String | LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT |
private static String | LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT |
(package private) static String | LRU_HARD_CAPACITY_LIMIT_FACTOR_CONFIG_NAME: Hard capacity limit of cache, will reject any put if size > this * acceptable |
private static String | LRU_IN_MEMORY_FORCE_MODE_CONFIG_NAME: Configuration key to force data blocks of in-memory hfiles to always be cached in memory (unless there are too many in-memory blocks). Unlike inMemory, which is a column-family configuration, inMemoryForceMode is a cluster-wide configuration. |
private static String | LRU_MAX_BLOCK_SIZE |
private static String | LRU_MEMORY_PERCENTAGE_CONFIG_NAME |
private static String | LRU_MIN_FACTOR_CONFIG_NAME: Percentage of total size that eviction will evict until; e.g. |
private static String | LRU_MULTI_PERCENTAGE_CONFIG_NAME |
private static String | LRU_SINGLE_PERCENTAGE_CONFIG_NAME |
private ConcurrentHashMap<BlockCacheKey,LruCachedBlock> | map: Defined the cache map as ConcurrentHashMap here, because in getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean, boolean, boolean), we need to guarantee the atomicity of map#computeIfPresent (key, func). |
private long | maxBlockSize |
private long | maxSize: Maximum allowable size of cache (block put if size > max, evict) |
private float | memoryFactor: In-memory bucket size |
private float | minFactor: Minimum threshold of cache (when evicting, evict until size < min) |
private float | multiFactor: Multiple access bucket size |
private long | overhead: Overhead of the structure itself |
private ScheduledExecutorService | scheduleThreadPool: Statistics thread schedule pool (for heavy debugging, could remove) |
private float | singleFactor: Single access bucket size |
private AtomicLong | size: Current size of cache |
private static int | STAT_THREAD_PERIOD |
private CacheStats | stats: Cache statistics |
private BlockCache | victimHandler: Where to send victims (blocks evicted/missing from the cache). |
Constructor and Description |
---|
LruAdaptiveBlockCache(long maxSize, long blockSize): Default constructor. |
LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread): Constructor used for testing. |
LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, org.apache.hadoop.conf.Configuration conf) |
LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, int mapInitialSize, float mapLoadFactor, int mapConcurrencyLevel, float minFactor, float acceptableFactor, float singleFactor, float multiFactor, float memoryFactor, float hardLimitFactor, boolean forceInMemory, long maxBlockSize, int heavyEvictionCountLimit, long heavyEvictionMbSizeLimit, float heavyEvictionOverheadCoefficient): Configurable constructor. |
LruAdaptiveBlockCache(long maxSize, long blockSize, org.apache.hadoop.conf.Configuration conf) |
Modifier and Type | Method and Description |
---|---|
(package private) long | acceptableSize() |
private Cacheable | asReferencedHeapBlock(Cacheable buf): The block cached in LruAdaptiveBlockCache will always be a heap block: on the one hand, heap access is faster than off-heap access, and the small index or meta blocks cached in CombinedBlockCache benefit a lot. |
private static void | assertCounterSanity(long mapSize, long counterVal): Sanity-checking for parity between actual block cache content and metrics. |
void | cacheBlock(BlockCacheKey cacheKey, Cacheable buf): Cache the block with the specified name and buffer. |
void | cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory): Cache the block with the specified name and buffer. |
private static long | calculateOverhead(long maxSize, long blockSize, int concurrency) |
void | clearCache(): Clears the cache. |
boolean | containsBlock(BlockCacheKey cacheKey): Whether the cache contains block with specified cacheKey |
(package private) long | evict(): Eviction method. |
boolean | evictBlock(BlockCacheKey cacheKey): Evict block from cache. |
protected long | evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess): Evict the block, and it will be cached by the victim handler if exists && block may be read again later |
int | evictBlocksByHfileName(String hfileName): Evicts all blocks for a specific HFile. |
Cacheable | getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat, boolean updateCacheMetrics): Get the buffer of the block with the specified name. |
BlockCache[] | getBlockCaches(): Returns the list of sub blockcaches that make up this one; returns null if no sub caches. |
long | getBlockCount(): Returns the number of blocks currently cached in the block cache. |
int | getCacheDataBlockPercent() |
long | getCurrentDataSize(): Returns the occupied size of data blocks, in bytes. |
long | getCurrentSize(): Returns the occupied size of the block cache, in bytes. |
long | getDataBlockCount(): Returns the number of data blocks currently cached in the block cache. |
Map<DataBlockEncoding,Integer> | getEncodingCountsForTest() |
(package private) LruAdaptiveBlockCache.EvictionThread | getEvictionThread() |
long | getFreeSize(): Returns the free size of the block cache, in bytes. |
(package private) Map<BlockCacheKey,LruCachedBlock> | getMapForTests() |
long | getMaxSize(): Get the maximum size of this cache. |
(package private) long | getOverhead() |
CacheStats | getStats(): Get counter statistics for this cache. |
long | heapSize(): Return the approximate 'exclusive deep size' of implementing object. |
(package private) boolean | isEvictionInProgress() |
Iterator<CachedBlock> | iterator(): Returns Iterator over the blocks in the cache. |
void | logStats() |
private long | memorySize() |
private long | minSize() |
private long | multiSize() |
private void | runEviction(): Multi-threaded call to run the eviction process. |
void | setMaxSize(long maxSize): Sets the max heap size that can be used by the BlockCache. |
void | setVictimCache(BlockCache victimCache): Specifies the secondary cache. |
void | shutdown(): Shutdown the cache. |
private long | singleSize() |
long | size(): Returns the total size of the block cache, in bytes. |
String | toString() |
private long | updateSizeMetrics(LruCachedBlock cb, boolean evict): Helper function that updates the local size counter and also updates any per-cf or per-blocktype metrics it can discern from given LruCachedBlock |
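Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface BlockCache: isMetaBlock
Methods inherited from interface Iterable: forEach, spliterator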
private static final org.slf4j.Logger LOG
private static final String LRU_MIN_FACTOR_CONFIG_NAME
private static final String LRU_ACCEPTABLE_FACTOR_CONFIG_NAME
static final String LRU_HARD_CAPACITY_LIMIT_FACTOR_CONFIG_NAME
private static final String LRU_SINGLE_PERCENTAGE_CONFIG_NAME
private static final String LRU_MULTI_PERCENTAGE_CONFIG_NAME
private static final String LRU_MEMORY_PERCENTAGE_CONFIG_NAME
private static final String LRU_IN_MEMORY_FORCE_MODE_CONFIG_NAME
static final float DEFAULT_LOAD_FACTOR
static final int DEFAULT_CONCURRENCY_LEVEL
private static final float DEFAULT_MIN_FACTOR
static final float DEFAULT_ACCEPTABLE_FACTOR
private static final float DEFAULT_SINGLE_FACTOR
private static final float DEFAULT_MULTI_FACTOR
private static final float DEFAULT_MEMORY_FACTOR
private static final float DEFAULT_HARD_CAPACITY_LIMIT_FACTOR
private static final boolean DEFAULT_IN_MEMORY_FORCE_MODE
private static final int STAT_THREAD_PERIOD
private static final String LRU_MAX_BLOCK_SIZE
private static final long DEFAULT_MAX_BLOCK_SIZE
private static final String LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT
private static final int DEFAULT_LRU_CACHE_HEAVY_EVICTION_COUNT_LIMIT
private static final String LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT
private static final long DEFAULT_LRU_CACHE_HEAVY_EVICTION_MB_SIZE_LIMIT
private static final String LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT
private static final float DEFAULT_LRU_CACHE_HEAVY_EVICTION_OVERHEAD_COEFFICIENT
private final transient ConcurrentHashMap<BlockCacheKey,LruCachedBlock> map
Defined the cache map as ConcurrentHashMap here, because in getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean, boolean, boolean), we need to guarantee the atomicity of map#computeIfPresent (key, func). Besides, the func method must execute exactly once only when the key is present and under the lock context, otherwise the reference count will be messed up. Notice that the ConcurrentSkipListMap can not guarantee that.
private final transient ReentrantLock evictionLock
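The atomicity requirement described for the map field can be illustrated with a small sketch of a reference-counted lookup. RetainOnGetSketch and its Block class are hypothetical stand-ins with string keys; only ConcurrentHashMap.computeIfPresent is a real API here, and it applies the remapping function atomically and only when the key is present.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: retain a block's reference count atomically on lookup. */
final class RetainOnGetSketch {
  /** Minimal reference-counted block; the cache itself holds the first reference. */
  static final class Block {
    private final AtomicInteger refCnt = new AtomicInteger(1);

    void retain() {
      refCnt.incrementAndGet();
    }

    int refCnt() {
      return refCnt.get();
    }
  }

  private final ConcurrentHashMap<String, Block> map = new ConcurrentHashMap<>();

  void cacheBlock(String key) {
    map.putIfAbsent(key, new Block());
  }

  /** Returns the retained block, or null if the key is not cached. */
  Block getBlock(String key) {
    // The lambda runs only when the key is present, and no concurrent removal
    // of this key can interleave with it, so an evicted block is never retained.
    return map.computeIfPresent(key, (k, block) -> {
      block.retain();
      return block;
    });
  }

  boolean evictBlock(String key) {
    return map.remove(key) != null;
  }

  public static void main(String[] args) {
    RetainOnGetSketch cache = new RetainOnGetSketch();
    cache.cacheBlock("block-1");
    System.out.println(cache.getBlock("block-1").refCnt());
  }
}
```

A plain get followed by retain would leave a window in which another thread evicts and releases the block between the two calls, which is the situation the field comment warns about.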
private final long maxBlockSize
private volatile boolean evictionInProgress
private final transient LruAdaptiveBlockCache.EvictionThread evictionThread
private final transient ScheduledExecutorService scheduleThreadPool
private final AtomicLong size
private final LongAdder dataBlockSize
private final AtomicLong elements
private final LongAdder dataBlockElements
private final AtomicLong count
private final float hardCapacityLimitFactor
private final CacheStats stats
private long maxSize
private final long blockSize
private final float acceptableFactor
private final float minFactor
private final float singleFactor
private final float multiFactor
private final float memoryFactor
private final long overhead
private boolean forceInMemory
private transient BlockCache victimHandler
private volatile int cacheDataBlockPercent
private final int heavyEvictionCountLimit
private final long heavyEvictionMbSizeLimit
private final float heavyEvictionOverheadCoefficient
public static final long CACHE_FIXED_OVERHEAD
public LruAdaptiveBlockCache(long maxSize, long blockSize)
All other factors will be calculated based on defaults specified in this class.
maxSize - maximum size of cache, in bytes
blockSize - approximate size of each block, in bytes
public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread)
public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, org.apache.hadoop.conf.Configuration conf)
public LruAdaptiveBlockCache(long maxSize, long blockSize, org.apache.hadoop.conf.Configuration conf)
public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, int mapInitialSize, float mapLoadFactor, int mapConcurrencyLevel, float minFactor, float acceptableFactor, float singleFactor, float multiFactor, float memoryFactor, float hardLimitFactor, boolean forceInMemory, long maxBlockSize, int heavyEvictionCountLimit, long heavyEvictionMbSizeLimit, float heavyEvictionOverheadCoefficient)
maxSize - maximum size of this cache, in bytes
blockSize - expected average size of blocks, in bytes
evictionThread - whether to run evictions in a background thread or not
mapInitialSize - initial size of backing ConcurrentHashMap
mapLoadFactor - initial load factor of backing ConcurrentHashMap
mapConcurrencyLevel - initial concurrency factor for backing ConcurrentHashMap
minFactor - percentage of total size that eviction will evict until
acceptableFactor - percentage of total size that triggers eviction
singleFactor - percentage of total size for single-access blocks
multiFactor - percentage of total size for multiple-access blocks
memoryFactor - percentage of total size for in-memory blocks
hardLimitFactor - hard capacity limit
forceInMemory - in-memory hfile's data block has higher priority when evicting
maxBlockSize - maximum block size for caching
heavyEvictionCountLimit - when the AdaptiveLRU algorithm starts to work
heavyEvictionMbSizeLimit - how many bytes are desirable to put into BlockCache
heavyEvictionOverheadCoefficient - how aggressively AdaptiveLRU will reduce GC
public void setVictimCache(BlockCache victimCache)
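Description copied from interface: FirstLevelBlockCache
Specifies the secondary cache.
Specified by: setVictimCache in interface FirstLevelBlockCache
victimCache - the second level cache
public void setMaxSize(long maxSize)
Description copied from interface: ResizableBlockCache
Sets the max heap size that can be used by the BlockCache.
Specified by: setMaxSize in interface ResizableBlockCache
maxSize - The max heap size.
public int getCacheDataBlockPercent()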
private Cacheable asReferencedHeapBlock(Cacheable buf)
buf - the original block
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory)
It is assumed this will NOT be called on an already cached block. In rare cases (HBASE-8547) this can happen, for which we compare the buffer contents.
Specified by: cacheBlock in interface BlockCache
cacheKey - block's cache key
buf - block buffer
inMemory - if block is in-memory
private static void assertCounterSanity(long mapSize, long counterVal)
public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf)
TODO: after HBASE-22005, we may cache a block that was allocated off-heap, but our LRU cache sizing is based on heap size, so we should handle this in HBASE-22127. That change will introduce a switch controlling whether the LRU is on-heap or not; if so, we may need to copy the memory on-heap, otherwise the caching size is based on off-heap.
Specified by: cacheBlock in interface BlockCache
cacheKey - block's cache key
buf - block buffer
private long updateSizeMetrics(LruCachedBlock cb, boolean evict)
Helper function that updates the local size counter and also updates any per-cf or per-blocktype metrics it can discern from given LruCachedBlock
public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat, boolean updateCacheMetrics)
Specified by: getBlock in interface BlockCache
cacheKey - block's cache key
caching - true if the caller caches blocks on cache misses
repeat - Whether this is a repeat lookup for the same block (used to avoid double counting cache misses when doing double-check locking)
updateCacheMetrics - Whether to update cache metrics or not
public boolean containsBlock(BlockCacheKey cacheKey)
Specified by: containsBlock in interface FirstLevelBlockCache
cacheKey - cache key for the block
public boolean evictBlock(BlockCacheKey cacheKey)
Description copied from interface: BlockCache
Evict block from cache.
Specified by: evictBlock in interface BlockCache
cacheKey - Block to evict
public int evictBlocksByHfileName(String hfileName)
This is used for evict-on-close to remove all blocks of a specific HFile.
Specified by: evictBlocksByHfileName in interface BlockCache
protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess)
evictedByEvictionProcess - true if the given block is evicted by EvictionThread
private void runEviction()
boolean isEvictionInProgress()
long getOverhead()
long evict()
public long getMaxSize()
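Specified by: getMaxSize in interface BlockCache
public long getCurrentSize()
Description copied from interface: BlockCache
Returns the occupied size of the block cache, in bytes.
Specified by: getCurrentSize in interface BlockCache
public long getCurrentDataSize()
Description copied from interface: BlockCache
Returns the occupied size of data blocks, in bytes.
Specified by: getCurrentDataSize in interface BlockCache
public long getFreeSize()
Description copied from interface: BlockCache
Returns the free size of the block cache, in bytes.
Specified by: getFreeSize in interface BlockCache
public long size()
Description copied from interface: BlockCache
Returns the total size of the block cache, in bytes.
Specified by: size in interface BlockCache
public long getBlockCount()
Description copied from interface: BlockCache
Returns the number of blocks currently cached in the block cache.
Specified by: getBlockCount in interface BlockCache
public long getDataBlockCount()
Description copied from interface: BlockCache
Returns the number of data blocks currently cached in the block cache.
Specified by: getDataBlockCount in interface BlockCache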
LruAdaptiveBlockCache.EvictionThread getEvictionThread()
public void logStats()
public CacheStats getStats()
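Includes: total accesses, hits, misses, evicted blocks, and runs of the eviction processes.
Specified by: getStats in interface BlockCache
public long heapSize()
Description copied from interface: HeapSize
Return the approximate 'exclusive deep size' of implementing object.
private static long calculateOverhead(long maxSize, long blockSize, int concurrency)
public Iterator<CachedBlock> iterator()
Description copied from interface: BlockCache
Returns Iterator over the blocks in the cache.
Specified by: iterator in interface Iterable<CachedBlock>
Specified by: iterator in interface BlockCache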
long acceptableSize()
private long minSize()
private long singleSize()
private long multiSize()
private long memorySize()
public void shutdown()
Description copied from interface: BlockCache
Shutdown the cache.
Specified by: shutdown in interface BlockCache
public void clearCache()
public Map<DataBlockEncoding,Integer> getEncodingCountsForTest()
Map<BlockCacheKey,LruCachedBlock> getMapForTests()
public BlockCache[] getBlockCaches()
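Description copied from interface: BlockCache
Returns the list of sub blockcaches that make up this one; returns null if no sub caches.
Specified by: getBlockCaches in interface BlockCache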
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.