Class LruAdaptiveBlockCache

All Implemented Interfaces:
Iterable<CachedBlock>, HeapSize, BlockCache, FirstLevelBlockCache, ResizableBlockCache

@Private public class LruAdaptiveBlockCache extends Object implements FirstLevelBlockCache
This realisation improve performance of classical LRU cache up to 3 times via reduce GC job.

The classical block cache implementation that is memory-aware using HeapSize, memory-bound using an LRU eviction algorithm, and concurrent: backed by a ConcurrentHashMap and with a non-blocking eviction thread giving constant-time cacheBlock(,, boolean) and getBlock(, boolean, boolean, boolean) operations.

Contains three levels of block priority to allow for scan-resistance and in-memory families ColumnFamilyDescriptorBuilder.setInMemory(boolean) (An in-memory column family is a column family that should be served from memory if possible): single-access, multiple-accesses, and in-memory priority. A block is added with an in-memory priority flag if ColumnFamilyDescriptor.isInMemory(), otherwise a block becomes a single access priority the first time it is read into this block cache. If a block is accessed again while in cache, it is marked as a multiple access priority block. This delineation of blocks is used to prevent scans from thrashing the cache adding a least-frequently-used element to the eviction algorithm.

Each priority is given its own chunk of the total cache to ensure fairness during eviction. Each priority will retain close to its maximum size, however, if any priority is not using its entire chunk the others are able to grow beyond their chunk size.

Instantiated at a minimum with the total size and average block size. All sizes are in bytes. The block size is not especially important as this cache is fully dynamic in its sizing of blocks. It is only used for pre-allocating data structures and in initial heap estimation of the map.

The detailed constructor defines the sizes for the three priorities (they should total to the maximum size defined). It also sets the levels that trigger and control the eviction thread.

The acceptable size is the cache size level which triggers the eviction process to start. It evicts enough blocks to get the size below the minimum size specified.

Eviction happens in a separate thread and involves a single full-scan of the map. It determines how many bytes must be freed to reach the minimum size, and then while scanning determines the fewest least-recently-used blocks necessary from each of the three priorities (would be 3 times bytes to free). It then uses the priority chunk sizes to evict fairly according to the relative sizes and usage.

Adaptive LRU cache lets speed up performance while we are reading much more data than can fit into BlockCache and it is the cause of a high rate of evictions. This in turn leads to heavy Garbage Collector works. So a lot of blocks put into BlockCache but never read, but spending a lot of CPU resources for cleaning. We could avoid this situation via parameters:

hbase.lru.cache.heavy.eviction.count.limit - set how many times we have to run the eviction process that starts to avoid putting data to BlockCache. By default it is 0 and it meats the feature will start at the beginning. But if we have some times short reading the same data and some times long-term reading - we can divide it by this parameter. For example we know that our short reading used to be about 1 minutes, then we have to set the parameter about 10 and it will enable the feature only for long time massive reading (after ~100 seconds). So when we use short-reading and want all of them in the cache we will have it (except for eviction of course). When we use long-term heavy reading the feature will be enabled after some time and bring better performance.

hbase.lru.cache.heavy.eviction.mb.size.limit - set how many bytes in 10 seconds desirable putting into BlockCache (and evicted from it). The feature will try to reach this value and maintain it. Don't try to set it too small because it leads to premature exit from this mode. For powerful CPUs (about 20-40 physical cores) it could be about 400-500 MB. Average system (~10 cores) 200-300 MB. Some weak systems (2-5 cores) may be good with 50-100 MB. How it works: we set the limit and after each ~10 second calculate how many bytes were freed. Overhead = Freed Bytes Sum (MB) * 100 / Limit (MB) - 100; For example we set the limit = 500 and were evicted 2000 MB. Overhead is: 2000 * 100 / 500 - 100 = 300% The feature is going to reduce a percent caching data blocks and fit evicted bytes closer to 100% (500 MB). Some kind of an auto-scaling. If freed bytes less then the limit we have got negative overhead. For example if were freed 200 MB: 200 * 100 / 500 - 100 = -60% The feature will increase the percent of caching blocks. That leads to fit evicted bytes closer to 100% (500 MB). The current situation we can find out in the log of RegionServer: BlockCache evicted (MB): 0, overhead (%): -100, heavy eviction counter: 0, current caching DataBlock (%): 100 - means no eviction, 100% blocks is caching BlockCache evicted (MB): 2000, overhead (%): 300, heavy eviction counter: 1, current caching DataBlock (%): 97 - means eviction begin, reduce of caching blocks by 3%. It help to tune your system and find out what value is better set. Don't try to reach 0% overhead, it is impossible. Quite good 50-100% overhead, it prevents premature exit from this mode.

hbase.lru.cache.heavy.eviction.overhead.coefficient - set how fast we want to get the result. If we know that our reading is heavy for a long time, we don't want to wait and can increase the coefficient and get good performance sooner. But if we aren't sure we can do it slowly and it could prevent premature exit from this mode. So, when the coefficient is higher we can get better performance when heavy reading is stable. But when reading is changing we can adjust to it and set the coefficient to lower value. For example, we set the coefficient = 0.01. It means the overhead (see above) will be multiplied by 0.01 and the result is the value of reducing percent caching blocks. For example, if the overhead = 300% and the coefficient = 0.01, then percent of caching blocks will reduce by 3%. Similar logic when overhead has got negative value (overshooting). Maybe it is just short-term fluctuation and we will try to stay in this mode. It helps avoid premature exit during short-term fluctuation. Backpressure has simple logic: more overshooting - more caching blocks.

Find more information about improvement:

  • Field Details

  • Constructor Details

    • LruAdaptiveBlockCache

      public LruAdaptiveBlockCache(long maxSize, long blockSize)
      Default constructor. Specify maximum size and expected average block size (approximation is fine).

      All other factors will be calculated based on defaults specified in this class.

      maxSize - maximum size of cache, in bytes
      blockSize - approximate size of each block, in bytes
    • LruAdaptiveBlockCache

      public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread)
      Constructor used for testing. Allows disabling of the eviction thread.
    • LruAdaptiveBlockCache

      public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, org.apache.hadoop.conf.Configuration conf)
    • LruAdaptiveBlockCache

      public LruAdaptiveBlockCache(long maxSize, long blockSize, org.apache.hadoop.conf.Configuration conf)
    • LruAdaptiveBlockCache

      public LruAdaptiveBlockCache(long maxSize, long blockSize, boolean evictionThread, int mapInitialSize, float mapLoadFactor, int mapConcurrencyLevel, float minFactor, float acceptableFactor, float singleFactor, float multiFactor, float memoryFactor, float hardLimitFactor, boolean forceInMemory, long maxBlockSize, int heavyEvictionCountLimit, long heavyEvictionMbSizeLimit, float heavyEvictionOverheadCoefficient)
      Configurable constructor. Use this constructor if not using defaults.
      maxSize - maximum size of this cache, in bytes
      blockSize - expected average size of blocks, in bytes
      evictionThread - whether to run evictions in a bg thread or not
      mapInitialSize - initial size of backing ConcurrentHashMap
      mapLoadFactor - initial load factor of backing ConcurrentHashMap
      mapConcurrencyLevel - initial concurrency factor for backing CHM
      minFactor - percentage of total size that eviction will evict until
      acceptableFactor - percentage of total size that triggers eviction
      singleFactor - percentage of total size for single-access blocks
      multiFactor - percentage of total size for multiple-access blocks
      memoryFactor - percentage of total size for in-memory blocks
      hardLimitFactor - hard capacity limit
      forceInMemory - in-memory hfile's data block has higher priority when evicting
      maxBlockSize - maximum block size for caching
      heavyEvictionCountLimit - when starts AdaptiveLRU algoritm work
      heavyEvictionMbSizeLimit - how many bytes desirable putting into BlockCache
      heavyEvictionOverheadCoefficient - how aggressive AdaptiveLRU will reduce GC
  • Method Details

    • setVictimCache

      public void setVictimCache(BlockCache victimCache)
      Description copied from interface: FirstLevelBlockCache
      Specifies the secondary cache. An entry that is evicted from this cache due to a size constraint will be inserted into the victim cache.
      Specified by:
      setVictimCache in interface FirstLevelBlockCache
      victimCache - the second level cache
    • setMaxSize

      public void setMaxSize(long maxSize)
      Description copied from interface: ResizableBlockCache
      Sets the max heap size that can be used by the BlockCache.
      Specified by:
      setMaxSize in interface ResizableBlockCache
      maxSize - The max heap size.
    • getCacheDataBlockPercent

    • asReferencedHeapBlock

      The block cached in LruAdaptiveBlockCache will always be an heap block: on the one side, the heap access will be more faster then off-heap, the small index block or meta block cached in CombinedBlockCache will benefit a lot. on other side, the LruAdaptiveBlockCache size is always calculated based on the total heap size, if caching an off-heap block in LruAdaptiveBlockCache, the heap size will be messed up. Here we will clone the block into an heap block if it's an off-heap block, otherwise just use the original block. The key point is maintain the refCnt of the block (HBASE-22127):
      1. if cache the cloned heap block, its refCnt is an totally new one, it's easy to handle;
      2. if cache the original heap block, we're sure that it won't be tracked in ByteBuffAllocator's reservoir, if both RPC and LruAdaptiveBlockCache release the block, then it can be garbage collected by JVM, so need a retain here.
      buf - the original block
      an block with an heap memory backend.
    • cacheBlock

      public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory)
      Cache the block with the specified name and buffer.

      It is assumed this will NOT be called on an already cached block. In rare cases (HBASE-8547) this can happen, for which we compare the buffer contents.

      Specified by:
      cacheBlock in interface BlockCache
      cacheKey - block's cache key
      buf - block buffer
      inMemory - if block is in-memory
    • assertCounterSanity

      private static void assertCounterSanity(long mapSize, long counterVal)
      Sanity-checking for parity between actual block cache content and metrics. Intended only for use with TRACE level logging and -ea JVM.
    • cacheBlock

      public void cacheBlock(BlockCacheKey cacheKey, Cacheable buf)
      Cache the block with the specified name and buffer.

      TODO after HBASE-22005, we may cache an block which allocated from off-heap, but our LRU cache sizing is based on heap size, so we should handle this in HBASE-22127. It will introduce an switch whether make the LRU on-heap or not, if so we may need copy the memory to on-heap, otherwise the caching size is based on off-heap.

      Specified by:
      cacheBlock in interface BlockCache
      cacheKey - block's cache key
      buf - block buffer
    • updateSizeMetrics

      private long updateSizeMetrics(LruCachedBlock cb, boolean evict)
      Helper function that updates the local size counter and also updates any per-cf or per-blocktype metrics it can discern from given LruCachedBlock
    • getBlock

      public Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat, boolean updateCacheMetrics)
      Get the buffer of the block with the specified name.
      Specified by:
      getBlock in interface BlockCache
      cacheKey - block's cache key
      caching - true if the caller caches blocks on cache misses
      repeat - Whether this is a repeat lookup for the same block (used to avoid double counting cache misses when doing double-check locking)
      updateCacheMetrics - Whether to update cache metrics or not
      buffer of specified cache key, or null if not in cache
    • containsBlock

      public boolean containsBlock(BlockCacheKey cacheKey)
      Whether the cache contains block with specified cacheKey
      Specified by:
      containsBlock in interface FirstLevelBlockCache
      cacheKey - cache key for the block
      true if contains the block
    • evictBlock

      public boolean evictBlock(BlockCacheKey cacheKey)
      Description copied from interface: BlockCache
      Evict block from cache.
      Specified by:
      evictBlock in interface BlockCache
      cacheKey - Block to evict
      true if block existed and was evicted, false if not
    • evictBlocksByHfileName

      public int evictBlocksByHfileName(String hfileName)
      Evicts all blocks for a specific HFile. This is an expensive operation implemented as a linear-time search through all blocks in the cache. Ideally this should be a search in a log-access-time map.

      This is used for evict-on-close to remove all blocks of a specific HFile.

      Specified by:
      evictBlocksByHfileName in interface BlockCache
      the number of blocks evicted
    • evictBlock

      protected long evictBlock(LruCachedBlock block, boolean evictedByEvictionProcess)
      Evict the block, and it will be cached by the victim handler if exists && block may be read again later
      evictedByEvictionProcess - true if the given block is evicted by EvictionThread
      the heap size of evicted block
    • runEviction

      private void runEviction()
      Multi-threaded call to run the eviction process.
    • isEvictionInProgress

    • getOverhead

      long getOverhead()
    • evict

      long evict()
      Eviction method. Evict items in order of use, allowing delete items which haven't been used for the longest amount of time.
      how many bytes were freed
    • toString

      public String toString()
      toString in class Object
    • getMaxSize

      public long getMaxSize()
      Get the maximum size of this cache.
      Specified by:
      getMaxSize in interface BlockCache
      max size in bytes
    • getCurrentSize

      public long getCurrentSize()
      Description copied from interface: BlockCache
      Returns the occupied size of the block cache, in bytes.
      Specified by:
      getCurrentSize in interface BlockCache
      occupied space in cache, in bytes
    • getCurrentDataSize

      public long getCurrentDataSize()
      Description copied from interface: BlockCache
      Returns the occupied size of data blocks, in bytes.
      Specified by:
      getCurrentDataSize in interface BlockCache
      occupied space in cache, in bytes
    • getFreeSize

      public long getFreeSize()
      Description copied from interface: BlockCache
      Returns the free size of the block cache, in bytes.
      Specified by:
      getFreeSize in interface BlockCache
      free space in cache, in bytes
    • size

      public long size()
      Description copied from interface: BlockCache
      Returns the total size of the block cache, in bytes.
      Specified by:
      size in interface BlockCache
      size of cache, in bytes
    • getBlockCount

      public long getBlockCount()
      Description copied from interface: BlockCache
      Returns the number of blocks currently cached in the block cache.
      Specified by:
      getBlockCount in interface BlockCache
      number of blocks in the cache
    • getDataBlockCount

      public long getDataBlockCount()
      Description copied from interface: BlockCache
      Returns the number of data blocks currently cached in the block cache.
      Specified by:
      getDataBlockCount in interface BlockCache
      number of blocks in the cache
    • getEvictionThread

    • logStats

      public void logStats()
    • getStats

      public CacheStats getStats()
      Get counter statistics for this cache.

      Includes: total accesses, hits, misses, evicted blocks, and runs of the eviction processes.

      Specified by:
      getStats in interface BlockCache
    • heapSize

      public long heapSize()
      Description copied from interface: HeapSize
      Return the approximate 'exclusive deep size' of implementing object. Includes count of payload and hosting object sizings.
      Specified by:
      heapSize in interface HeapSize
    • calculateOverhead

      private static long calculateOverhead(long maxSize, long blockSize, int concurrency)
    • iterator

      Description copied from interface: BlockCache
      Returns Iterator over the blocks in the cache.
      Specified by:
      iterator in interface BlockCache
      Specified by:
      iterator in interface Iterable<CachedBlock>
    • acceptableSize

    • minSize

      private long minSize()
    • singleSize

      private long singleSize()
    • multiSize

      private long multiSize()
    • memorySize

      private long memorySize()
    • shutdown

      public void shutdown()
      Description copied from interface: BlockCache
      Shutdown the cache.
      Specified by:
      shutdown in interface BlockCache
    • clearCache

      public void clearCache()
      Clears the cache. Used in tests.
    • getEncodingCountsForTest

    • getMapForTests

    • getBlockCaches

      Description copied from interface: BlockCache
      Returns The list of sub blockcaches that make up this one; returns null if no sub caches.
      Specified by:
      getBlockCaches in interface BlockCache