SequenceIdAccounting (Apache HBase 2.4.0 API)

java.lang.Object
- org.apache.hadoop.hbase.regionserver.wal.SequenceIdAccounting

```
@InterfaceAudience.Private
class SequenceIdAccounting
extends Object
```
Accounting of sequence ids per region and then by column family. To keep the accounting current, call startCacheFlush and then completeCacheFlush or abortCacheFlush so this instance can track the state of sequence id persistence. Also call update on every append.
The implementation assumes that every encodedRegionName passed in was obtained from RegionInfo.getEncodedNameAsBytes(), so it is safe to use as a hash key. For family names, ImmutableByteArray is used as the key. A hash-based map is much faster than an RBTree or CSLM, and this code is on the critical write path. See HBASE-16278 for more details.
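The difference between the two key choices can be sketched as follows. This is an illustration only: `BytesKey` is a hypothetical stand-in for ImmutableByteArray, not the HBase class, and the class and method names are invented for the example. A raw byte[] key is matched by reference, so it only works when callers always pass the same array instance; a wrapper with content-based equals and hashCode does not have that restriction.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class ByteKeyDemo {
  /** Hypothetical stand-in for ImmutableByteArray: equality is based on array content. */
  static final class BytesKey {
    private final byte[] bytes;

    BytesKey(byte[] bytes) {
      this.bytes = bytes.clone();
    }

    @Override
    public boolean equals(Object o) {
      return o instanceof BytesKey && Arrays.equals(bytes, ((BytesKey) o).bytes);
    }

    @Override
    public int hashCode() {
      return Arrays.hashCode(bytes);
    }
  }

  /** A raw byte[] key uses identity equals/hashCode, so an equal-content copy is not found. */
  static boolean rawLookupFindsCopy() {
    Map<byte[], Long> map = new HashMap<>();
    byte[] region = {1, 2, 3};
    map.put(region, 10L);
    return map.containsKey(region.clone());
  }

  /** A wrapped key compares content, so an equal-content copy is found. */
  static boolean wrappedLookupFindsCopy() {
    Map<BytesKey, Long> map = new HashMap<>();
    byte[] family = {'c', 'f'};
    map.put(new BytesKey(family), 10L);
    return map.containsKey(new BytesKey(family.clone()));
  }

  public static void main(String[] args) {
    System.out.println(rawLookupFindsCopy());     // false
    System.out.println(wrappedLookupFindsCopy()); // true
  }
}
```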

Field Summary

Fields
Modifier and Type	Field and Description
`private Map<byte[],Map<ImmutableByteArray,Long>>`	`flushingSequenceIds` Map of encoded region names and family names to their lowest or OLDEST sequence/edit id currently being flushed out to hfiles.
`private Map<byte[],Long>`	`highestSequenceIds` Map of region encoded names to the latest/highest region sequence id.
`private static org.slf4j.Logger`	`LOG`
`private ConcurrentMap<byte[],ConcurrentMap<ImmutableByteArray,Long>>`	`lowestUnflushedSequenceIds` Map of encoded region names and family names to their OLDEST -- i.e. their first, the longest-lived, their 'earliest', the 'lowest' -- sequence id.
`private Object`	`tieLock` This lock ties all operations on `flushingSequenceIds` and `lowestUnflushedSequenceIds` Maps.

Constructor Summary

Constructors
Constructor and Description
`SequenceIdAccounting()`

Method Summary

Methods
Modifier and Type	Method and Description
`(package private) void`	`abortCacheFlush(byte[] encodedRegionName)`
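`(package private) boolean`	`areAllLower(Map<byte[],Long> sequenceids)` See if passed `sequenceids` are lower -- i.e. earlier -- than any outstanding sequenceids, sequenceids we are holding on to in this accounting instance.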
`(package private) void`	`completeCacheFlush(byte[] encodedRegionName, long maxFlushedSeqId)`
`(package private) Map<byte[],List<byte[]>>`	`findLower(Map<byte[],Long> sequenceids)` Iterates over the given Map and compares sequence ids with corresponding entries in `lowestUnflushedSequenceIds`.
`private <T extends Map<?,Long>> Map<byte[],Long>`	`flattenToLowestSequenceId(Map<byte[],T> src)`
`(package private) long`	`getLowestSequenceId(byte[] encodedRegionName)` Returns the lowest unflushed sequence id for the region.
`(package private) long`	`getLowestSequenceId(byte[] encodedRegionName, byte[] familyName)`
`private static long`	`getLowestSequenceId(Map<?,Long> sequenceids)`
`(package private) ConcurrentMap<ImmutableByteArray,Long>`	`getOrCreateLowestSequenceIds(byte[] encodedRegionName)`
`(package private) void`	`onRegionClose(byte[] encodedRegionName)` Clear all the records of the given region as it is going to be closed.
`(package private) Map<byte[],Long>`	`resetHighest()` Reset the accounting of highest sequenceid by regionname.
`(package private) Long`	`startCacheFlush(byte[] encodedRegionName, Map<byte[],Long> familyToSeq)`
`(package private) Long`	`startCacheFlush(byte[] encodedRegionName, Set<byte[]> families)`
`(package private) void`	`update(byte[] encodedRegionName, Set<byte[]> families, long sequenceid, boolean lowest)` We've been passed a new sequenceid for the region.
`(package private) void`	`updateStore(byte[] encodedRegionName, byte[] familyName, Long sequenceId, boolean onlyIfGreater)` Update the store sequence id, e.g., upon executing in-memory compaction

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LOG
```
private static final org.slf4j.Logger LOG
```
  - tieLock
```
private final Object tieLock
```
    This lock ties all operations on the flushingSequenceIds and lowestUnflushedSequenceIds Maps. lowestUnflushedSequenceIds has the lowest outstanding sequence ids EXCEPT when flushing. When we flush, the current lowest set for the region/column family is moved (atomically, because of this lock) to flushingSequenceIds.
    The two Maps are tied by this locking object EXCEPT when we go to update the lowest entry; see lowestUnflushedSequenceIds. That update is a putIfAbsent call on lowestUnflushedSequenceIds: we add the lowest sequence id only if there is no entry for the current column family. There will be no entry only if we just came up OR we have moved aside the current set of lowest sequence ids because they are being flushed (by putting them into flushingSequenceIds). This is how we pick up the next 'lowest' sequence id per region per column family, which is used to figure out what is in the next flush.
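    The interplay described above can be sketched as follows. This is a simplified illustration, not the HBase implementation: it uses String keys instead of byte[] and ImmutableByteArray, the class and method names are hypothetical, and it ignores the case of an append racing with the move.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class TieLockSketch {
  private final Object tieLock = new Object();
  private final ConcurrentMap<String, ConcurrentMap<String, Long>> lowestUnflushed =
      new ConcurrentHashMap<>();
  private final Map<String, Map<String, Long>> flushing = new HashMap<>();

  /** Append path: no tieLock; the id is recorded only if the family has no entry yet. */
  void append(String region, String family, long sequenceId) {
    lowestUnflushed
        .computeIfAbsent(region, r -> new ConcurrentHashMap<>())
        .putIfAbsent(family, sequenceId);
  }

  /** Flush start: move the region's current lowest ids aside while holding tieLock. */
  void startFlush(String region) {
    synchronized (tieLock) {
      ConcurrentMap<String, Long> current = lowestUnflushed.remove(region);
      if (current != null) {
        flushing.put(region, current);
      }
    }
  }

  /** Returns the recorded lowest unflushed id, or null when there is no entry. */
  Long getLowestUnflushed(String region, String family) {
    Map<String, Long> families = lowestUnflushed.get(region);
    return families == null ? null : families.get(family);
  }

  /** Returns the id currently being flushed, or null when there is no entry. */
  Long getFlushing(String region, String family) {
    synchronized (tieLock) {
      Map<String, Long> families = flushing.get(region);
      return families == null ? null : families.get(family);
    }
  }
}
```

    In this sketch, a second append for the same family leaves the first id in place, and the first append after startFlush becomes the new lowest id because the earlier entry has been moved aside.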
  - lowestUnflushedSequenceIds
```
private final ConcurrentMap<byte[],ConcurrentMap<ImmutableByteArray,Long>> lowestUnflushedSequenceIds
```
    Map of encoded region names and family names to their OLDEST -- i.e. their first, the longest-lived, their 'earliest', the 'lowest' -- sequence id.
    When we flush, the current lowest sequence ids are cleared from here and added to flushingSequenceIds. The next append that comes in is then added to lowestUnflushedSequenceIds as the next lowest sequence id.
    If a flush fails, the server is currently aborted, so there is no need to restore the previous sequence ids.
    These need to be concurrent Maps because we use putIfAbsent when updating the oldest entry.
  - flushingSequenceIds
```
private final Map<byte[],Map<ImmutableByteArray,Long>> flushingSequenceIds
```
    Map of encoded region names and family names to their lowest or OLDEST sequence/edit id currently being flushed out to hfiles. Entries are moved here from lowestUnflushedSequenceIds while the lock tieLock is held (so movement between the Maps is atomic).
  - highestSequenceIds
```
private Map<byte[],Long> highestSequenceIds
```
    Map of region encoded names to the latest/highest region sequence id. Updated on each call to append.
    
    This map uses byte[] as the key, and uses reference equality. It works in our use case as we use RegionInfo.getEncodedNameAsBytes() as keys. For a given region, it always returns the same array.
- Constructor Detail
  - SequenceIdAccounting
```
SequenceIdAccounting()
```
- Method Detail
  - getLowestSequenceId
```
long getLowestSequenceId(byte[] encodedRegionName)
```
    Returns the lowest unflushed sequence id for the region.
    
    Returns:
    
    Lowest outstanding unflushed sequenceid for encodedRegionName. Will return HConstants.NO_SEQNUM when none.
  - getLowestSequenceId
```
long getLowestSequenceId(byte[] encodedRegionName,
                         byte[] familyName)
```
    Returns:
    
    Lowest outstanding unflushed sequenceid for encodedRegionName and familyName. Returned sequenceid may be for an edit currently being flushed.
  - resetHighest
```
Map<byte[],Long> resetHighest()
```
    Reset the accounting of highest sequenceid by regionname.
    
    Returns:
    
    The previous accounting Map of regions to the last sequence id written into each.
  - update
```
void update(byte[] encodedRegionName,
            Set<byte[]> families,
            long sequenceid,
            boolean lowest)
```
    We've been passed a new sequenceid for the region. Set it as the highest seen for this region and, if we are to record oldest (lowest) sequenceids, save it as the oldest seen if nothing older is currently recorded.
    
    Parameters:
    
    encodedRegionName -
    
    families -
    
    sequenceid -
    
    lowest - Whether to keep running account of oldest sequence id.
  - onRegionClose
```
void onRegionClose(byte[] encodedRegionName)
```
    Clear all the records of the given region as it is going to be closed.
    We will call this once we get the region close marker. We need this because, if we use Durability.ASYNC_WAL, we may still get some ongoing WAL entries that have not been processed yet after calling startCacheFlush. This leads to orphan records in lowestUnflushedSequenceIds, which in turn causes too many WAL files.
    See HBASE-23157 for more details.
  - updateStore
```
void updateStore(byte[] encodedRegionName,
                 byte[] familyName,
                 Long sequenceId,
                 boolean onlyIfGreater)
```
    Update the store sequence id, e.g., upon executing in-memory compaction.
  - getOrCreateLowestSequenceIds
```
ConcurrentMap<ImmutableByteArray,Long> getOrCreateLowestSequenceIds(byte[] encodedRegionName)
```
  - getLowestSequenceId
```
private static long getLowestSequenceId(Map<?,Long> sequenceids)
```
    Parameters:
    
    sequenceids - Map to search for lowest value.
    
    Returns:
    
    Lowest value found in sequenceids.
  - flattenToLowestSequenceId
```
private <T extends Map<?,Long>> Map<byte[],Long> flattenToLowestSequenceId(Map<byte[],T> src)
```
    Parameters:
    
    src -
    
    Returns:
    
    New Map that has the same keys as src but, instead of a Map for each value, has the smallest sequence id found in that Map.
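    A minimal sketch of this flattening, under stated assumptions: `NO_SEQNUM` is a local stand-in for HConstants.NO_SEQNUM (assumed here to be -1), the class and method names are hypothetical, and entries whose inner Map is empty are skipped, which is a choice made for this sketch rather than something the description above states.

```java
import java.util.HashMap;
import java.util.Map;

final class FlattenSketch {
  /** Local stand-in for HConstants.NO_SEQNUM (assumed value). */
  static final long NO_SEQNUM = -1L;

  /** Returns the smallest value in the map, or NO_SEQNUM when the map is empty. */
  static long lowestOf(Map<?, Long> sequenceIds) {
    long lowest = NO_SEQNUM;
    for (Long id : sequenceIds.values()) {
      if (lowest == NO_SEQNUM || id < lowest) {
        lowest = id;
      }
    }
    return lowest;
  }

  /** Keeps the outer keys and replaces each inner map by its smallest sequence id. */
  static <K, T extends Map<?, Long>> Map<K, Long> flatten(Map<K, T> src) {
    Map<K, Long> out = new HashMap<>();
    for (Map.Entry<K, T> entry : src.entrySet()) {
      long lowest = lowestOf(entry.getValue());
      if (lowest != NO_SEQNUM) { // sketch choice: drop regions with no recorded ids
        out.put(entry.getKey(), lowest);
      }
    }
    return out;
  }
}
```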
  - startCacheFlush
```
Long startCacheFlush(byte[] encodedRegionName,
                     Set<byte[]> families)
```
    Parameters:
    
    encodedRegionName - Region to flush.
    
    families - Families to flush. May be a subset of all families in the region.
    
    Returns:
    
    Returns HConstants.NO_SEQNUM if we are flushing the whole region OR if we are flushing a subset of all families but there are no edits in the families not being flushed; in other words, this is effectively the same as a flush of the whole region even though we were passed a subset of families. Otherwise, it returns the sequence id of the oldest/lowest outstanding edit.
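    The return value described above can be illustrated with a small model. This is a sketch under assumptions, not the HBase code: it uses String keys and plain HashMaps guarded by a single lock, `NO_SEQNUM` is a local stand-in for HConstants.NO_SEQNUM (assumed here to be -1), and `record` is a hypothetical helper that plays the role of an append.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class StartFlushSketch {
  /** Local stand-in for HConstants.NO_SEQNUM (assumed value). */
  static final long NO_SEQNUM = -1L;

  private final Object tieLock = new Object();
  private final Map<String, Map<String, Long>> lowestUnflushed = new HashMap<>();
  private final Map<String, Map<String, Long>> flushing = new HashMap<>();

  /** Hypothetical append: remember the first (lowest) id seen per family. */
  void record(String region, String family, long sequenceId) {
    synchronized (tieLock) {
      lowestUnflushed.computeIfAbsent(region, r -> new HashMap<>()).putIfAbsent(family, sequenceId);
    }
  }

  /**
   * Moves the given families to the flushing map. Returns NO_SEQNUM when nothing
   * unflushed remains for the region, else the lowest id among the remaining families.
   */
  long startFlush(String region, Set<String> families) {
    synchronized (tieLock) {
      Map<String, Long> unflushed = lowestUnflushed.get(region);
      if (unflushed == null) {
        return NO_SEQNUM;
      }
      Map<String, Long> moved = new HashMap<>();
      for (String family : families) {
        Long id = unflushed.remove(family);
        if (id != null) {
          moved.put(family, id);
        }
      }
      if (!moved.isEmpty()) {
        flushing.put(region, moved);
      }
      if (unflushed.isEmpty()) {
        lowestUnflushed.remove(region);
        return NO_SEQNUM; // effectively a flush of the whole region
      }
      return Collections.min(unflushed.values());
    }
  }
}
```

    With families a (lowest id 3) and b (lowest id 7) recorded, flushing only a returns 7 in this model, because b still holds an unflushed edit; flushing both returns NO_SEQNUM.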
  - startCacheFlush
```
Long startCacheFlush(byte[] encodedRegionName,
                     Map<byte[],Long> familyToSeq)
```
  - completeCacheFlush
```
void completeCacheFlush(byte[] encodedRegionName,
                        long maxFlushedSeqId)
```
  - abortCacheFlush
```
void abortCacheFlush(byte[] encodedRegionName)
```
  - areAllLower
```
boolean areAllLower(Map<byte[],Long> sequenceids)
```
    See if passed sequenceids are lower -- i.e. earlier -- than any outstanding sequenceids, that is, the sequenceids we are holding on to in this accounting instance.
    
    Parameters:
    
    sequenceids - Keyed by encoded region name. Cannot be null (doesn't make sense for it to be null).
    
    Returns:
    
    true if all passed sequenceids are lower (older) than the outstanding sequenceids held in this instance.
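    The comparison can be sketched as below. This is an assumption-laden illustration rather than the HBase code: the outstanding ids are passed in as an already flattened Map with String keys, `NO_SEQNUM` is a local stand-in for HConstants.NO_SEQNUM (assumed here to be -1), and a region with no outstanding id is treated as not blocking the result.

```java
import java.util.Map;

final class AreAllLowerSketch {
  /** Local stand-in for HConstants.NO_SEQNUM (assumed value). */
  static final long NO_SEQNUM = -1L;

  /**
   * Returns true only if every passed id is strictly lower than the lowest
   * outstanding id recorded for the same region.
   */
  static boolean areAllLower(Map<String, Long> passed, Map<String, Long> lowestOutstanding) {
    for (Map.Entry<String, Long> entry : passed.entrySet()) {
      long outstanding = lowestOutstanding.getOrDefault(entry.getKey(), NO_SEQNUM);
      if (outstanding != NO_SEQNUM && outstanding <= entry.getValue()) {
        return false; // an outstanding edit is as old as, or older than, the passed id
      }
    }
    return true;
  }
}
```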
  - findLower
```
Map<byte[],List<byte[]>> findLower(Map<byte[],Long> sequenceids)
```
    Iterates over the given Map and compares sequence ids with corresponding entries in lowestUnflushedSequenceIds. If a region in lowestUnflushedSequenceIds has a sequence id less than that passed in sequenceids then return it.
    
    Parameters:
    
    sequenceids - Sequenceids keyed by encoded region name.
    
    Returns:
    
    Stores of regions found in this instance with sequence ids less than those passed in.
