Class SerialReplicationChecker
Helper class to determine whether we can push a given WAL entry without breaking the replication
order. An instance is meant to be created per ReplicationSourceWALReader, so the class is not thread safe.
We record all the open sequence numbers for a region in a special family in meta, called 'rep_barrier', so there is a sequence of open sequence numbers (b1, b2, b3, ...). We call [bn, bn+1) a range. A region is always on the same region server (RS) within a range, because a new open sequence number is recorded every time the region is opened.
When a region is split or merged, we also record the parent(s) of the generated region(s) in the special family in meta. In addition, we write an extra 'open sequence number' for each parent region, which is the max sequence id of that region plus one.
For each peer, we record the last pushed sequence id for each region. It is managed by the replication storage.
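As an illustration of the range notion described above, the following sketch locates a sequence id within a sorted array of barriers using Arrays.binarySearch. The class name BarrierRanges and the method rangeIndex are hypothetical and are not part of HBase. The sketch assumes the barriers are sorted in ascending order without duplicates, and it numbers ranges from 0.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not an HBase class.
final class BarrierRanges {

  /**
   * Returns the index i of the range [barriers[i], barriers[i + 1]) that contains seqId,
   * or -1 if seqId is before the first barrier. The last barrier opens an unbounded range.
   */
  static int rangeIndex(long[] barriers, long seqId) {
    int pos = Arrays.binarySearch(barriers, seqId);
    if (pos >= 0) {
      // Exact hit: ranges are left closed, so seqId is the first id of range pos.
      return pos;
    }
    // Miss: binarySearch returns -(insertionPoint) - 1. The containing range starts
    // at the barrier immediately before the insertion point.
    int insertionPoint = -pos - 1;
    return insertionPoint - 1;
  }

  public static void main(String[] args) {
    long[] barriers = {10, 50, 120};
    System.out.println(rangeIndex(barriers, 5));   // -1 (before the first barrier)
    System.out.println(rangeIndex(barriers, 10));  // 0
    System.out.println(rangeIndex(barriers, 49));  // 0
    System.out.println(rangeIndex(barriers, 50));  // 1
    System.out.println(rangeIndex(barriers, 200)); // 2 (last, open-ended range)
  }
}
```

Because ranges are left closed and right open, an exact hit on a barrier belongs to the range that starts at that barrier, not to the one that ends there.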
The algorithm works like this:
- Locate the sequence id we want to push in the barriers
- If it is before the first barrier, we are safe to push. This usually happens because serial replication was enabled for the table after the table had been created and data had already been written to it.
- In general, if the previous range is finished, then we are safe to push. The way to determine
whether a range is finished is straightforward: check whether the last pushed sequence id is equal
to the end barrier of the range minus 1. There are several exceptions:
  - If it is in the first range, we need to check whether there are parent regions. If so, we need to make sure that all data for the parent regions has been pushed.
  - If it is in the last range, we need to check the region state. If the state is OPENING, we are not safe to push. Before the region server calls reportRIT to the master, which updates the open sequence number in the meta table, it first writes an open region event marker to the WAL, and the sequence id of that marker is greater than the newest open sequence number (which has not been written to the meta table yet, so we cannot know it). In this scenario, the WAL entry for the open region event marker actually belongs to the range after the 'last' range, so we are not safe to push it. Otherwise the last pushed sequence id would be updated to this value and we would conclude that the previous range has finished, which is not true.
  - Notice that the above two exceptions do not conflict, since the first range can also be the last range if we only have one range.
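The decision steps above can be sketched as a single pure function. This is a simplified illustration, not the HBase implementation: the class SerialPushSketch is hypothetical, and the meta lookups are replaced by plain parameters (parentsFinished stands in for the parent region check, regionOpening for the region state check, lastPushedSeqId for the value kept by the replication storage). The sketch also treats a range as finished when the last pushed sequence id is greater than or equal to the end barrier minus 1, so that an id already inside the current range also counts; that relaxation is an assumption of this sketch.

```java
import java.util.Arrays;

// Hypothetical sketch for illustration only; not an HBase class.
final class SerialPushSketch {

  /** A range is finished once everything below its end barrier has been pushed. */
  static boolean isRangeFinished(long endBarrier, long lastPushedSeqId) {
    return lastPushedSeqId >= endBarrier - 1;
  }

  static boolean canPush(long seqId, long[] barriers, long lastPushedSeqId,
      boolean parentsFinished, boolean regionOpening) {
    int pos = Arrays.binarySearch(barriers, seqId);
    // 0-based index of the range [barriers[index], barriers[index + 1]) containing seqId.
    int index = pos >= 0 ? pos : -pos - 2;
    if (index < 0) {
      // Before the first barrier: always safe to push.
      return true;
    }
    if (index == 0) {
      // First range: all data of the parent regions must have been pushed.
      if (!parentsFinished) {
        return false;
      }
    } else if (!isRangeFinished(barriers[index], lastPushedSeqId)) {
      // The end barrier of the previous range is the start barrier of this one.
      return false;
    }
    // Last range: not safe while the region is still OPENING.
    boolean lastRange = index == barriers.length - 1;
    return !(lastRange && regionOpening);
  }

  public static void main(String[] args) {
    long[] barriers = {10, 50, 120};
    System.out.println(canPush(60, barriers, 49, true, false));   // true
    System.out.println(canPush(60, barriers, 30, true, false));   // false
    System.out.println(canPush(130, barriers, 119, true, true));  // false
  }
}
```

With a single barrier, the first range is also the last range, so both the parent check and the OPENING check apply to the same entry.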
For performance reasons, we do not want to check meta for every WAL entry, so we introduce two in-memory maps. The idea is simple:
- If a range can be pushed, put its end barrier into the canPushUnder map.
- Before accessing meta, first check the sequence id stored in the canPushUnder map. If the sequence id of the WAL entry is less than the one stored in the canPushUnder map, we are safe to push.
- When an entry can be pushed, put its sequence id into the pushed map.
- Check whether the sequence id of the WAL entry equals the one stored in the pushed map plus one. If so, we are safe to push, and we also update the pushed map with the sequence id of the WAL entry.
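The two-map fast path can be sketched as follows. This is a minimal illustration under stated assumptions: the class PushCacheSketch and the method canPushFromCache are hypothetical, plain HashMap instances stand in for the cache types used by the real fields, and the index passed to recordCanPush is the 0-based index of the range containing the entry, which may differ from the convention of the real method of the same name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch for illustration only; not an HBase class.
final class PushCacheSketch {
  // Encoded region name -> end barrier of a range already known to be pushable.
  private final Map<String, Long> canPushUnder = new HashMap<>();
  // Encoded region name -> sequence id of the last entry found pushable.
  private final Map<String, Long> pushed = new HashMap<>();

  /** Fast path: returns true if the maps alone show that the entry is safe to push. */
  boolean canPushFromCache(String region, long seqId) {
    Long upper = canPushUnder.get(region);
    if (upper != null && seqId < upper) {
      return true;
    }
    Long last = pushed.get(region);
    if (last != null && seqId == last + 1) {
      // Continuous sequence ids: safe to push, remember the new position.
      pushed.put(region, seqId);
      return true;
    }
    // Unknown: the caller has to fall back to checking meta.
    return false;
  }

  /** Called after the slow path (meta check) has decided that the entry can be pushed. */
  void recordCanPush(String region, long seqId, long[] barriers, int index) {
    if (index + 1 < barriers.length) {
      // The range has an end barrier: everything below it can be pushed.
      canPushUnder.put(region, barriers[index + 1]);
    }
    pushed.put(region, seqId);
  }

  public static void main(String[] args) {
    PushCacheSketch cache = new PushCacheSketch();
    long[] barriers = {10, 50, 120};
    System.out.println(cache.canPushFromCache("r1", 12)); // false, nothing cached yet
    cache.recordCanPush("r1", 12, barriers, 0);
    System.out.println(cache.canPushFromCache("r1", 30)); // true, 30 < 50
    System.out.println(cache.canPushFromCache("r1", 50)); // false, needs a meta check
  }
}
```

The canPushUnder check covers closed ranges whose end barrier is known, while the pushed check covers the open-ended last range, where only continuity of sequence ids can be verified without consulting meta.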
Field Summary
Fields (modifier and type, followed by the field name):
- private final Connection conn
- private static final org.slf4j.Logger LOG
- private final String peerId
- private final org.apache.hbase.thirdparty.com.google.common.cache.LoadingCache<String,org.apache.commons.lang3.mutable.MutableLong> pushed
- static final long REPLICATION_SERIALLY_WAITING_DEFAULT
- static final String REPLICATION_SERIALLY_WAITING_KEY
- private final ReplicationQueueStorage storage
- private final long waitTimeMs
Constructor Summary
Constructors:
- SerialReplicationChecker(org.apache.hadoop.conf.Configuration conf, ReplicationSource source)
Method Summary
Methods (modifier and type, followed by the method):
- private boolean canPush
- boolean canPush
- private boolean isLastRangeAndOpening(MetaTableAccessor.ReplicationBarrierResult barrierResult, int index)
- private boolean isParentFinished(byte[] regionName)
- private boolean isRangeFinished(long endBarrier, String encodedRegionName)
- private void recordCanPush(String encodedNameAsString, long seqId, long[] barriers, int index)
- void waitUntilCanPush(WAL.Entry entry, Cell firstCellInEdit)
Field Details
- LOG
- REPLICATION_SERIALLY_WAITING_KEY
- REPLICATION_SERIALLY_WAITING_DEFAULT
- peerId
- storage
- conn
- waitTimeMs
- pushed
- canPushUnder
Constructor Details
- SerialReplicationChecker
  public SerialReplicationChecker(org.apache.hadoop.conf.Configuration conf, ReplicationSource source)
Method Details
- isRangeFinished
  Throws: IOException
- isParentFinished
  Throws: IOException
- isLastRangeAndOpening
  private boolean isLastRangeAndOpening(MetaTableAccessor.ReplicationBarrierResult barrierResult, int index)
- recordCanPush
- canPush
  Throws: IOException
- canPush
  Throws: IOException
- waitUntilCanPush
  public void waitUntilCanPush(WAL.Entry entry, Cell firstCellInEdit) throws IOException, InterruptedException
  Throws: IOException, InterruptedException