Class SyncReplicationReplayWALManager

java.lang.Object
org.apache.hadoop.hbase.master.replication.SyncReplicationReplayWALManager

@Private public class SyncReplicationReplayWALManager extends Object
The manager for replaying remote WALs.

First, it balances the replay work across all the region servers. We record the region servers that are already being used to replay WALs and do not send new replay work to them until the previous replay work is done, at which point we remove the region server from the used worker set. See the comment for UsedReplayWorkersForPeer for more details.
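The used worker set described above can be sketched roughly as follows. This is a minimal illustration, not the actual UsedReplayWorkersForPeer code: the class name UsedWorkerTracker, its methods acquire and release, and the use of plain strings for server names are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of a used worker set; not the HBase implementation. */
class UsedWorkerTracker {
  // Region servers that currently hold unfinished replay work.
  private final Set<String> usedWorkers = new HashSet<>();

  /**
   * Picks the first online server that is not already replaying a WAL and
   * marks it as used. Returns empty if every server is busy.
   */
  synchronized Optional<String> acquire(List<String> onlineServers) {
    for (String server : onlineServers) {
      // Set.add returns false if the server is already in the used set.
      if (usedWorkers.add(server)) {
        return Optional.of(server);
      }
    }
    return Optional.empty();
  }

  /** Removes the server from the used set once its replay work is done. */
  synchronized void release(String server) {
    usedWorkers.remove(server);
  }
}
```

A caller would acquire a worker before dispatching replay work and release it when that work completes, so a server never receives a second replay task while the first is still running.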

Second, the logic for managing the remote WAL directory is kept here. Before replaying the WALs, we rename the remote WAL directory to the 'replay' directory, see renameToPeerReplayWALDir(String). This prevents further writes of remote WALs, which is very important for keeping consistency. Then we start replaying all the WALs. Once a WAL has been replayed, we truncate the file, so that if a crash happens we do not need to replay all the WALs again, see finishReplayWAL(String) and isReplayWALFinished(String). After replaying all the WALs, we rename the 'replay' directory to the 'snapshot' directory. This directory keeps the names of all the WALs that have been replayed; the files themselves are empty, since all of them should have been truncated. When we later transition the original ACTIVE cluster to STANDBY and region servers crash, we check the WALs in this directory to determine whether a WAL should be split and replayed or not. You can see the code in SplitLogWorker for more details.
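The directory life cycle described above (rename to 'replay', truncate each replayed file, rename to 'snapshot') can be illustrated with a small sketch. It uses java.nio.file on a local file system as a stand-in for the Hadoop file system that HBase actually works with. The class name ReplayDirSketch, the "-replay" and "-snapshot" suffixes, and the method signatures are assumptions for the example, not the real SyncReplicationReplayWALManager API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of the replay directory life cycle; not the HBase code. */
class ReplayDirSketch {
  // Assumed suffixes, chosen for illustration only.
  static final String REPLAY_SUFFIX = "-replay";
  static final String SNAPSHOT_SUFFIX = "-snapshot";

  /** Renames the remote WAL directory so that no new WALs can be written to it. */
  static Path renameToReplayDir(Path remoteWalDir) throws IOException {
    Path replayDir = remoteWalDir.resolveSibling(remoteWalDir.getFileName() + REPLAY_SUFFIX);
    return Files.move(remoteWalDir, replayDir);
  }

  /** Marks a WAL as replayed by truncating it to zero length. */
  static void finishReplayWal(Path wal) throws IOException {
    try (FileChannel channel = FileChannel.open(wal, StandardOpenOption.WRITE)) {
      channel.truncate(0);
    }
  }

  /** A WAL counts as replayed once it is empty. */
  static boolean isReplayWalFinished(Path wal) throws IOException {
    return Files.size(wal) == 0;
  }

  /** After all WALs are replayed, keeps only their names in a snapshot directory. */
  static Path renameToSnapshotDir(Path replayDir) throws IOException {
    String name = replayDir.getFileName().toString();
    String base = name.substring(0, name.length() - REPLAY_SUFFIX.length());
    return Files.move(replayDir, replayDir.resolveSibling(base + SNAPSHOT_SUFFIX));
  }
}
```

Using the file length as the "finished" marker means a restart after a crash only has to list the 'replay' directory and skip the empty files, with no separate progress record to maintain.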