Class RegionServerSnapshotManager
java.lang.Object
org.apache.hadoop.hbase.procedure.ProcedureManager
org.apache.hadoop.hbase.procedure.RegionServerProcedureManager
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager
@LimitedPrivate("Configuration")
@Unstable
public class RegionServerSnapshotManager
extends RegionServerProcedureManager
This manager class handles snapshot work for an HRegionServer.
This provides the mechanism necessary to kick off an online-snapshot-specific Subprocedure
that is responsible for the regions being served by this region server. If any failure occurs
in the subprocedure, the RegionServerSnapshotManager's subprocedure handler, ProcedureMember,
notifies the master's ProcedureCoordinator to abort all the others.
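The failure path described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `Coordinator` and `Member` are hypothetical stand-ins for ProcedureCoordinator and ProcedureMember, and the direct method calls replace whatever cluster-wide messaging the real classes use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the abort path; none of these types are HBase classes.
class AbortPropagationSketch {

  /** Stand-in for the master's ProcedureCoordinator. */
  static class Coordinator {
    private final List<Member> members = new ArrayList<>();

    void register(Member member) {
      members.add(member);
    }

    /** Called by a member whose subprocedure failed; aborts every other member. */
    void memberFailed(Member failed, Exception cause) {
      for (Member member : members) {
        if (member != failed) {
          member.abort(cause);
        }
      }
    }
  }

  /** Stand-in for a region server's ProcedureMember. */
  static class Member {
    final String name;
    private final Coordinator coordinator;
    private Exception abortCause;

    Member(String name, Coordinator coordinator) {
      this.name = name;
      this.coordinator = coordinator;
      coordinator.register(this);
    }

    /** Runs the local work and reports any failure to the coordinator. */
    void runSubprocedure(Runnable work) {
      try {
        work.run();
      } catch (RuntimeException e) {
        coordinator.memberFailed(this, e);
      }
    }

    /** Records the abort request sent by the coordinator. */
    void abort(Exception cause) {
      abortCause = cause;
    }

    boolean isAborted() {
      return abortCause != null;
    }
  }
}
```

In this model a failure on one member leaves that member to clean up locally, while every other registered member receives an abort carrying the original cause.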
On startup, requires start() to be called.
On shutdown, requires stop(boolean) to be called.
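A caller that honors this start/stop contract might look like the sketch below. The `ProcedureManagerLifecycle` interface and the `runServer` helper are hypothetical, and passing `force = true` only after an abnormal exit is an assumption made for illustration, not documented HBase behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical lifecycle model; the interface below is not an HBase type.
class LifecycleSketch {

  /** Mirrors the start()/stop(boolean) pair described above. */
  interface ProcedureManagerLifecycle {
    void start();

    void stop(boolean force) throws IOException;
  }

  /** Records calls so that their order can be inspected. */
  static class RecordingManager implements ProcedureManagerLifecycle {
    final StringBuilder calls = new StringBuilder();

    @Override
    public void start() {
      calls.append("start;");
    }

    @Override
    public void stop(boolean force) {
      calls.append("stop(").append(force).append(");");
    }
  }

  /** Starts the manager, runs the server body, and always stops the manager. */
  static void runServer(ProcedureManagerLifecycle manager, Runnable serverBody) {
    manager.start();
    boolean failed = true;
    try {
      serverBody.run();
      failed = false;
    } finally {
      // Assumption: force the stop only when the server body ended abnormally.
      stopUnchecked(manager, failed);
    }
  }

  /** Calls stop and rethrows an IOException as an unchecked exception. */
  private static void stopUnchecked(ProcedureManagerLifecycle manager, boolean force) {
    try {
      manager.stop(force);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```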
-
Nested Class Summary
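Modifier and Type | Class | Description
class | RegionServerSnapshotManager.SnapshotSubprocedureBuilder | Build the actual snapshot runner that will do all the 'hard' work
(package private) static class | RegionServerSnapshotManager.SnapshotSubprocedurePool | We use the SnapshotSubprocedurePool, a class specific thread pool instead of ExecutorService.
-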
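To make the idea of a class-specific pool for region snapshot tasks concrete, here is a minimal sketch built on the JDK's ExecutorCompletionService. It is not the SnapshotSubprocedurePool source: the class name, method names, and the cancel-on-first-failure policy are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative pool for per-region tasks; not the HBase implementation.
class RegionTaskPoolSketch {
  private final ExecutorService executor;
  private final ExecutorCompletionService<Void> completion;
  private final List<Future<Void>> futures = new ArrayList<>();

  /** maxConcurrentTasks bounds how many region tasks run at the same time. */
  RegionTaskPoolSketch(int maxConcurrentTasks) {
    this.executor = Executors.newFixedThreadPool(maxConcurrentTasks);
    this.completion = new ExecutorCompletionService<>(executor);
  }

  /** Queues one region task; it starts as soon as a worker thread is free. */
  void submitTask(Callable<Void> task) {
    futures.add(completion.submit(task));
  }

  /**
   * Waits for every submitted task. Returns true if all succeeded; on the
   * first failure (or on interruption) cancels the rest and returns false.
   */
  boolean waitForOutstandingTasks() {
    try {
      for (int i = 0; i < futures.size(); i++) {
        completion.take().get(); // rethrows a task failure as ExecutionException
      }
      return true;
    } catch (ExecutionException e) {
      cancelAll();
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      cancelAll();
      return false;
    } finally {
      executor.shutdownNow();
    }
  }

  private void cancelAll() {
    for (Future<Void> future : futures) {
      future.cancel(true);
    }
  }
}
```

Taking results in completion order, rather than submission order, lets the caller react to the first failing region without waiting for slower regions that were submitted earlier.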
Field Summary
Modifier and Type | Field | Description
private static final String | CONCURENT_SNAPSHOT_TASKS_KEY | Maximum number of snapshot region tasks that can run concurrently
private static final int | DEFAULT_CONCURRENT_SNAPSHOT_TASKS |
private static final org.slf4j.Logger | LOG |
private ProcedureMember | member |
private ProcedureMemberRpcs | memberRpcs |
private RegionServerServices | rss |
static final int | SNAPSHOT_REQUEST_THREADS_DEFAULT | # of threads for snapshotting regions on the rs.
static final String | SNAPSHOT_REQUEST_THREADS_KEY | Conf key for number of request threads to start snapshots on regionservers
private static final long | SNAPSHOT_REQUEST_WAKE_MILLIS_DEFAULT | Default amount of time to check for errors while regions finish snapshotting
static final String | SNAPSHOT_REQUEST_WAKE_MILLIS_KEY | Conf key for millis between checks to see if snapshot completed or if there are errors
static final long | SNAPSHOT_TIMEOUT_MILLIS_DEFAULT | Keep threads alive in request pool for max of 300 seconds
static final String | SNAPSHOT_TIMEOUT_MILLIS_KEY | Conf key for max time to keep threads in snapshot request pool waiting
-
Constructor Summary
Constructor | Description
RegionServerSnapshotManager(org.apache.hadoop.conf.Configuration conf, HRegionServer parent, ProcedureMemberRpcs memberRpc, ProcedureMember procMember) | Exposed for testing.
-
Method Summary
Modifier and Type | Method | Description
Subprocedure | buildSubprocedure(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription snapshot) | If in a running state, creates the specified subprocedure for handling an online snapshot.
String | getProcedureSignature() | Return the unique signature of the procedure.
private List<HRegion> | getRegionsToSnapshot(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription snapshot) | Determine if the snapshot should be handled on this server. NOTE: This is racy -- the master expects a list of regionservers.
void | initialize(RegionServerServices rss) | Create a default snapshot handler - uses a zookeeper based member controller.
void | start() | Start accepting snapshot requests.
void | stop(boolean force) | Close this and all running snapshot tasks
Methods inherited from class org.apache.hadoop.hbase.procedure.ProcedureManager
equals, hashCode
-
Field Details
-
LOG
-
CONCURENT_SNAPSHOT_TASKS_KEY
Maximum number of snapshot region tasks that can run concurrently
-
DEFAULT_CONCURRENT_SNAPSHOT_TASKS
-
SNAPSHOT_REQUEST_THREADS_KEY
Conf key for number of request threads to start snapshots on regionservers
-
SNAPSHOT_REQUEST_THREADS_DEFAULT
# of threads for snapshotting regions on the rs.
-
SNAPSHOT_TIMEOUT_MILLIS_KEY
Conf key for max time to keep threads in snapshot request pool waiting
-
SNAPSHOT_TIMEOUT_MILLIS_DEFAULT
Keep threads alive in request pool for max of 300 seconds
-
SNAPSHOT_REQUEST_WAKE_MILLIS_KEY
Conf key for millis between checks to see if snapshot completed or if there are errors
-
SNAPSHOT_REQUEST_WAKE_MILLIS_DEFAULT
Default amount of time to check for errors while regions finish snapshotting
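The key/default pairs above follow the usual pattern of looking up a configuration value and falling back to a constant. The sketch below shows that pattern with `java.util.Properties` standing in for Hadoop's Configuration. The key strings and the wake-frequency default are placeholders, not the real constant values; only the 300-second timeout comes from the field description.

```java
import java.util.Properties;

// Sketch only: Properties stands in for Hadoop's Configuration, and every key
// string below is a placeholder rather than the real HBase constant value.
class SnapshotConfigSketch {
  // Placeholder key names (assumptions for illustration).
  static final String TIMEOUT_MILLIS_KEY = "example.snapshot.timeout.millis";
  static final String WAKE_MILLIS_KEY = "example.snapshot.wake.millis";

  // 300 seconds, matching the description of SNAPSHOT_TIMEOUT_MILLIS_DEFAULT.
  static final long TIMEOUT_MILLIS_DEFAULT = 300L * 1000L;
  // Placeholder value; the real SNAPSHOT_REQUEST_WAKE_MILLIS_DEFAULT is not stated here.
  static final long WAKE_MILLIS_DEFAULT = 500L;

  /** Returns the configured long value, or the default when the key is absent. */
  static long getLong(Properties conf, String key, long defaultValue) {
    String raw = conf.getProperty(key);
    return raw == null ? defaultValue : Long.parseLong(raw.trim());
  }
}
```

For example, `getLong(conf, TIMEOUT_MILLIS_KEY, TIMEOUT_MILLIS_DEFAULT)` yields 300000 on an empty configuration and the overriding value once the key is set.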
-
rss
-
memberRpcs
-
member
-
-
Constructor Details
-
RegionServerSnapshotManager
RegionServerSnapshotManager(org.apache.hadoop.conf.Configuration conf, HRegionServer parent, ProcedureMemberRpcs memberRpc, ProcedureMember procMember)
Exposed for testing.
- Parameters:
conf - HBase configuration.
parent - parent running the snapshot handler
memberRpc - use specified memberRpc instance
procMember - use specified ProcedureMember
-
RegionServerSnapshotManager
public RegionServerSnapshotManager()
-
-
Method Details
-
start
Start accepting snapshot requests.
- Specified by:
start in class RegionServerProcedureManager
-
stop
Close this and all running snapshot tasks
- Specified by:
stop in class RegionServerProcedureManager
- Parameters:
force - forcefully stop all running tasks
- Throws:
IOException
-
buildSubprocedure
public Subprocedure buildSubprocedure(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription snapshot)
If in a running state, creates the specified subprocedure for handling an online snapshot. Because this gets the local list of regions to snapshot and not the set the master had, there is a possibility of a race where regions may be missed. This is detected by the master in the snapshot verification step.
- Returns:
Subprocedure to submit to the ProcedureMember.
-
getRegionsToSnapshot
private List<HRegion> getRegionsToSnapshot(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription snapshot) throws IOException
Determine if the snapshot should be handled on this server. NOTE: This is racy -- the master expects a list of regionservers. This means that if a region moves somewhere between the calls, we'll miss some regions. For example, a region move during a snapshot could result in a region being skipped or snapshotted twice. This is manageable because the MasterSnapshotVerifier will double check the region lists after the online portion of the snapshot completes and will explicitly fail the snapshot.
- Returns:
the list of online regions. An empty list is returned if this server hosts no regions for the given snapshot.
- Throws:
IOException
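The race noted above comes down to comparing the regions the master expected with the regions each server actually reported. The sketch below illustrates that comparison with plain sets. It is not the MasterSnapshotVerifier implementation: the class, its methods, and the use of region names as strings are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative coverage check; not how MasterSnapshotVerifier is written.
class RegionCoverageSketch {

  /** Convenience helper that builds a sorted set of region names. */
  static Set<String> setOf(String... names) {
    return new TreeSet<>(Arrays.asList(names));
  }

  /** Regions the master expected that no server reported (skipped regions). */
  static Set<String> missingRegions(Set<String> expected, List<Set<String>> reportedPerServer) {
    Set<String> missing = new TreeSet<>(expected);
    for (Set<String> reported : reportedPerServer) {
      missing.removeAll(reported);
    }
    return missing;
  }

  /** Regions reported by more than one server (snapshotted twice). */
  static Set<String> duplicatedRegions(List<Set<String>> reportedPerServer) {
    Set<String> seen = new HashSet<>();
    Set<String> duplicated = new TreeSet<>();
    for (Set<String> reported : reportedPerServer) {
      for (String region : reported) {
        if (!seen.add(region)) {
          duplicated.add(region);
        }
      }
    }
    return duplicated;
  }

  /** The snapshot passes only if every expected region was covered exactly once. */
  static boolean verify(Set<String> expected, List<Set<String>> reportedPerServer) {
    return missingRegions(expected, reportedPerServer).isEmpty()
        && duplicatedRegions(reportedPerServer).isEmpty();
  }
}
```

If a region moves from one server to another between the two servers' local lookups, it either drops out of both reports or appears in both, and in either case `verify` returns false.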
-
initialize
Create a default snapshot handler - uses a zookeeper based member controller.
- Specified by:
initialize in class RegionServerProcedureManager
- Parameters:
rss - region server running the handler
- Throws:
org.apache.zookeeper.KeeperException - if the zookeeper cluster cannot be reached
-
getProcedureSignature
Description copied from class: ProcedureManager
Return the unique signature of the procedure. This signature uniquely identifies the procedure. By default, this signature is the string used in the procedure controller (i.e., the root ZK node name for the procedure).
- Specified by:
getProcedureSignature in class ProcedureManager
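Because the signature doubles as a unique identifier and as a node name, a host that tracks several procedure managers could key them by signature. The sketch below is a hypothetical illustration: `SignedManager`, the registry, the `rootZNode` path layout, and the example signature string are all assumptions, not HBase APIs or values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry keyed by procedure signature; not an HBase class.
class SignatureRegistrySketch {

  /** Minimal stand-in for anything that exposes getProcedureSignature(). */
  interface SignedManager {
    String getProcedureSignature();
  }

  private final Map<String, SignedManager> bySignature = new LinkedHashMap<>();

  /** Registers a manager and rejects a signature that is already in use. */
  void register(SignedManager manager) {
    String signature = manager.getProcedureSignature();
    if (bySignature.putIfAbsent(signature, manager) != null) {
      throw new IllegalArgumentException("Duplicate procedure signature: " + signature);
    }
  }

  /** Returns the manager registered under the signature, or null if none. */
  SignedManager lookup(String signature) {
    return bySignature.get(signature);
  }

  /** Assumed layout: the signature is the last path element under a base node. */
  static String rootZNode(String baseZNode, String signature) {
    return baseZNode + "/" + signature;
  }
}
```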
-