@InterfaceAudience.Private public class TableSnapshotScanner extends AbstractClientScanner
This also allows one to run the scan from an online or offline HBase cluster. The snapshot files can be exported to a pure-HDFS cluster with the org.apache.hadoop.hbase.snapshot.ExportSnapshot tool, and this scanner can then run the scan directly over the exported snapshot files. The snapshot should not be deleted while open scanners are still reading from its files.
An internal RegionScanner is used to execute the Scan obtained from the user for each region in the snapshot.
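The region-by-region behavior described above can be pictured with a small, self-contained sketch. It is not the HBase implementation: `RegionChainSketch`, its list-of-lists input, and the `drain` helper are hypothetical stand-ins, chosen only to show how one per-region scanner at a time can be drained before moving on to the next region. The field names `currentRegion` and `currentRegionScanner` are borrowed from the field summary of this class purely for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical stand-in for a snapshot scanner. Each inner list plays the
 * role of one region's rows. None of these types are HBase classes.
 */
final class RegionChainSketch {
    private final List<List<String>> regions;
    private int currentRegion = -1;                 // index of the region being read
    private Iterator<String> currentRegionScanner;  // stand-in for a per-region scanner

    RegionChainSketch(List<List<String>> regions) {
        this.regions = regions;
    }

    /** Returns the next row, or null once every region is exhausted. */
    String next() {
        while (true) {
            if (currentRegionScanner == null) {
                currentRegion++;
                if (currentRegion >= regions.size()) {
                    return null; // no regions left to open
                }
                currentRegionScanner = regions.get(currentRegion).iterator();
            }
            if (currentRegionScanner.hasNext()) {
                return currentRegionScanner.next();
            }
            currentRegionScanner = null; // region drained, move on to the next one
        }
    }

    /** Demo helper: collects every remaining row in scan order. */
    static List<String> drain(RegionChainSketch scanner) {
        List<String> rows = new ArrayList<>();
        for (String row = scanner.next(); row != null; row = scanner.next()) {
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        RegionChainSketch scanner = new RegionChainSketch(Arrays.asList(
            Arrays.asList("row-a", "row-b"),
            Collections.<String>emptyList(),   // an empty region is simply skipped
            Arrays.asList("row-c")));
        System.out.println(drain(scanner));
    }
}
```

Returning null at the end mirrors the ResultScanner convention for next(), where null signals that the scan is finished.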
HBase owns all the data and snapshot files on the filesystem. Only the HBase user can read from snapshot files and data files. HBase also enforces security because all requests are handled by the server layer, and the user cannot read from the data files directly. To read snapshot files directly from the file system, the user who is running the MR job must have sufficient permissions to access snapshot and reference files. This means that to run MapReduce over snapshot files, the job has to be run as the HBase user, or the user must have group or other privileges in the filesystem (see HBASE-8369). Note that giving other users access to read from snapshot/data files will completely circumvent the access control enforced by HBase. See org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.
Modifier and Type | Field and Description |
---|---|
private org.apache.hadoop.conf.Configuration | conf |
private int | currentRegion |
private ClientSideRegionScanner | currentRegionScanner |
private org.apache.hadoop.fs.FileSystem | fs |
private TableDescriptor | htd |
private static org.slf4j.Logger | LOG |
private int | numOfCompleteRows |
private ArrayList<RegionInfo> | regions |
private org.apache.hadoop.fs.Path | restoreDir |
private org.apache.hadoop.fs.Path | rootDir |
private Scan | scan |
private boolean | snapshotAlreadyRestored |
private String | snapshotName |
scanMetrics
Constructor and Description |
---|
TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) |
TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan, boolean snapshotAlreadyRestored) Creates a TableSnapshotScanner. |
TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) Creates a TableSnapshotScanner. |
Modifier and Type | Method and Description |
---|---|
private void | cleanup() |
void | close() Closes the scanner and releases any resources it has allocated. |
private boolean | isValidRegion(RegionInfo hri) |
Result | next() Grab the next row's worth of values. |
private void | openWithoutRestoringSnapshot() |
private void | openWithRestoringSnapshot() |
boolean | renewLease() Allow the client to renew the scanner's lease on the server. |
getScanMetrics, initScanMetrics
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
iterator, next
forEach, spliterator
private static final org.slf4j.Logger LOG
private org.apache.hadoop.conf.Configuration conf
private String snapshotName
private org.apache.hadoop.fs.FileSystem fs
private org.apache.hadoop.fs.Path rootDir
private org.apache.hadoop.fs.Path restoreDir
private ArrayList<RegionInfo> regions
private TableDescriptor htd
private final boolean snapshotAlreadyRestored
private ClientSideRegionScanner currentRegionScanner
private int currentRegion
private int numOfCompleteRows
public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) throws IOException
conf - the configuration
restoreDir - a temporary directory to copy the snapshot files into. Current user should have write permissions to this directory, and this should not be a subdirectory of rootDir. The scanner deletes the contents of the directory once the scanner is closed.
snapshotName - the name of the snapshot to read from
scan - a Scan representing scan parameters
IOException - in case of error
public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) throws IOException
IOException
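The requirement that restoreDir must not be a subdirectory of rootDir can be checked before a scanner is constructed. The following is a minimal sketch of such a pre-check, under loud assumptions: it uses `java.nio.file.Path` on a local filesystem as a stand-in for `org.apache.hadoop.fs.Path`, the class `RestoreDirCheckSketch` and its methods are hypothetical, and it does not claim to reflect how HBase itself validates the directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-check; java.nio paths stand in for Hadoop paths. */
final class RestoreDirCheckSketch {

    /** True when candidate is rootDir itself or lies anywhere beneath it. */
    static boolean isUnder(Path rootDir, Path candidate) {
        Path root = rootDir.toAbsolutePath().normalize();
        Path dir = candidate.toAbsolutePath().normalize();
        // startsWith compares whole name elements, so /hbase-tmp is not under /hbase
        return dir.startsWith(root);
    }

    /** Rejects a restore directory that sits inside the root directory. */
    static void requireOutsideRoot(Path rootDir, Path restoreDir) {
        if (isUnder(rootDir, restoreDir)) {
            throw new IllegalArgumentException(
                "restoreDir " + restoreDir + " must not be inside rootDir " + rootDir);
        }
    }

    public static void main(String[] args) {
        Path rootDir = Paths.get("/hbase");
        requireOutsideRoot(rootDir, Paths.get("/tmp/snapshot-restore")); // accepted
        try {
            requireOutsideRoot(rootDir, Paths.get("/hbase/tmp/restore"));
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

Normalizing both paths first keeps a path such as `/tmp/../hbase/x` from slipping past the comparison.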
public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan, boolean snapshotAlreadyRestored) throws IOException
conf - the configuration
rootDir - root directory for HBase.
restoreDir - a temporary directory to copy the snapshot files into. Current user should have write permissions to this directory, and this should not be a subdirectory of rootdir. The scanner deletes the contents of the directory once the scanner is closed.
snapshotName - the name of the snapshot to read from
scan - a Scan representing scan parameters
snapshotAlreadyRestored - true to indicate that snapshot has been restored.
IOException - in case of error
private void openWithoutRestoringSnapshot() throws IOException
IOException
private boolean isValidRegion(RegionInfo hri)
private void openWithRestoringSnapshot() throws IOException
IOException
public Result next() throws IOException
ResultScanner
IOException - e
private void cleanup()
public void close()
ResultScanner
public boolean renewLease()
ResultScanner
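Because the constructor documentation states that the contents of restoreDir are deleted once the scanner is closed, callers will usually want close() to run even when the scan fails. The sketch below illustrates that lifecycle with try-with-resources. It is an assumption-laden model, not the HBase code: `TempRestoreDirSketch` is a hypothetical class, and `java.nio.file` on a local temporary directory stands in for the Hadoop FileSystem.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical resource that empties its restore directory when closed. */
final class TempRestoreDirSketch implements AutoCloseable {
    private final Path restoreDir;

    TempRestoreDirSketch(Path restoreDir) {
        this.restoreDir = restoreDir;
    }

    /** Deletes everything beneath restoreDir, leaving the directory itself. */
    @Override
    public void close() throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(restoreDir)) {
            entries = walk
                .filter(p -> !p.equals(restoreDir))
                .sorted(Comparator.reverseOrder()) // children sort before their parents
                .collect(Collectors.toList());
        }
        for (Path entry : entries) {
            Files.delete(entry);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapshot-restore");
        try (TempRestoreDirSketch scanner = new TempRestoreDirSketch(dir)) {
            // Simulate restored reference files that a scan would read.
            Files.createDirectories(dir.resolve("region/cf"));
            Files.write(dir.resolve("region/cf/ref"), new byte[] {1});
        } // close() runs here, even if the body had thrown
        try (Stream<Path> left = Files.list(dir)) {
            System.out.println("entries left: " + left.count());
        }
        Files.delete(dir);
    }
}
```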
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.