MultiTableSnapshotInputFormat (Apache HBase 2.0.6 API)

java.lang.Object
- org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat
- - org.apache.hadoop.hbase.mapred.MultiTableSnapshotInputFormat

All Implemented Interfaces:

org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
```
@InterfaceAudience.Public
public class MultiTableSnapshotInputFormat
extends TableSnapshotInputFormat
implements org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
```
MultiTableSnapshotInputFormat generalizes TableSnapshotInputFormat, allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each. Internally, the input format delegates to TableSnapshotInputFormat and thus has the same performance advantages; see TableSnapshotInputFormat for more details.

Usage is similar to TableSnapshotInputFormat, with one exception: initMultiTableSnapshotMapperJob takes a map from snapshot name to a collection of scans. Each scan is applied to its corresponding snapshot; the overall dataset for the job is the concatenation of the regions and tables included in each snapshot/scan pair. TableMapReduceUtil.initMultiTableSnapshotMapperJob(Map, Class, Class, Class, JobConf, boolean, Path) can be used to configure the job.
```
 // The mapred API configures jobs through a JobConf; conf is an existing Configuration.
 JobConf job = new JobConf(conf);
 // Map each snapshot name to the scans to run against it.
 Map<String, Collection<Scan>> snapshotScans = ImmutableMap.of(
     "snapshot1", ImmutableList.of(new Scan(Bytes.toBytes("a"), Bytes.toBytes("b"))),
     "snapshot2", ImmutableList.of(new Scan(Bytes.toBytes("1"), Bytes.toBytes("2")))
 );
 // Each snapshot is restored into a subdirectory of this directory.
 Path restoreDir = new Path("/tmp/snapshot_restore_dir");
 TableMapReduceUtil.initMultiTableSnapshotMapperJob(
     snapshotScans, MyTableMapper.class, MyMapKeyOutput.class,
     MyMapOutputValueWritable.class, job, true, restoreDir);
```
Internally, this input format restores each snapshot into a subdirectory of the given tmp directory. Input splits and record readers are created as described in TableSnapshotInputFormat (one per region). See TableSnapshotInputFormat for more notes on permissions; the same caveats apply here.
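The way snapshot/scan pairs expand into per-region splits can be sketched with a small toy model. This is only an illustration under stated assumptions: `SplitPlanSketch`, `Range`, `plan`, and `demo` are hypothetical names, rows are modelled as plain strings, the region boundaries are invented, and the rule that a region contributes a split only when it overlaps the scan range is an assumption about the behaviour described above, not a copy of the HBase implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of split planning. None of these types are HBase classes. */
final class SplitPlanSketch {

    /** Half-open row range [start, stop); an empty string means "unbounded". */
    static final class Range {
        final String start;
        final String stop;

        Range(String start, String stop) {
            this.start = start;
            this.stop = stop;
        }

        boolean overlaps(Range other) {
            boolean startsBeforeOtherStops =
                other.stop.isEmpty() || start.compareTo(other.stop) < 0;
            boolean stopsAfterOtherStarts =
                stop.isEmpty() || other.start.compareTo(stop) < 0;
            return startsBeforeOtherStops && stopsAfterOtherStarts;
        }

        @Override
        public String toString() {
            return "[" + start + "," + stop + ")";
        }
    }

    /** Emits one split label per (snapshot, scan, overlapping region) triple. */
    static List<String> plan(Map<String, List<Range>> scansBySnapshot,
                             Map<String, List<Range>> regionsBySnapshot) {
        List<String> splits = new ArrayList<>();
        for (Map.Entry<String, List<Range>> entry : scansBySnapshot.entrySet()) {
            List<Range> regions =
                regionsBySnapshot.getOrDefault(entry.getKey(), Collections.emptyList());
            for (Range scan : entry.getValue()) {
                for (Range region : regions) {
                    if (scan.overlaps(region)) {
                        splits.add(entry.getKey() + " scan=" + scan + " region=" + region);
                    }
                }
            }
        }
        return splits;
    }

    /** Two snapshots with one scan each; region boundaries are made up. */
    static List<String> demo() {
        Map<String, List<Range>> scans = new LinkedHashMap<>();
        scans.put("snapshot1", Arrays.asList(new Range("a", "b")));
        scans.put("snapshot2", Arrays.asList(new Range("1", "2")));

        Map<String, List<Range>> regions = new LinkedHashMap<>();
        regions.put("snapshot1", Arrays.asList(new Range("", "b"), new Range("b", "")));
        regions.put("snapshot2", Arrays.asList(new Range("", "15"), new Range("15", "")));
        return plan(scans, regions);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this model the job's input is simply the concatenation of the lists produced for each pair: snapshot1 contributes one split (only its first region overlaps the scan) and snapshot2 contributes two.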
See Also:

TableSnapshotInputFormat, TableSnapshotScanner

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat
  TableSnapshotInputFormat.TableSnapshotRecordReader, TableSnapshotInputFormat.TableSnapshotRegionSplit

Field Summary

Fields
Modifier and Type	Field and Description
`private MultiTableSnapshotInputFormatImpl`	`delegate`

Constructor Summary

Constructors
Constructor and Description
`MultiTableSnapshotInputFormat()`

Method Summary

Modifier and Type	Method and Description
`org.apache.hadoop.mapred.RecordReader<ImmutableBytesWritable,Result>`	`getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)`
`org.apache.hadoop.mapred.InputSplit[]`	`getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)`
`static void`	`setInput(org.apache.hadoop.conf.Configuration conf, Map<String,Collection<Scan>> snapshotScans, org.apache.hadoop.fs.Path restoreDir)` Configure conf to read from snapshotScans, with snapshots restored to a subdirectory of restoreDir.

Methods inherited from class org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat
setInput, setInput

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

delegate

private final MultiTableSnapshotInputFormatImpl delegate

Constructor Detail
- MultiTableSnapshotInputFormat
```
public MultiTableSnapshotInputFormat()
```

Method Detail

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws IOException

Specified by: getSplits in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Overrides: getSplits in class TableSnapshotInputFormat
Throws: IOException

getRecordReader

public org.apache.hadoop.mapred.RecordReader<ImmutableBytesWritable,Result> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                                                            org.apache.hadoop.mapred.JobConf job,
                                                                                            org.apache.hadoop.mapred.Reporter reporter)
                                                                                     throws IOException

Specified by: getRecordReader in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Overrides: getRecordReader in class TableSnapshotInputFormat
Throws: IOException

setInput

public static void setInput(org.apache.hadoop.conf.Configuration conf,
                            Map<String,Collection<Scan>> snapshotScans,
                            org.apache.hadoop.fs.Path restoreDir)
                     throws IOException

Configure conf to read from snapshotScans, with snapshots restored to a subdirectory of restoreDir. Sets: MultiTableSnapshotInputFormatImpl#RESTORE_DIRS_KEY, MultiTableSnapshotInputFormatImpl#SNAPSHOT_TO_SCANS_KEY

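Parameters: conf - the configuration to update; snapshotScans - map from snapshot name to the scans to run against that snapshot; restoreDir - base directory under which each snapshot is restored
Throws: IOException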
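The idea of restoring each snapshot to its own subdirectory of restoreDir can be sketched as follows. This is a minimal model, not the HBase code: `RestoreDirSketch` and `assignRestoreDirs` are hypothetical names, paths are plain strings rather than Hadoop Path objects, and the `snapshotName__<random UUID>` naming scheme is an assumption chosen only to show why distinct subdirectories avoid collisions between snapshots and between repeated runs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Toy model of restore-directory assignment. Not the HBase implementation. */
final class RestoreDirSketch {

    /** Gives every snapshot its own subdirectory under baseRestoreDir. */
    static Map<String, String> assignRestoreDirs(String baseRestoreDir,
                                                 List<String> snapshotNames) {
        Map<String, String> restoreDirs = new LinkedHashMap<>();
        for (String snapshotName : snapshotNames) {
            // Assumed naming scheme: a random suffix keeps repeated runs from colliding.
            String subdirectory = snapshotName + "__" + UUID.randomUUID();
            restoreDirs.put(snapshotName, baseRestoreDir + "/" + subdirectory);
        }
        return restoreDirs;
    }

    public static void main(String[] args) {
        assignRestoreDirs("/tmp/snapshot_restore_dir", Arrays.asList("snapshot1", "snapshot2"))
            .forEach((snapshot, dir) -> System.out.println(snapshot + " -> " + dir));
    }
}
```

A mapping of this shape, together with the snapshot-to-scans map, is the kind of information the two configuration keys listed above would need to carry so that getSplits and getRecordReader can later locate the restored data.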
