@InterfaceAudience.Public public class MultiTableSnapshotInputFormat extends TableSnapshotInputFormat implements org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Generalizes TableSnapshotInputFormat,
 allowing a MapReduce job to run over one or more table snapshots, with one or more scans
 configured for each.
 Internally, the input format delegates to
 TableSnapshotInputFormat
 and thus has the same performance advantages; see
 TableSnapshotInputFormat
 for more details.
 Usage is similar to TableSnapshotInputFormat, with one exception:
 initMultiTableSnapshotMapperJob takes a map
 from snapshot name to a collection of scans. Each scan is applied to its corresponding
 snapshot, and the overall dataset for the job is the concatenation of the regions and tables
 included in each snapshot/scan pair.
 Use TableMapReduceUtil.initMultiTableSnapshotMapperJob(Map,
 Class, Class, Class, JobConf, boolean, Path)
 to configure the job.
 
 // The mapred API takes a JobConf, matching the signature above
 JobConf job = new JobConf(conf);
 // Map each snapshot name to the scans to run against it
 Map<String, Collection<Scan>> snapshotScans = ImmutableMap.of(
    "snapshot1", ImmutableList.of(new Scan(Bytes.toBytes("a"), Bytes.toBytes("b"))),
    "snapshot2", ImmutableList.of(new Scan(Bytes.toBytes("1"), Bytes.toBytes("2")))
 );
 Path restoreDir = new Path("/tmp/snapshot_restore_dir");
 TableMapReduceUtil.initMultiTableSnapshotMapperJob(
     snapshotScans, MyTableMapper.class, MyMapKeyOutput.class,
     MyMapOutputValueWritable.class, job, true, restoreDir);
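
 The way the job's dataset is built from snapshot/scan pairs can be illustrated without HBase on the classpath. The following is only a sketch: RowRange and SnapshotScan are hypothetical stand-ins for Scan and for the internal pairing, not HBase classes, and the real input format further divides each pair by region.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SnapshotScanPairs {

    /** Hypothetical stand-in for a Scan: only a start row and a stop row. */
    static final class RowRange {
        final String startRow;
        final String stopRow;

        RowRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Hypothetical pairing of one snapshot name with one of its scans. */
    static final class SnapshotScan {
        final String snapshot;
        final RowRange range;

        SnapshotScan(String snapshot, RowRange range) {
            this.snapshot = snapshot;
            this.range = range;
        }
    }

    /** Flattens the snapshot-to-scans map into one entry per snapshot/scan pair. */
    static List<SnapshotScan> flatten(Map<String, Collection<RowRange>> snapshotScans) {
        List<SnapshotScan> pairs = new ArrayList<>();
        for (Map.Entry<String, Collection<RowRange>> entry : snapshotScans.entrySet()) {
            for (RowRange range : entry.getValue()) {
                pairs.add(new SnapshotScan(entry.getKey(), range));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the printed pairs are stable.
        Map<String, Collection<RowRange>> snapshotScans = new LinkedHashMap<>();
        snapshotScans.put("snapshot1",
            Arrays.asList(new RowRange("a", "b"), new RowRange("x", "z")));
        snapshotScans.put("snapshot2",
            Arrays.asList(new RowRange("1", "2")));

        for (SnapshotScan pair : flatten(snapshotScans)) {
            System.out.println(pair.snapshot + " " + pair.range.startRow + ".." + pair.range.stopRow);
        }
    }
}
```

 Two scans on snapshot1 and one on snapshot2 yield three pairs; the job's input is the concatenation of what each pair covers.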
 
 Input splits and record readers are created as described in TableSnapshotInputFormat
 (one per region).
 See TableSnapshotInputFormat for more notes on
 permissioning; the
 same caveats apply here.
See Also: TableSnapshotInputFormat, TableSnapshotScanner
Nested classes inherited from class TableSnapshotInputFormat: TableSnapshotInputFormat.TableSnapshotRecordReader, TableSnapshotInputFormat.TableSnapshotRegionSplit

| Modifier and Type | Field and Description |
|---|---|
| private MultiTableSnapshotInputFormatImpl | delegate |

| Constructor and Description |
|---|
| MultiTableSnapshotInputFormat() |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.hadoop.mapred.RecordReader<ImmutableBytesWritable,Result> | getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter) |
| org.apache.hadoop.mapred.InputSplit[] | getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits) |
| static void | setInput(org.apache.hadoop.conf.Configuration conf, Map<String,Collection<Scan>> snapshotScans, org.apache.hadoop.fs.Path restoreDir) |

Methods inherited from class TableSnapshotInputFormat: setInput, setInput

private final MultiTableSnapshotInputFormatImpl delegate

public MultiTableSnapshotInputFormat()

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits) throws IOException
Specified by: getSplits in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Overrides: getSplits in class TableSnapshotInputFormat
Throws: IOException

public org.apache.hadoop.mapred.RecordReader<ImmutableBytesWritable,Result> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter) throws IOException
Specified by: getRecordReader in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Overrides: getRecordReader in class TableSnapshotInputFormat
Throws: IOException

public static void setInput(org.apache.hadoop.conf.Configuration conf, Map<String,Collection<Scan>> snapshotScans, org.apache.hadoop.fs.Path restoreDir) throws IOException
Throws: IOException

Copyright © 2007–2021 The Apache Software Foundation. All rights reserved.