Class TableSnapshotInputFormat

java.lang.Object
org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Direct Known Subclasses:
MultiTableSnapshotInputFormat

@Public public class TableSnapshotInputFormat extends Object implements org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
TableSnapshotInputFormat allows a MapReduce job to run over a table snapshot. Further documentation is available on org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.
See Also:
  • Constructor Details

  • Method Details

    • getSplits

      public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits) throws IOException
      Specified by:
      getSplits in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
      Throws:
      IOException
    • getRecordReader

      public org.apache.hadoop.mapred.RecordReader<ImmutableBytesWritable,Result> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter) throws IOException
      Specified by:
      getRecordReader in interface org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
      Throws:
      IOException
    • setInput

      public static void setInput(org.apache.hadoop.mapred.JobConf job, String snapshotName, org.apache.hadoop.fs.Path restoreDir) throws IOException
      Configures the job to use TableSnapshotInputFormat to read from a snapshot.
      Parameters:
      job - the job to configure
      snapshotName - the name of the snapshot to read from
      restoreDir - a temporary directory to restore the snapshot into. The current user should have write permission to this directory, and it should not be a subdirectory of rootdir. After the job has finished, restoreDir can be deleted.
      Throws:
      IOException - if an error occurs
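The restoreDir constraint above can be checked before a job is submitted. The following is a minimal sketch, not part of HBase: the class RestoreDirCheck and the method isUnderRoot are hypothetical names, and java.nio.file paths stand in for Hadoop Path objects, so scheme and authority (for example hdfs://host) are assumed to have been stripped already. HBase may perform its own validation when the snapshot is restored.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight helper; not an HBase class.
class RestoreDirCheck {

    // Returns true when restoreDir is the root directory itself or lies
    // anywhere beneath it. Both arguments are plain filesystem paths.
    static boolean isUnderRoot(String rootDir, String restoreDir) {
        Path root = Paths.get(rootDir).normalize();
        Path restore = Paths.get(restoreDir).normalize();
        // Path.startsWith compares whole name elements, so "/hbase-restore"
        // is not treated as being under "/hbase".
        return restore.startsWith(root);
    }

    public static void main(String[] args) {
        String rootDir = "/hbase"; // assumed value of the cluster's rootdir

        // Nested under rootdir: unsuitable as restoreDir.
        System.out.println(isUnderRoot(rootDir, "/hbase/.tmp/restore"));
        // Outside rootdir: acceptable as far as this check is concerned.
        System.out.println(isUnderRoot(rootDir, "/tmp/snapshot-restore"));
        // Shares a name prefix only: also outside rootdir.
        System.out.println(isUnderRoot(rootDir, "/hbase-restore"));
    }
}
```

A caller could reject the directory when isUnderRoot returns true and only then pass it to setInput. Write permission is not covered by this sketch.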
    • setInput

      public static void setInput(org.apache.hadoop.mapred.JobConf job, String snapshotName, org.apache.hadoop.fs.Path restoreDir, RegionSplitter.SplitAlgorithm splitAlgo, int numSplitsPerRegion) throws IOException
      Configures the job to use TableSnapshotInputFormat to read from a snapshot.
      Parameters:
      job - the job to configure
      snapshotName - the name of the snapshot to read from
      restoreDir - a temporary directory to restore the snapshot into. The current user should have write permission to this directory, and it should not be a subdirectory of rootdir. After the job has finished, restoreDir can be deleted.
      splitAlgo - the split algorithm used to generate splits from each region
      numSplitsPerRegion - the number of input splits to generate per region
      Throws:
      IOException - if an error occurs
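To give a feel for what splitting one region into numSplitsPerRegion pieces means, here is a simplified sketch that divides a numeric key range into near-equal sub-ranges. It is an illustration only and does not reproduce any RegionSplitter.SplitAlgorithm implementation: the class UniformRangeSketch and the method boundaries are hypothetical names, and row keys are assumed to be representable as non-negative integers.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not an HBase class.
class UniformRangeSketch {

    // Returns n + 1 boundary values that divide [start, end] into n
    // near-equal sub-ranges. Boundary i is start + (end - start) * i / n,
    // using integer division.
    static List<BigInteger> boundaries(BigInteger start, BigInteger end, int n) {
        if (n < 1 || end.compareTo(start) < 0) {
            throw new IllegalArgumentException("need n >= 1 and start <= end");
        }
        BigInteger width = end.subtract(start);
        BigInteger parts = BigInteger.valueOf(n);
        List<BigInteger> out = new ArrayList<>();
        for (int i = 0; i <= n; i++) {
            BigInteger offset = width.multiply(BigInteger.valueOf(i)).divide(parts);
            out.add(start.add(offset));
        }
        return out;
    }

    public static void main(String[] args) {
        // One region covering single-byte keys 0x00..0xFF, split four ways.
        List<BigInteger> b = boundaries(BigInteger.ZERO, BigInteger.valueOf(0xFF), 4);
        for (int i = 0; i < b.size() - 1; i++) {
            System.out.println("sub-range " + i + ": " + b.get(i) + " .. " + b.get(i + 1));
        }
    }
}
```

Under this simplified model, a snapshot with R regions and numSplitsPerRegion set to n yields roughly R * n input splits. The actual boundaries depend on the splitAlgo passed to setInput.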