MultiTableInputFormatBase (Apache HBase 1.2.12 API)

java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
- - org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase

Direct Known Subclasses:

MultiTableInputFormat
```
@InterfaceAudience.Public
@InterfaceStability.Evolving
public abstract class MultiTableInputFormatBase
extends org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
```
A base class for MultiTableInputFormat implementations. It receives a list of Scan instances that define the input tables, filters, and other scan settings. Subclasses may use other TableRecordReader implementations.

Field Summary

Fields
Modifier and Type	Field and Description
`private static org.apache.commons.logging.Log`	`LOG`
`private List<Scan>`	`scans` Holds the set of scans used to define the input.
`private TableRecordReader`	`tableRecordReader` The reader scanning the table, can be a custom one.

Constructor Summary

Constructors
Constructor and Description
`MultiTableInputFormatBase()`

Method Summary

Methods
Modifier and Type	Method and Description
`org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result>`	`createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)` Builds a TableRecordReader.
`protected List<Scan>`	`getScans()` Allows subclasses to get the list of `Scan` objects.
`List<org.apache.hadoop.mapreduce.InputSplit>`	`getSplits(org.apache.hadoop.mapreduce.JobContext context)` Calculates the splits that will serve as input for the map tasks.
`protected boolean`	`includeRegionInSplit(byte[] startKey, byte[] endKey)` Test if the given region is to be included in the InputSplit while splitting the regions of a table.
`protected void`	`setScans(List<Scan> scans)` Allows subclasses to set the list of `Scan` objects.
`protected void`	`setTableRecordReader(TableRecordReader tableRecordReader)` Allows subclasses to set the `TableRecordReader`.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LOG
```
private static final org.apache.commons.logging.Log LOG
```
  - scans
```
private List<Scan> scans
```
    Holds the set of scans used to define the input.
  - tableRecordReader
```
private TableRecordReader tableRecordReader
```
    The reader scanning the table, can be a custom one.
- Constructor Detail
  - MultiTableInputFormatBase
```
public MultiTableInputFormatBase()
```
- Method Detail
  - createRecordReader
```
public org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                         org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                           throws IOException,
                                                                                                  InterruptedException
```
    Builds a TableRecordReader. If no TableRecordReader was provided, uses the default.
    
    Specified by:
    
    createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    
    Parameters:
    split - The split to work with.
    context - The current context.
    
    Returns:
    The newly created record reader.
    
    Throws:
    
    IOException - When creating the reader fails.
    
    InterruptedException - When record reader initialization fails.
    
    See Also:
    InputFormat.createRecordReader(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext)
  - getSplits
```
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
                                                       throws IOException
```
    Calculates the splits that will serve as input for the map tasks. The number of splits matches the number of regions in a table.
    
    Specified by:
    
    getSplits in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    
    Parameters:
    context - The current job context.
    
    Returns:
    The list of input splits.
    
    Throws:
    
    IOException - When creating the list of splits fails.
    See Also:
    InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)
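    
    The one-split-per-region rule can be illustrated with a small, JDK-only model. This is a simplified sketch, not the HBase implementation: the class `SplitCountSketch` and its methods are hypothetical, and it assumes that a region only yields a split when its key range overlaps the scan's row range, with an empty key standing for "unbounded".

```java
import java.nio.charset.StandardCharsets;

public class SplitCountSketch {

    /** Encodes a row key; an empty string stands for an unbounded key. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if region [regionStart, regionEnd) overlaps scan [scanStart, scanStop). */
    static boolean overlaps(byte[] regionStart, byte[] regionEnd,
                            byte[] scanStart, byte[] scanStop) {
        boolean scanStartsBeforeRegionEnds = scanStart.length == 0
                || regionEnd.length == 0
                || compareUnsigned(scanStart, regionEnd) < 0;
        boolean scanStopsAfterRegionStarts = scanStop.length == 0
                || compareUnsigned(scanStop, regionStart) > 0;
        return scanStartsBeforeRegionEnds && scanStopsAfterRegionStarts;
    }

    /** Counts one split per region that overlaps the scan range (assumed rule). */
    static int countSplits(byte[][] regionStarts, byte[][] regionEnds,
                           byte[] scanStart, byte[] scanStop) {
        int count = 0;
        for (int i = 0; i < regionStarts.length; i++) {
            if (overlaps(regionStarts[i], regionEnds[i], scanStart, scanStop)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Three regions: [ , g), [g, p), [p, )
        byte[][] starts = { key(""), key("g"), key("p") };
        byte[][] ends = { key("g"), key("p"), key("") };
        System.out.println(countSplits(starts, ends, key(""), key("")));   // 3
        System.out.println(countSplits(starts, ends, key("h"), key("k"))); // 1
    }
}
```

    With several Scan instances, the same count would be computed once per scan against that scan's table, so the total number of splits is the sum over all scans.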
  - includeRegionInSplit
```
protected boolean includeRegionInSplit(byte[] startKey,
                           byte[] endKey)
```
    Tests whether the given region should be included in the InputSplit while splitting the regions of a table.
    This optimization is useful when there is a specific reason to exclude an entire region from the M-R job (so that it contributes no InputSplit), given the region's start and end keys.
    For example, a job can remember the last-processed top record and repeatedly revisit only the [last, current) interval for M-R processing. Besides reducing the number of InputSplits, this also reduces the load on the region server, due to the ordering of the keys.
    
    Note: It is possible that endKey.length == 0 for the last (most recent) region.
    Override this method to exclude whole regions from M-R in bulk. By default, no region is excluded (i.e. all regions are included).
    
    Parameters:
    startKey - Start key of the region
    endKey - End key of the region
    
    Returns:
    true, if this region needs to be included as part of the input (default).
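    
    The "last-processed record" use case can be sketched as follows. This is a minimal, JDK-only illustration under stated assumptions: the class `RegionFilterSketch` and the field `lastProcessedKey` are hypothetical, and in a real job the body of `includeRegionInSplit` would sit in an override inside a subclass of this class. A region is kept only if it can still contain keys at or after the last-processed key, and an empty end key is treated as unbounded.

```java
import java.nio.charset.StandardCharsets;

public class RegionFilterSketch {

    // Hypothetical state: the last row key handled by a previous run.
    private final byte[] lastProcessedKey;

    RegionFilterSketch(byte[] lastProcessedKey) {
        this.lastProcessedKey = lastProcessedKey;
    }

    /** Encodes a row key; an empty string stands for an unbounded key. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** Mirrors the signature of the overridable hook; startKey is unused here. */
    boolean includeRegionInSplit(byte[] startKey, byte[] endKey) {
        if (endKey.length == 0) {
            return true; // last region has no upper bound, so it is always kept
        }
        // The end key is exclusive: keep the region only if it ends after the marker.
        return compareUnsigned(endKey, lastProcessedKey) > 0;
    }

    public static void main(String[] args) {
        RegionFilterSketch filter = new RegionFilterSketch(key("m"));
        System.out.println(filter.includeRegionInSplit(key(""), key("g")));  // false
        System.out.println(filter.includeRegionInSplit(key("g"), key("p"))); // true
        System.out.println(filter.includeRegionInSplit(key("p"), key("")));  // true
    }
}
```

    Because the check only looks at region boundaries, it skips whole regions without reading any rows; rows before the marker inside a kept region still have to be filtered by the scan itself.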
  - getScans
```
protected List<Scan> getScans()
```
    Allows subclasses to get the list of Scan objects.
  - setScans
```
protected void setScans(List<Scan> scans)
```
    Allows subclasses to set the list of Scan objects.
    
    Parameters:
    scans - The list of Scan objects used to define the input.
  - setTableRecordReader
```
protected void setTableRecordReader(TableRecordReader tableRecordReader)
```
    Allows subclasses to set the TableRecordReader.
    
    Parameters:
    tableRecordReader - A different TableRecordReader implementation.
