MultiTableInputFormatBase (Apache HBase 2.5.0 API)

java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
- - org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase

Direct Known Subclasses:

MultiTableInputFormat
```
@InterfaceAudience.Public
public abstract class MultiTableInputFormatBase
extends org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
```
A base class for MultiTableInputFormat implementations. It receives a list of Scan instances that define the input tables, filters, and so on. Subclasses may use other TableRecordReader implementations.

Constructor Summary

Constructors
Constructor and Description

MultiTableInputFormatBase()

Method Summary

Modifier and Type	Method and Description
`org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result>`	`createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)` Builds a TableRecordReader.
`protected List<Scan>`	`getScans()` Allows subclasses to get the list of `Scan` objects.
`List<org.apache.hadoop.mapreduce.InputSplit>`	`getSplits(org.apache.hadoop.mapreduce.JobContext context)` Calculates the splits that will serve as input for the map tasks.
`protected boolean`	`includeRegionInSplit(byte[] startKey, byte[] endKey)` Test if the given region is to be included in the InputSplit while splitting the regions of a table.
`protected void`	`setScans(List<Scan> scans)` Allows subclasses to set the list of `Scan` objects.
`protected void`	`setTableRecordReader(TableRecordReader tableRecordReader)` Allows subclasses to set the `TableRecordReader`.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - MultiTableInputFormatBase
```
public MultiTableInputFormatBase()
```
- Method Detail
  - createRecordReader
```
public org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                  org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                           throws IOException,
                                                                                                  InterruptedException
```
    Builds a TableRecordReader. If no TableRecordReader was provided, uses the default.
    
    Specified by:
    
    createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    
    Parameters:
    
    split - The split to work with.
    
    context - The current context.
    
    Returns:
    
    The newly created record reader.
    
    Throws:
    
    IOException - When creating the reader fails.
    
    InterruptedException - When record reader initialization fails.
    
    See Also:
    
    InputFormat.createRecordReader(InputSplit, TaskAttemptContext)
  - getSplits
```
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
                                                       throws IOException
```
    Calculates the splits that will serve as input for the map tasks. For each table being scanned, the number of splits matches the number of regions of that table that are included in the scan.
    
    Specified by:
    
    getSplits in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    
    Parameters:
    
    context - The current job context.
    
    Returns:
    
    The list of input splits.
    
    Throws:
    
    IOException - When creating the list of splits fails.
    
    See Also:
    
    InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)
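
    To make the region-to-split relationship concrete, here is a minimal standalone sketch that counts splits for a set of scans over several tables. It is a simplified model, not the HBase implementation: `SplitCountSketch`, `KeyRange`, and `ScanSpec` are hypothetical names, the overlap rule is an assumption about how a one-split-per-region scheme selects regions, and keys are compared with the JDK's `Arrays.compareUnsigned` instead of any HBase utility.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class SplitCountSketch {

    /** Half-open row-key range [start, end); an empty array means "unbounded". */
    static final class KeyRange {
        final byte[] start;
        final byte[] end;

        KeyRange(String start, String end) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.end = end.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical stand-in for a Scan: a table name plus the row range to read. */
    static final class ScanSpec {
        final String table;
        final KeyRange rows;

        ScanSpec(String table, String startRow, String stopRow) {
            this.table = table;
            this.rows = new KeyRange(startRow, stopRow);
        }
    }

    /** Assumed overlap rule: the scan starts before the region ends and stops after it starts. */
    static boolean overlaps(KeyRange region, KeyRange scan) {
        boolean startsBeforeRegionEnd = scan.start.length == 0
                || region.end.length == 0
                || Arrays.compareUnsigned(scan.start, region.end) < 0;
        boolean stopsAfterRegionStart = scan.end.length == 0
                || Arrays.compareUnsigned(scan.end, region.start) > 0;
        return startsBeforeRegionEnd && stopsAfterRegionStart;
    }

    /** Counts one split per (scan, overlapping region) pair. */
    static int countSplits(Map<String, List<KeyRange>> regionsByTable, List<ScanSpec> scans) {
        int splits = 0;
        for (ScanSpec scan : scans) {
            for (KeyRange region : regionsByTable.getOrDefault(scan.table, List.of())) {
                if (overlaps(region, scan.rows)) {
                    splits++;
                }
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        Map<String, List<KeyRange>> regionsByTable = Map.of(
                "t1", List.of(new KeyRange("", "g"), new KeyRange("g", "t"), new KeyRange("t", "")),
                "t2", List.of(new KeyRange("", "")));
        List<ScanSpec> scans = List.of(
                new ScanSpec("t1", "", ""),    // full scan of t1: 3 regions
                new ScanSpec("t1", "h", "k"),  // falls entirely inside region [g, t): 1 region
                new ScanSpec("t2", "", ""));   // full scan of single-region t2: 1 region
        System.out.println(countSplits(regionsByTable, scans)); // prints 5
    }
}
```

    In this model, two scans over the same table each contribute their own splits, so the total is a sum over scans and not over distinct tables.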
  - includeRegionInSplit
```
protected boolean includeRegionInSplit(byte[] startKey,
                                       byte[] endKey)
```
    Tests whether the given region is to be included in the InputSplit while splitting the regions of a table.
    This optimization is effective when there is a specific reason to exclude an entire region from the M-R job (so that it does not contribute to the InputSplit), given the region's start and end keys.
    It is useful when a job needs to remember the last-processed top record and repeatedly revisit the [last, current) interval for M-R processing. Besides reducing the number of InputSplits, it also reduces the load on the region server, because of the ordering of the keys.
    
    Note: endKey.length == 0 is possible for the last (most recent) region.
    Override this method if you want to bulk-exclude entire regions from M-R. By default, no region is excluded (i.e., all regions are included).
    
    Parameters:
    
    startKey - Start key of the region
    
    endKey - End key of the region
    
    Returns:
    
    true, if this region needs to be included as part of the input (default).
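
    The "last-processed record" use case described above can be sketched as a standalone check on the region's key range. This is an illustration only: `RegionFilterSketch` and `lastProcessedKey` are hypothetical names, the class does not extend any HBase type, and keys are compared with the JDK's `Arrays.compareUnsigned` so that the example runs without HBase on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RegionFilterSketch {

    // Hypothetical checkpoint: the last row key a previous run finished processing.
    private final byte[] lastProcessedKey;

    RegionFilterSketch(byte[] lastProcessedKey) {
        this.lastProcessedKey = lastProcessedKey;
    }

    /**
     * Returns true if the region [startKey, endKey) may still hold rows at or
     * after the checkpoint. An empty endKey marks the last region of the table,
     * which has no upper bound and is therefore always kept.
     */
    boolean includeRegionInSplit(byte[] startKey, byte[] endKey) {
        if (endKey.length == 0) {
            return true; // last region: open-ended, cannot be ruled out
        }
        // Keep the region only if its exclusive end key lies past the checkpoint.
        return Arrays.compareUnsigned(endKey, lastProcessedKey) > 0;
    }

    private static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RegionFilterSketch filter = new RegionFilterSketch(key("m"));
        System.out.println(filter.includeRegionInSplit(key("a"), key("g"))); // false
        System.out.println(filter.includeRegionInSplit(key("g"), key("t"))); // true
        System.out.println(filter.includeRegionInSplit(key("t"), key("")));  // true
    }
}
```

    In an actual subclass, a check along these lines would go in an override of includeRegionInSplit, with the checkpoint loaded from wherever the job keeps it. Note the explicit handling of the empty end key, which would otherwise compare as smaller than every checkpoint and wrongly exclude the last region.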
  - getScans
```
protected List<Scan> getScans()
```
    Allows subclasses to get the list of Scan objects.
  - setScans
```
protected void setScans(List<Scan> scans)
```
    Allows subclasses to set the list of Scan objects.
    
    Parameters:
    
    scans - The list of Scan objects used to define the input.
  - setTableRecordReader
```
protected void setTableRecordReader(TableRecordReader tableRecordReader)
```
    Allows subclasses to set the TableRecordReader.
    
    Parameters:
    
    tableRecordReader - A different TableRecordReader implementation.
