@InterfaceAudience.Public public class TableInputFormat extends TableInputFormatBase implements org.apache.hadoop.conf.Configurable
| Modifier and Type | Field | Description | 
|---|---|---|
| private org.apache.hadoop.conf.Configuration | conf | The configuration. | 
| static String | INPUT_TABLE | Job parameter that specifies the input table. | 
| private static org.slf4j.Logger | LOG |  | 
| static String | SCAN | Base-64 encoded scanner. | 
| static String | SCAN_BATCHSIZE | Sets the maximum number of values to return for each call to next(). | 
| static String | SCAN_CACHEBLOCKS | Set to false to disable server-side caching of blocks for this scan. | 
| static String | SCAN_CACHEDROWS | The number of rows for caching that will be passed to scanners. | 
| static String | SCAN_COLUMN_FAMILY | Column family to scan. | 
| static String | SCAN_COLUMNS | Space-delimited list of columns and column families to scan. | 
| static String | SCAN_MAXVERSIONS | The maximum number of versions to return. | 
| static String | SCAN_ROW_START | Scan start row. | 
| static String | SCAN_ROW_STOP | Scan stop row. | 
| static String | SCAN_TIMERANGE_END | The ending timestamp used to filter columns with a specific range of versions. | 
| static String | SCAN_TIMERANGE_START | The starting timestamp used to filter columns with a specific range of versions. | 
| static String | SCAN_TIMESTAMP | The timestamp used to filter columns with a specific timestamp. | 
| static String | SHUFFLE_MAPS | Specifies whether to shuffle the map tasks. | 
| private static String | SPLIT_TABLE | If specified, use the start keys of this table to split. | 
Fields inherited from class org.apache.hadoop.hbase.mapreduce.TableInputFormatBase: MAPREDUCE_INPUT_AUTOBALANCE, MAX_AVERAGE_REGION_SIZE, NUM_MAPPERS_PER_REGION

| Constructor and Description | 
|---|
| TableInputFormat() | 
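Because TableInputFormat is Configurable, a job can be driven entirely through the string-valued constants listed above, without building a Scan in code. A minimal sketch follows; the table name my_table and family cf are placeholder assumptions, not values from this page.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapreduce.Job;

public class ConfDrivenScanSetup {
  // Builds a Configuration that drives TableInputFormat entirely through
  // the string-valued constants in the field summary above.
  static Configuration buildConf() {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.INPUT_TABLE, "my_table");    // placeholder table name
    conf.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf");   // placeholder family
    conf.set(TableInputFormat.SCAN_CACHEDROWS, "500");     // rows fetched per scanner RPC
    conf.set(TableInputFormat.SCAN_CACHEBLOCKS, "false");  // skip block cache on full scans
    return conf;
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(buildConf(), "conf-driven-scan");
    job.setInputFormatClass(TableInputFormat.class);
    // ... set mapper, output key/value classes, and output format, then submit.
    System.out.println(job.getJobName());
  }
}
```

When the framework instantiates the input format, setConf(Configuration) reads these properties back and builds the Scan for the job.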
| Modifier and Type | Method | Description | 
|---|---|---|
| private static void | addColumn(Scan scan, byte[] familyAndQualifier) | Parses a combined family and qualifier and adds either both or just the family in case there is no qualifier. | 
| static void | addColumns(Scan scan, byte[][] columns) | Adds an array of columns specified using old format, family:qualifier. | 
| private static void | addColumns(Scan scan, String columns) | Convenience method to parse a string representation of an array of column specifiers. | 
| static void | configureSplitTable(org.apache.hadoop.mapreduce.Job job, TableName tableName) | Sets the split table in a map-reduce job. | 
| static Scan | createScanFromConfiguration(org.apache.hadoop.conf.Configuration conf) | Sets up a Scan instance, applying settings from the configuration property constants defined in TableInputFormat. | 
| org.apache.hadoop.conf.Configuration | getConf() | Returns the current configuration. | 
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext context) | Calculates the splits that will serve as input for the map tasks. | 
| protected Pair<byte[][],byte[][]> | getStartEndKeys() |  | 
| protected void | initialize(org.apache.hadoop.mapreduce.JobContext context) | Handles subclass-specific setup. | 
| void | setConf(org.apache.hadoop.conf.Configuration configuration) | Sets the configuration. | 
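The static createScanFromConfiguration(Configuration) helper can also be called directly to inspect the Scan the input format would build from a given set of properties. A sketch; the row keys and version count are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;

public class ScanFromConf {
  // Translates the string-valued scan properties into a configured Scan.
  static Scan build() throws java.io.IOException {
    Configuration conf = HBaseConfiguration.create();
    conf.set(TableInputFormat.SCAN_ROW_START, "row-000"); // illustrative start key
    conf.set(TableInputFormat.SCAN_ROW_STOP, "row-999");  // illustrative stop key
    conf.set(TableInputFormat.SCAN_MAXVERSIONS, "1");
    return TableInputFormat.createScanFromConfiguration(conf);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(build());
  }
}
```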
Methods inherited from class org.apache.hadoop.hbase.mapreduce.TableInputFormatBase: calculateAutoBalancedSplits, closeTable, createNInputSplitsUniform, createRecordReader, createRegionSizeCalculator, getAdmin, getRegionLocator, getScan, getTable, includeRegionInSplit, initializeTable, reverseDNS, setScan, setTableRecordReader

private static final org.slf4j.Logger LOG
public static final String INPUT_TABLE
private static final String SPLIT_TABLE
public static final String SCAN
Base-64 encoded scanner. See TableMapReduceUtil.convertScanToString(Scan) for more details.

public static final String SCAN_ROW_START
public static final String SCAN_ROW_STOP
public static final String SCAN_COLUMN_FAMILY
public static final String SCAN_COLUMNS
public static final String SCAN_TIMESTAMP
public static final String SCAN_TIMERANGE_START
public static final String SCAN_TIMERANGE_END
public static final String SCAN_MAXVERSIONS
public static final String SCAN_CACHEBLOCKS
public static final String SCAN_CACHEDROWS
public static final String SCAN_BATCHSIZE
public static final String SHUFFLE_MAPS
private org.apache.hadoop.conf.Configuration conf
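The SCAN property carries a fully built Scan through the configuration as a Base-64 string, produced by TableMapReduceUtil.convertScanToString(Scan). A sketch, with placeholder family and qualifier names:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;

public class SerializedScanSetup {
  static Configuration buildConf() throws java.io.IOException {
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q")); // placeholder names
    scan.setCaching(500);

    Configuration conf = HBaseConfiguration.create();
    // Serialize the Scan into the SCAN property; TableInputFormat.setConf
    // decodes it when the job runs.
    conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan));
    return conf;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(buildConf().get(TableInputFormat.SCAN));
  }
}
```

This is the same mechanism TableMapReduceUtil.initTableMapperJob uses internally, so in most jobs it does not need to be done by hand.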
public TableInputFormat()
public org.apache.hadoop.conf.Configuration getConf()
Returns the current configuration.
Specified by: getConf in interface org.apache.hadoop.conf.Configurable
See Also: Configurable.getConf()

public void setConf(org.apache.hadoop.conf.Configuration configuration)
Sets the configuration.
Specified by: setConf in interface org.apache.hadoop.conf.Configurable
Parameters: configuration - The configuration to set.
See Also: Configurable.setConf(org.apache.hadoop.conf.Configuration)

public static Scan createScanFromConfiguration(org.apache.hadoop.conf.Configuration conf) throws IOException
Sets up a Scan instance, applying settings from the configuration property constants defined in TableInputFormat. This allows specifying things such as the start and stop rows, column families and qualifiers, timestamps or time ranges, and scanner caching and batch size.
Throws: IOException

protected void initialize(org.apache.hadoop.mapreduce.JobContext context) throws IOException
Handles subclass-specific setup. Each of the entry points used by the MapReduce framework, TableInputFormatBase.createRecordReader(InputSplit, TaskAttemptContext) and TableInputFormatBase.getSplits(JobContext), will call TableInputFormatBase.initialize(JobContext) as a convenient centralized location to handle retrieving the necessary configuration information and calling TableInputFormatBase.initializeTable(Connection, TableName).

Subclasses should implement their initialize call such that it is safe to call multiple times. The current TableInputFormatBase implementation relies on a non-null table reference to decide if an initialize call is needed, but this behavior may change in the future. In particular, it is critical that initializeTable not be called multiple times, since this would leak Connection instances.

Overrides: initialize in class TableInputFormatBase
Throws: IOException

private static void addColumn(Scan scan, byte[] familyAndQualifier)
Parses a combined family and qualifier and adds either both or just the family in case there is no qualifier.
Parameters: scan - The Scan to update. familyAndQualifier - family and qualifier
Throws: IllegalArgumentException - When familyAndQualifier is invalid.

public static void addColumns(Scan scan, byte[][] columns)
Adds an array of columns specified using old format, family:qualifier. Overrides previous calls to Scan.addColumn(byte[], byte[]) for any families in the input.
Parameters: scan - The Scan to update. columns - array of columns, formatted as family:qualifier
See Also: Scan.addColumn(byte[], byte[])

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException
Calculates the splits that will serve as input for the map tasks.
Overrides: getSplits in class TableInputFormatBase
Parameters: context - The current job context.
Throws: IOException - When creating the list of splits fails.
See Also: InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)

private static void addColumns(Scan scan, String columns)
Convenience method to parse a string representation of an array of column specifiers.
Parameters: scan - The Scan to update. columns - The columns to parse.

protected Pair<byte[][],byte[][]> getStartEndKeys() throws IOException
Overrides: getStartEndKeys in class TableInputFormatBase
Throws: IOException

public static void configureSplitTable(org.apache.hadoop.mapreduce.Job job, TableName tableName)
Sets the split table in a map-reduce job.
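The public addColumns(Scan, byte[][]) helper accepts the old family:qualifier byte format described above; an entry without a qualifier adds the whole family. A sketch with made-up family names:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;

public class AddColumnsExample {
  static Scan build() {
    Scan scan = new Scan();
    // "cf1:qual" adds one qualifier; "cf2" (no qualifier) adds the whole family.
    TableInputFormat.addColumns(scan, new byte[][] {
        Bytes.toBytes("cf1:qual"), Bytes.toBytes("cf2") });
    return scan;
  }

  public static void main(String[] args) {
    System.out.println(build().getFamilyMap().keySet().size());
  }
}
```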
Copyright © 2007–2021 The Apache Software Foundation. All rights reserved.