Class TableOutputFormat<KEY>
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
org.apache.hadoop.hbase.mapreduce.TableOutputFormat<KEY>
- All Implemented Interfaces:
org.apache.hadoop.conf.Configurable
@Public
public class TableOutputFormat<KEY>
extends org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
implements org.apache.hadoop.conf.Configurable
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
protected class TableOutputFormat.TableRecordWriter - Writes the reducer output to an HBase table.
-
Field Summary
Fields (Modifier and Type, Field, Description):
private org.apache.hadoop.conf.Configuration conf - The configuration.
private static final org.slf4j.Logger LOG
static final String OUTPUT_CLUSTER - Optional job parameter to specify a peer cluster.
static final String OUTPUT_COMMITTER_CLASS - The configuration key for specifying a custom OutputCommitter implementation to be used by TableOutputFormat.
static final String OUTPUT_CONF_PREFIX - Deprecated. Since 3.0.0, will be removed in 4.0.0.
static final String OUTPUT_TABLE - Job parameter that specifies the output table.
static final String QUORUM_ADDRESS - Deprecated. Since 3.0.0, will be removed in 4.0.0.
static final String QUORUM_PORT - Deprecated. Since 3.0.0, will be removed in 4.0.0.
static final String REGION_SERVER_CLASS - Deprecated. Since 2.5.9, 2.6.1 and 2.7.0, will be removed in 4.0.0.
static final String REGION_SERVER_IMPL - Deprecated. Since 2.5.9, 2.6.1 and 2.7.0, will be removed in 4.0.0.
static final boolean WAL_OFF - Property value to disable write-ahead logging.
static final boolean WAL_ON - Property value to use write-ahead logging.
static final String WAL_PROPERTY - Set this to WAL_OFF to turn off write-ahead logging (WAL).
-
Constructor Summary
Constructors:
TableOutputFormat()
-
Method Summary
Methods (Modifier and Type, Method, Description):
void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context) - Checks if the output table exists and is enabled.
private static Connection createConnection(org.apache.hadoop.conf.Configuration conf)
org.apache.hadoop.conf.Configuration getConf()
org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) - Returns the output committer.
org.apache.hadoop.mapreduce.RecordWriter<KEY,Mutation> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context) - Creates a new record writer.
void setConf(org.apache.hadoop.conf.Configuration otherConf)
-
Field Details
-
LOG
-
OUTPUT_TABLE
Job parameter that specifies the output table.
-
WAL_ON
Property value to use write-ahead logging.
-
WAL_OFF
Property value to disable write-ahead logging.
-
WAL_PROPERTY
Set this to WAL_OFF to turn off write-ahead logging (WAL).
-
OUTPUT_CLUSTER
Optional job parameter to specify a peer cluster. Used to specify the remote cluster when copying between HBase clusters (the source cluster is picked up from hbase-site.xml).
-
OUTPUT_COMMITTER_CLASS
The configuration key for specifying a custom OutputCommitter implementation to be used by TableOutputFormat. The value of this property should be the fully qualified class name of the custom committer. If this property is not set, TableOutputCommitter is used by default.
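The lookup described above (a fully qualified class name with a fallback default) can be sketched with plain JDK reflection. This is only an illustration of the pattern, not the HBase implementation: the Committer interface, the two committer classes, the KEY name, and the Map standing in for the job configuration are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; every name in this class is hypothetical. */
final class CommitterLookupSketch {
    /** Stand-in for an output committer type. */
    interface Committer {}

    /** Stand-in for the default committer used when nothing is configured. */
    static class DefaultCommitter implements Committer {}

    /** Stand-in for a user-supplied committer. */
    static class CustomCommitter implements Committer {}

    // Hypothetical property name, used only in this sketch.
    static final String KEY = "sketch.output.committer.class";

    /** Instantiates the configured committer class, or the default if KEY is unset. */
    static Committer resolve(Map<String, String> conf) {
        String className = conf.getOrDefault(KEY, DefaultCommitter.class.getName());
        try {
            return Class.forName(className)
                .asSubclass(Committer.class)   // rejects classes of the wrong type
                .getDeclaredConstructor()      // requires a no-argument constructor
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create committer " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf).getClass().getSimpleName()); // DefaultCommitter

        conf.put(KEY, CustomCommitter.class.getName());
        System.out.println(resolve(conf).getClass().getSimpleName()); // CustomCommitter
    }
}
```

In a real job the property would be set on the Hadoop Configuration under the key held by OUTPUT_COMMITTER_CLASS, with the custom committer's fully qualified class name as the value.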
-
OUTPUT_CONF_PREFIX
Deprecated. Since 3.0.0, will be removed in 4.0.0. This mechanism is no longer needed: any configuration can be passed in the query string of the connection URI specified by the OUTPUT_CLUSTER parameter.
Prefix for configuration property overrides to apply in setConf(Configuration). For keys matching this prefix, the prefix is stripped and the value is set in the configuration under the resulting key; i.e., the entry "hbase.mapred.output.key1 = value1" would be set in the configuration as "key1 = value1". Use this to set properties that should apply only to the TableOutputFormat configuration and not to the input configuration.
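The prefix-stripping rule can be illustrated with a small, JDK-only sketch. It is not the HBase implementation: a plain Map stands in for the Hadoop Configuration, the class and method names are hypothetical, and the prefix string is inferred from the example entry in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; a Map stands in for the Hadoop Configuration. */
final class PrefixOverrideSketch {
    // Prefix inferred from the example entry "hbase.mapred.output.key1 = value1".
    static final String PREFIX = "hbase.mapred.output.";

    /**
     * Returns a copy of the job configuration in which every entry whose key
     * starts with PREFIX is also stored under the key with the prefix removed.
     * The input map is left unchanged.
     */
    static Map<String, String> applyOverrides(Map<String, String> jobConf) {
        Map<String, String> outputConf = new LinkedHashMap<>(jobConf);
        for (Map.Entry<String, String> entry : jobConf.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                outputConf.put(key.substring(PREFIX.length()), entry.getValue());
            }
        }
        return outputConf;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new LinkedHashMap<>();
        jobConf.put("key1", "inputValue");
        jobConf.put("hbase.mapred.output.key1", "value1");

        Map<String, String> outputConf = applyOverrides(jobConf);
        System.out.println(outputConf.get("key1")); // value1
        System.out.println(jobConf.get("key1"));    // inputValue
    }
}
```

Working on a copy is what keeps the override local to the output side: the input configuration still sees the original value of key1.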
-
QUORUM_ADDRESS
Deprecated. Since 3.0.0, will be removed in 4.0.0. Use OUTPUT_CLUSTER to specify the peer cluster instead.
Optional job parameter to specify a peer cluster. Used to specify the remote cluster when copying between HBase clusters (the source cluster is picked up from hbase-site.xml).
-
QUORUM_PORT
Deprecated. Since 3.0.0, will be removed in 4.0.0. This mechanism is no longer needed: any configuration can be passed in the query string of the connection URI specified by the OUTPUT_CLUSTER parameter.
Optional job parameter to specify the peer cluster's ZooKeeper client port.
-
REGION_SERVER_CLASS
Deprecated. Since 2.5.9, 2.6.1 and 2.7.0, will be removed in 4.0.0. Has had no effect for a long time; see HBASE-6044.
Optional specification of the region server class name of the peer cluster.
-
REGION_SERVER_IMPL
Deprecated. Since 2.5.9, 2.6.1 and 2.7.0, will be removed in 4.0.0. Has had no effect for a long time; see HBASE-6044.
Optional specification of the region server implementation name of the peer cluster.
-
conf
The configuration.
-
-
Constructor Details
-
TableOutputFormat
public TableOutputFormat()
-
-
Method Details
-
createConnection
private static Connection createConnection(org.apache.hadoop.conf.Configuration conf) throws IOException
Throws:
IOException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<KEY,Mutation> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Creates a new record writer. Be aware that the baseline javadoc gives the impression that there is a single RecordWriter per job, but in HBase it is more natural to return a new RecordWriter on each call of this method. You must close the returned RecordWriter when done; failing to do so will drop writes.
Specified by:
getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
Parameters:
context - The current task context.
Returns:
The newly created writer instance.
Throws:
IOException - When creating the writer fails.
InterruptedException - When the job is cancelled.
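Why an unclosed writer loses data can be shown with a minimal, JDK-only model. The sketch assumes the writer buffers records on the client and only sends them when closed; BufferingWriterSketch is a hypothetical stand-in, not an HBase or Hadoop class, and a List plays the role of the output table. Hadoop's actual RecordWriter.close takes a TaskAttemptContext, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a buffering record writer; not an HBase class. */
final class BufferingWriterSketch {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> table; // stands in for the output table

    BufferingWriterSketch(List<String> table) {
        this.table = table;
    }

    /** Buffers the record locally; nothing reaches the table yet. */
    void write(String record) {
        buffer.add(record);
    }

    /** Flushes all buffered records to the table. */
    void close() {
        table.addAll(buffer);
        buffer.clear();
    }

    /** Writes every record and closes the writer even if a write fails. */
    static void writeAll(List<String> table, List<String> records) {
        BufferingWriterSketch writer = new BufferingWriterSketch(table);
        try {
            for (String record : records) {
                writer.write(record);
            }
        } finally {
            writer.close();
        }
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>();
        writeAll(table, List.of("row1", "row2"));
        System.out.println(table); // [row1, row2]

        // Without close(), the buffered record never reaches the table.
        List<String> lossyTable = new ArrayList<>();
        BufferingWriterSketch unclosed = new BufferingWriterSketch(lossyTable);
        unclosed.write("row3");
        System.out.println(lossyTable); // []
    }
}
```

The try/finally shape is the part that carries over to code that calls getRecordWriter directly: each writer obtained from the method needs its own close call.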
-
checkOutputSpecs
public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context) throws IOException, InterruptedException
Checks if the output table exists and is enabled.
Specified by:
checkOutputSpecs in class org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
Parameters:
context - The current context.
Throws:
IOException - When the check fails.
InterruptedException - When the job is aborted.
See Also:
OutputFormat.checkOutputSpecs(JobContext)
-
getOutputCommitter
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Returns the output committer.
Specified by:
getOutputCommitter in class org.apache.hadoop.mapreduce.OutputFormat<KEY,Mutation>
Parameters:
context - The current context.
Returns:
The committer.
Throws:
IOException - When creating the committer fails.
InterruptedException - When the job is aborted.
See Also:
OutputFormat.getOutputCommitter(TaskAttemptContext)
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration otherConf)
Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-