Class MultiTableHFileOutputFormat
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<ImmutableBytesWritable,Cell>
org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat
Creates a three-level directory tree: the first level uses the table name as the parent directory, the second level uses the column family name as a child directory, and all HFiles for a family are stored under that family's directory.
-tableName1
  -columnFamilyName1
  -columnFamilyName2
    -HFiles
-tableName2
  -columnFamilyName1
    -HFiles
  -columnFamilyName2
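As a rough illustration of the layout described above, the following sketch composes the directory that would hold the HFiles of one column family. It uses only the Java standard library; the class name MultiTableLayoutSketch, the helper familyDir, and the output directory are hypothetical and are not part of the HBase API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: mirrors the table -> column family -> HFiles layout.
final class MultiTableLayoutSketch {

    // Hypothetical helper: <outputDir>/<tableName>/<family> is where the
    // HFiles of one column family of one table would be written.
    static Path familyDir(String outputDir, String tableName, String family) {
        return Paths.get(outputDir, tableName, family);
    }

    public static void main(String[] args) {
        String outputDir = "bulkload-output"; // assumed job output directory
        System.out.println(familyDir(outputDir, "tableName1", "columnFamilyName1"));
        System.out.println(familyDir(outputDir, "tableName2", "columnFamilyName2"));
    }
}
```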
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
HFileOutputFormat2.TableInfo, HFileOutputFormat2.WriterLength
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
-
Field Summary
Fields inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
BLOCK_SIZE_FAMILIES_CONF_KEY, blockSizeDetails, BLOOM_PARAM_FAMILIES_CONF_KEY, BLOOM_TYPE_FAMILIES_CONF_KEY, bloomParamDetails, bloomTypeDetails, COMPRESSION_FAMILIES_CONF_KEY, COMPRESSION_OVERRIDE_CONF_KEY, compressionDetails, DATABLOCK_ENCODING_FAMILIES_CONF_KEY, DATABLOCK_ENCODING_OVERRIDE_CONF_KEY, dataBlockEncodingDetails, EXTENDED_CELL_SERIALIZATION_ENABLED_DEFULT, EXTENDED_CELL_SERIALIZATION_ENABLED_KEY, LOCALITY_SENSITIVE_CONF_KEY, MULTI_TABLE_HFILEOUTPUTFORMAT_CONF_KEY, OUTPUT_TABLE_NAME_CONF_KEY, REMOTE_CLUSTER_CONF_PREFIX, REMOTE_CLUSTER_ZOOKEEPER_CLIENT_PORT_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_QUORUM_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_ZNODE_PARENT_CONF_KEY, STORAGE_POLICY_PROPERTY, STORAGE_POLICY_PROPERTY_CF_PREFIX, tableSeparator
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
-
Constructor Summary
-
Method Summary
Modifier and Type    Method    Description
static void configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job, List<HFileOutputFormat2.TableInfo> multiTableDescriptors)
    Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.
static byte[] createCompositeKey(byte[] tableName, byte[] suffix)
    Creates a composite key to use as a mapper output key when using MultiTableHFileOutputFormat.configureIncrementalLoad to set up a bulk ingest job.
static byte[] createCompositeKey(byte[] tableName, ImmutableBytesWritable suffix)
    Alternate API which accepts an ImmutableBytesWritable for the suffix.
static byte[] createCompositeKey(String tableName, ImmutableBytesWritable suffix)
    Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.
protected static byte[] getSuffix(byte[] keyBytes)
protected static byte[] getTableName(byte[] keyBytes)
private static final int validateCompositeKey(byte[] keyBytes)
Methods inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
combineTableNameSuffix, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoadMap, configurePartitioner, configureRemoteCluster, configureStoragePolicy, createFamilyBlockSizeMap, createFamilyBloomParamMap, createFamilyBloomTypeMap, createFamilyCompressionMap, createFamilyDataBlockEncodingMap, createRecordWriter, getRecordWriter, getTableNameSuffixedWithFamily, serializeColumnFamilyAttribute
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
-
Field Details
-
LOG
-
-
Constructor Details
-
MultiTableHFileOutputFormat
public MultiTableHFileOutputFormat()
-
-
Method Details
-
createCompositeKey
Creates a composite key to use as a mapper output key when using MultiTableHFileOutputFormat.configureIncrementalLoad to set up a bulk ingest job.
Parameters:
tableName - Name of the table, e.g. TableName.getNameAsString()
suffix - Usually represents a row key when creating a mapper key, or a column family
Returns:
byte[] representation of the composite key
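To make the idea of a composite key concrete, here is a minimal standalone sketch of how a table name and a suffix could be joined into one byte array and split apart again. It does not call HBase: the class CompositeKeySketch is hypothetical, and the single ';' separator byte is an assumption standing in for the inherited tableSeparator field, whose actual value is not shown on this page. The method names only echo createCompositeKey, validateCompositeKey, getTableName and getSuffix; their real implementations may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: not the HBase implementation.
final class CompositeKeySketch {

    // ASSUMPTION: one ';' byte separates the table name from the suffix.
    static final byte SEPARATOR = (byte) ';';

    // Joins tableName, the separator and the suffix into one byte array.
    static byte[] createCompositeKey(byte[] tableName, byte[] suffix) {
        byte[] key = new byte[tableName.length + 1 + suffix.length];
        System.arraycopy(tableName, 0, key, 0, tableName.length);
        key[tableName.length] = SEPARATOR;
        System.arraycopy(suffix, 0, key, tableName.length + 1, suffix.length);
        return key;
    }

    // Returns the index of the first separator, or throws if there is none.
    static int validateCompositeKey(byte[] keyBytes) {
        for (int i = 0; i < keyBytes.length; i++) {
            if (keyBytes[i] == SEPARATOR) {
                return i;
            }
        }
        throw new IllegalArgumentException("No separator found in composite key");
    }

    // Bytes before the first separator.
    static byte[] getTableName(byte[] keyBytes) {
        return Arrays.copyOfRange(keyBytes, 0, validateCompositeKey(keyBytes));
    }

    // Bytes after the first separator; the suffix may itself contain ';'.
    static byte[] getSuffix(byte[] keyBytes) {
        return Arrays.copyOfRange(keyBytes, validateCompositeKey(keyBytes) + 1, keyBytes.length);
    }

    public static void main(String[] args) {
        byte[] key = createCompositeKey(
            "tableName1".getBytes(StandardCharsets.UTF_8),
            "row-0001".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(getTableName(key), StandardCharsets.UTF_8));
        System.out.println(new String(getSuffix(key), StandardCharsets.UTF_8));
    }
}
```

In this sketch the split happens at the first separator, so the scheme only round-trips when the table name itself does not contain the separator byte.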
-
createCompositeKey
Alternate API which accepts an ImmutableBytesWritable for the suffix.
See Also:
-
createCompositeKey
Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.
See Also:
-
configureIncrementalLoad
public static void configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job, List<HFileOutputFormat2.TableInfo> multiTableDescriptors) throws IOException
Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.
Parameters:
job - See Job
multiTableDescriptors - Table descriptor and region locator pairs
Throws:
IOException
-
validateCompositeKey
-
getTableName
-
getSuffix
-