Class MultiTableHFileOutputFormat
java.lang.Object
  org.apache.hadoop.mapreduce.OutputFormat<K,V>
    org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<ImmutableBytesWritable,Cell>
      org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
        org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat
Creates a three-level directory tree: the first level uses the table name as the parent directory, the second level uses the column family name as the child directory, and all HFiles for a given family are written under that family's directory:

-tableName1
    -columnFamilyName1
    -columnFamilyName2
        -HFiles
-tableName2
    -columnFamilyName1
        -HFiles
    -columnFamilyName2
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  HFileOutputFormat2.TableInfo, HFileOutputFormat2.WriterLength
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
Field Summary
Fields inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  BLOCK_SIZE_FAMILIES_CONF_KEY, blockSizeDetails, BLOOM_PARAM_FAMILIES_CONF_KEY, BLOOM_TYPE_FAMILIES_CONF_KEY, bloomParamDetails, bloomTypeDetails, COMPRESSION_FAMILIES_CONF_KEY, COMPRESSION_OVERRIDE_CONF_KEY, compressionDetails, DATABLOCK_ENCODING_FAMILIES_CONF_KEY, DATABLOCK_ENCODING_OVERRIDE_CONF_KEY, dataBlockEncodingDetails, DISK_BASED_SORTING_ENABLED_KEY, EXTENDED_CELL_SERIALIZATION_ENABLED_DEFULT, EXTENDED_CELL_SERIALIZATION_ENABLED_KEY, LOCALITY_SENSITIVE_CONF_KEY, MULTI_TABLE_HFILEOUTPUTFORMAT_CONF_KEY, OUTPUT_TABLE_NAME_CONF_KEY, REMOTE_CLUSTER_CONF_PREFIX, REMOTE_CLUSTER_ZOOKEEPER_CLIENT_PORT_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_QUORUM_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_ZNODE_PARENT_CONF_KEY, STORAGE_POLICY_PROPERTY, STORAGE_POLICY_PROPERTY_CF_PREFIX, tableSeparator
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
Constructor Summary
Constructors:
  MultiTableHFileOutputFormat()
Method Summary

static void
  configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job, List<HFileOutputFormat2.TableInfo> multiTableDescriptors)
  Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.

static byte[]
  createCompositeKey(byte[] tableName, byte[] suffix)
  Creates a composite key to use as a mapper output key when using configureIncrementalLoad to set up a bulk ingest job.

static byte[]
  createCompositeKey(byte[] tableName, ImmutableBytesWritable suffix)
  Alternate API which accepts an ImmutableBytesWritable for the suffix.

static byte[]
  createCompositeKey(String tableName, ImmutableBytesWritable suffix)
  Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.

protected static byte[]
  getSuffix(byte[] keyBytes)

protected static byte[]
  getTableName(byte[] keyBytes)

private static final int
  validateCompositeKey(byte[] keyBytes)

Methods inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  combineTableNameSuffix, configureForRemoteCluster, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoadMap, configurePartitioner, configureRemoteCluster, configureStoragePolicy, createFamilyBlockSizeMap, createFamilyBloomParamMap, createFamilyBloomTypeMap, createFamilyCompressionMap, createFamilyDataBlockEncodingMap, createRecordWriter, diskBasedSortingEnabled, getRecordWriter, getTableNameSuffixedWithFamily, serializeColumnFamilyAttribute
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
Field Details

LOG
Constructor Details
-
MultiTableHFileOutputFormat
public MultiTableHFileOutputFormat()
-
-
Method Details

createCompositeKey
public static byte[] createCompositeKey(byte[] tableName, byte[] suffix)
Creates a composite key to use as a mapper output key when using configureIncrementalLoad to set up a bulk ingest job.
Parameters:
  tableName - name of the table, e.g. TableName.getNameAsString()
  suffix - usually a rowkey (when creating a mapper key) or a column family
Returns:
  byte[] representation of the composite key
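Conceptually, the composite key is the table name and the suffix joined by a separator (the tableSeparator field inherited from HFileOutputFormat2), so that getTableName and getSuffix can later split the key back apart. The following is a minimal pure-Java sketch of that layout; the ";" separator, UTF-8 encoding, and class name are illustrative assumptions, not the actual HBase internals:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CompositeKeySketch {
    // Illustrative separator; HFileOutputFormat2 exposes the real one as tableSeparator.
    private static final byte[] SEPARATOR = ";".getBytes(StandardCharsets.UTF_8);

    // Mirrors createCompositeKey(byte[], byte[]): tableName + separator + suffix.
    static byte[] createCompositeKey(byte[] tableName, byte[] suffix) {
        byte[] key = new byte[tableName.length + SEPARATOR.length + suffix.length];
        System.arraycopy(tableName, 0, key, 0, tableName.length);
        System.arraycopy(SEPARATOR, 0, key, tableName.length, SEPARATOR.length);
        System.arraycopy(suffix, 0, key, tableName.length + SEPARATOR.length, suffix.length);
        return key;
    }

    // Mirrors getTableName(byte[]): the bytes before the first separator byte.
    static byte[] getTableName(byte[] keyBytes) {
        return Arrays.copyOfRange(keyBytes, 0, indexOfSeparator(keyBytes));
    }

    // Mirrors getSuffix(byte[]): the bytes after the first separator byte.
    static byte[] getSuffix(byte[] keyBytes) {
        int sep = indexOfSeparator(keyBytes);
        return Arrays.copyOfRange(keyBytes, sep + SEPARATOR.length, keyBytes.length);
    }

    // Simplified scan for the first separator byte; the real code also
    // validates the key (see validateCompositeKey).
    private static int indexOfSeparator(byte[] keyBytes) {
        for (int i = 0; i < keyBytes.length; i++) {
            if (keyBytes[i] == SEPARATOR[0]) {
                return i;
            }
        }
        throw new IllegalArgumentException("No separator found in composite key");
    }

    public static void main(String[] args) {
        byte[] key = createCompositeKey("t1".getBytes(StandardCharsets.UTF_8),
                                        "row-0001".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(getTableName(key), StandardCharsets.UTF_8)); // t1
        System.out.println(new String(getSuffix(key), StandardCharsets.UTF_8));    // row-0001
    }
}
```

This sketch assumes the table name itself contains no separator byte; the real implementation guards against malformed keys via validateCompositeKey.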
createCompositeKey
public static byte[] createCompositeKey(byte[] tableName, ImmutableBytesWritable suffix)
Alternate API which accepts an ImmutableBytesWritable for the suffix.
See Also:
  createCompositeKey(byte[], byte[])
createCompositeKey
public static byte[] createCompositeKey(String tableName, ImmutableBytesWritable suffix)
Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.
See Also:
  createCompositeKey(byte[], byte[])
configureIncrementalLoad
public static void configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job,
                                            List<HFileOutputFormat2.TableInfo> multiTableDescriptors)
                                     throws IOException
Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.
Parameters:
  job - see Job
  multiTableDescriptors - table descriptor and region locator pairs
Throws:
  IOException
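A typical caller builds one HFileOutputFormat2.TableInfo per target table and hands the list to this method before submitting the job. The sketch below shows that wiring under stated assumptions: the job name, output path, and the surrounding class are illustrative, and a real job would also set a mapper that emits ImmutableBytesWritable keys built with createCompositeKey.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiTableBulkLoadJob {
    public static Job createJob(String[] tableNames, Path outputDir) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "multi-table-bulk-load");
        job.setOutputFormatClass(MultiTableHFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, outputDir);

        List<HFileOutputFormat2.TableInfo> tableInfos = new ArrayList<>();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            for (String name : tableNames) {
                TableName tableName = TableName.valueOf(name);
                // Pair each table's descriptor with its region locator so the
                // partitioner can derive region boundaries for every table.
                tableInfos.add(new HFileOutputFormat2.TableInfo(
                    admin.getDescriptor(tableName),
                    connection.getRegionLocator(tableName)));
            }
            // Sets the reducer count and total-order partitioner for all
            // tables in one pass; must run while the connection is open.
            MultiTableHFileOutputFormat.configureIncrementalLoad(job, tableInfos);
        }
        return job;
    }
}
```

After the job completes, the HFiles land in the per-table, per-family directory tree described at the top of this page and can be bulk loaded into each table.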
validateCompositeKey
private static final int validateCompositeKey(byte[] keyBytes)

getTableName
protected static byte[] getTableName(byte[] keyBytes)

getSuffix
protected static byte[] getSuffix(byte[] keyBytes)