Class MultiTableHFileOutputFormat
java.lang.Object
  org.apache.hadoop.mapreduce.OutputFormat<K,V>
    org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<ImmutableBytesWritable,Cell>
      org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
        org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat
Creates a three-level directory tree: the first level uses the table name as the parent directory, the second level uses the column family name as the child directory, and all HFiles for a given family are written under that family's directory:

-tableName1
    -columnFamilyName1
    -columnFamilyName2
        -HFiles
-tableName2
    -columnFamilyName1
        -HFiles
    -columnFamilyName2
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  HFileOutputFormat2.TableInfo, HFileOutputFormat2.WriterLength
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
Field Summary
Fields inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  BLOCK_SIZE_FAMILIES_CONF_KEY, blockSizeDetails, BLOOM_PARAM_FAMILIES_CONF_KEY, BLOOM_TYPE_FAMILIES_CONF_KEY, bloomParamDetails, bloomTypeDetails, COMPRESSION_FAMILIES_CONF_KEY, COMPRESSION_OVERRIDE_CONF_KEY, compressionDetails, DATABLOCK_ENCODING_FAMILIES_CONF_KEY, DATABLOCK_ENCODING_OVERRIDE_CONF_KEY, dataBlockEncodingDetails, DISK_BASED_SORTING_ENABLED_KEY, EXTENDED_CELL_SERIALIZATION_ENABLED_DEFULT, EXTENDED_CELL_SERIALIZATION_ENABLED_KEY, LOCALITY_SENSITIVE_CONF_KEY, MULTI_TABLE_HFILEOUTPUTFORMAT_CONF_KEY, OUTPUT_TABLE_NAME_CONF_KEY, REMOTE_CLUSTER_CONF_PREFIX, REMOTE_CLUSTER_ZOOKEEPER_CLIENT_PORT_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_QUORUM_CONF_KEY, REMOTE_CLUSTER_ZOOKEEPER_ZNODE_PARENT_CONF_KEY, STORAGE_POLICY_PROPERTY, STORAGE_POLICY_PROPERTY_CF_PREFIX, tableSeparator
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
Constructor Summary
Constructors:
  MultiTableHFileOutputFormat()
Method Summary

static void
  configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job, List<HFileOutputFormat2.TableInfo> multiTableDescriptors)
  Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.

static byte[]
  createCompositeKey(byte[] tableName, byte[] suffix)
  Creates a composite key to use as a mapper output key when using configureIncrementalLoad to set up a bulk ingest job.

static byte[]
  createCompositeKey(byte[] tableName, ImmutableBytesWritable suffix)
  Alternate API which accepts an ImmutableBytesWritable for the suffix.

static byte[]
  createCompositeKey(String tableName, ImmutableBytesWritable suffix)
  Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.

protected static byte[]
  getSuffix(byte[] keyBytes)

protected static byte[]
  getTableName(byte[] keyBytes)

private static final int
  validateCompositeKey(byte[] keyBytes)

Methods inherited from class org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2:
  combineTableNameSuffix, configureForRemoteCluster, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoad, configureIncrementalLoadMap, configurePartitioner, configureRemoteCluster, configureStoragePolicy, createFamilyBlockSizeMap, createFamilyBloomParamMap, createFamilyBloomTypeMap, createFamilyCompressionMap, createFamilyDataBlockEncodingMap, createRecordWriter, diskBasedSortingEnabled, getRecordWriter, getTableNameSuffixedWithFamily, serializeColumnFamilyAttribute
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat:
  checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
Field Details

LOG
Constructor Details
-
MultiTableHFileOutputFormat
public MultiTableHFileOutputFormat()
-
-
Method Details

createCompositeKey
public static byte[] createCompositeKey(byte[] tableName, byte[] suffix)
Creates a composite key to use as a mapper output key when using configureIncrementalLoad to set up a bulk ingest job.
Parameters:
  tableName - name of the table, e.g. TableName.getNameAsString()
  suffix - usually a rowkey (when creating a mapper key) or a column family
Returns:
  byte[] representation of the composite key
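Conceptually, the composite key is the table name and the suffix joined by a separator (the tableSeparator field inherited from HFileOutputFormat2), so that getTableName and getSuffix can later split the key back apart. The following is a minimal pure-Java sketch of that layout; the ";" separator, UTF-8 encoding, and class name are illustrative assumptions, not the actual HBase internals:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CompositeKeySketch {
    // Illustrative separator; HFileOutputFormat2 exposes the real one as tableSeparator.
    private static final byte[] SEPARATOR = ";".getBytes(StandardCharsets.UTF_8);

    // Mirrors createCompositeKey(byte[], byte[]): tableName + separator + suffix.
    static byte[] createCompositeKey(byte[] tableName, byte[] suffix) {
        byte[] key = new byte[tableName.length + SEPARATOR.length + suffix.length];
        System.arraycopy(tableName, 0, key, 0, tableName.length);
        System.arraycopy(SEPARATOR, 0, key, tableName.length, SEPARATOR.length);
        System.arraycopy(suffix, 0, key, tableName.length + SEPARATOR.length, suffix.length);
        return key;
    }

    // Mirrors getTableName(byte[]): the bytes before the first separator byte.
    static byte[] getTableName(byte[] keyBytes) {
        return Arrays.copyOfRange(keyBytes, 0, indexOfSeparator(keyBytes));
    }

    // Mirrors getSuffix(byte[]): the bytes after the first separator byte.
    static byte[] getSuffix(byte[] keyBytes) {
        int sep = indexOfSeparator(keyBytes);
        return Arrays.copyOfRange(keyBytes, sep + SEPARATOR.length, keyBytes.length);
    }

    // Simplified scan for the first separator byte; the real code also
    // validates the key (see validateCompositeKey).
    private static int indexOfSeparator(byte[] keyBytes) {
        for (int i = 0; i < keyBytes.length; i++) {
            if (keyBytes[i] == SEPARATOR[0]) {
                return i;
            }
        }
        throw new IllegalArgumentException("No separator found in composite key");
    }

    public static void main(String[] args) {
        byte[] key = createCompositeKey("t1".getBytes(StandardCharsets.UTF_8),
                                        "row-0001".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(getTableName(key), StandardCharsets.UTF_8)); // t1
        System.out.println(new String(getSuffix(key), StandardCharsets.UTF_8));    // row-0001
    }
}
```

This sketch assumes the table name itself contains no separator byte; the real implementation guards against malformed keys via validateCompositeKey.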
createCompositeKey
public static byte[] createCompositeKey(byte[] tableName, ImmutableBytesWritable suffix)
Alternate API which accepts an ImmutableBytesWritable for the suffix.
See Also:
  createCompositeKey(byte[], byte[])
createCompositeKey
public static byte[] createCompositeKey(String tableName, ImmutableBytesWritable suffix)
Alternate API which accepts a String for the tableName and an ImmutableBytesWritable for the suffix.
See Also:
  createCompositeKey(byte[], byte[])
configureIncrementalLoad
public static void configureIncrementalLoad(org.apache.hadoop.mapreduce.Job job,
                                            List<HFileOutputFormat2.TableInfo> multiTableDescriptors)
                                     throws IOException
Analogous to HFileOutputFormat2.configureIncrementalLoad(Job, TableDescriptor, RegionLocator), this function configures the requisite number of reducers to write HFiles for multiple tables simultaneously.
Parameters:
  job - see Job
  multiTableDescriptors - table descriptor and region locator pairs
Throws:
  IOException
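A typical caller builds one HFileOutputFormat2.TableInfo per target table and hands the list to this method before submitting the job. The sketch below shows that wiring under stated assumptions: the job name, output path, and the surrounding class are illustrative, and a real job would also set a mapper that emits ImmutableBytesWritable keys built with createCompositeKey.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiTableBulkLoadJob {
    public static Job createJob(String[] tableNames, Path outputDir) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "multi-table-bulk-load");
        job.setOutputFormatClass(MultiTableHFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, outputDir);

        List<HFileOutputFormat2.TableInfo> tableInfos = new ArrayList<>();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            for (String name : tableNames) {
                TableName tableName = TableName.valueOf(name);
                // Pair each table's descriptor with its region locator so the
                // partitioner can derive region boundaries for every table.
                tableInfos.add(new HFileOutputFormat2.TableInfo(
                    admin.getDescriptor(tableName),
                    connection.getRegionLocator(tableName)));
            }
            // Sets the reducer count and total-order partitioner for all
            // tables in one pass; must run while the connection is open.
            MultiTableHFileOutputFormat.configureIncrementalLoad(job, tableInfos);
        }
        return job;
    }
}
```

After the job completes, the HFiles land in the per-table, per-family directory tree described at the top of this page and can be bulk loaded into each table.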
validateCompositeKey
private static final int validateCompositeKey(byte[] keyBytes)

getTableName
protected static byte[] getTableName(byte[] keyBytes)

getSuffix
protected static byte[] getSuffix(byte[] keyBytes)