Class BulkDataGeneratorTool
java.lang.Object
org.apache.hadoop.hbase.util.bulkdatagenerator.BulkDataGeneratorTool
A command-line utility that generates pre-split HBase tables filled with large amounts (TBs) of
random data, distributed equally among all regions.
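
To illustrate what pre-splitting with an even distribution means, the following sketch cuts a numeric key space into ranges of near-equal size. It is not taken from BulkDataGeneratorTool: the class `EvenSplitSketch`, the method `splitPoints`, and the numeric key space are all assumptions made for illustration, and the tool's actual row-key and split scheme may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; this is NOT the key scheme used by BulkDataGeneratorTool. */
final class EvenSplitSketch {
  private EvenSplitSketch() {}

  /**
   * Returns splitCount boundaries that cut the key space [0, keySpace) into
   * splitCount + 1 ranges whose sizes differ by at most one key.
   */
  static List<Long> splitPoints(long keySpace, int splitCount) {
    if (keySpace <= 0 || splitCount < 0 || splitCount >= keySpace) {
      throw new IllegalArgumentException("need 0 <= splitCount < keySpace");
    }
    long regions = splitCount + 1L;
    long base = keySpace / regions;       // minimum size of every range
    long remainder = keySpace % regions;  // the first 'remainder' ranges get one extra key
    List<Long> points = new ArrayList<>(splitCount);
    for (long i = 1; i <= splitCount; i++) {
      points.add(i * base + Math.min(i, remainder));
    }
    return points;
  }
}
```

Under these assumptions, `splitPoints(100, 3)` yields the boundaries 25, 50, and 75, which define four ranges of 25 keys each.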
-
Field Summary
Fields
| Modifier and Type | Field | Description |
|---|---|---|
| private boolean | deleteTableIfExist | Flag to delete the table (before creating it) if it already exists |
| private static final org.slf4j.Logger | logger | |
| private int | mapperCount | Number of mapper containers to launch for generating HFiles |
| private static final String | OUTPUT_DIRECTORY_PREFIX | Prefix for the generated HFiles directory |
| private long | rowsPerMapper | Number of rows to be generated by each mapper |
| private int | splitCount | Number of splits for the table. |
| private String | table | Table for which random data needs to be generated |
| | tableOptions | Additional HBase meta-data options to be set for the table |
-
Constructor Summary
Constructors
| Constructor | Description |
|---|---|
| BulkDataGeneratorTool() | |
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| protected org.apache.hadoop.mapreduce.Job | createSubmittableJob(org.apache.hadoop.conf.Configuration conf) | |
| protected org.apache.hadoop.fs.Path | generateOutputDirectory | Returns a random output directory path where the HFiles will be generated |
| protected org.apache.hbase.thirdparty.org.apache.commons.cli.Options | getOptions | Returns the command line options for BulkDataGeneratorTool |
| static void | main | |
| private void | parseTableOptions(org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line) | |
| protected void | printUsage | |
| protected void | readCommandLineParameters(org.apache.hadoop.conf.Configuration conf, org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line) | Parses the command line parameters into instance variables |
| boolean | run | |
-
Field Details
-
logger
-
OUTPUT_DIRECTORY_PREFIX
Prefix for the generated HFiles directory
-
mapperCount
Number of mapper containers to launch for generating HFiles
-
rowsPerMapper
Number of rows to be generated by each mapper
-
table
Table for which random data needs to be generated
-
splitCount
Number of splits for the table. The table will have (splitCount + 1) regions.
-
deleteTableIfExist
Flag to delete the table (before creating it) if it already exists
-
tableOptions
Additional HBase meta-data options to be set for the table
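
The sizing fields above combine in a simple way. The sketch below computes the region count from splitCount, as stated in the field description, and a total row count as mapperCount × rowsPerMapper. The total is an inference from the field descriptions rather than something this page states, and `GenerationPlanSketch` is a hypothetical helper that does not exist in HBase.

```java
/** Hypothetical helper for illustration; not part of BulkDataGeneratorTool. */
final class GenerationPlanSketch {
  private GenerationPlanSketch() {}

  /** A table created with splitCount split points has splitCount + 1 regions. */
  static int regionCount(int splitCount) {
    if (splitCount < 0) {
      throw new IllegalArgumentException("splitCount must not be negative");
    }
    return Math.addExact(splitCount, 1);
  }

  /** Assumed total: every mapper generates rowsPerMapper rows. */
  static long totalRows(int mapperCount, long rowsPerMapper) {
    if (mapperCount < 0 || rowsPerMapper < 0) {
      throw new IllegalArgumentException("counts must not be negative");
    }
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact((long) mapperCount, rowsPerMapper);
  }
}
```

With splitCount = 9, mapperCount = 10, and rowsPerMapper = 100, this gives 10 regions and, under the stated assumption, 1000 rows in total.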
-
-
Constructor Details
-
BulkDataGeneratorTool
public BulkDataGeneratorTool()
-
-
Method Details
-
main
- Throws:
Exception
-
run
- Throws:
IOException
-
createSubmittableJob
protected org.apache.hadoop.mapreduce.Job createSubmittableJob(org.apache.hadoop.conf.Configuration conf) throws IOException
- Throws:
IOException
-
generateOutputDirectory
Returns a random output directory path where the HFiles will be generated
-
readCommandLineParameters
protected void readCommandLineParameters(org.apache.hadoop.conf.Configuration conf, org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line) throws org.apache.hbase.thirdparty.org.apache.commons.cli.ParseException, IOException
Parses the command line parameters into instance variables
- Throws:
org.apache.hbase.thirdparty.org.apache.commons.cli.ParseException
IOException
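
As a rough picture of what "parsing command line parameters into instance variables" involves, the sketch below fills fields named after the ones documented on this page. It is a stand-alone approximation that does not use Commons CLI. The flag names, the `KEY=VALUE,KEY=VALUE` format for table options, the `Map` type of `tableOptions`, and the class `GeneratorArgsSketch` are all assumptions; the tool's real options are the ones returned by getOptions and are not reproduced here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in; flag names and option format are invented for this sketch. */
final class GeneratorArgsSketch {
  String table;
  int mapperCount;
  long rowsPerMapper;
  int splitCount;
  boolean deleteTableIfExist;
  final Map<String, String> tableOptions = new LinkedHashMap<>();

  static GeneratorArgsSketch parse(String[] args) {
    GeneratorArgsSketch parsed = new GeneratorArgsSketch();
    for (int i = 0; i < args.length; i++) {
      String flag = args[i];
      switch (flag) {
        case "--delete-if-exist":
          parsed.deleteTableIfExist = true; // boolean flag, consumes no value
          break;
        case "--table":
          parsed.table = value(args, ++i, flag);
          break;
        case "--mapper-count":
          parsed.mapperCount = Integer.parseInt(value(args, ++i, flag));
          break;
        case "--rows-per-mapper":
          parsed.rowsPerMapper = Long.parseLong(value(args, ++i, flag));
          break;
        case "--split-count":
          parsed.splitCount = Integer.parseInt(value(args, ++i, flag));
          break;
        case "--table-options":
          // Assumed format: comma-separated KEY=VALUE pairs.
          for (String pair : value(args, ++i, flag).split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2 || kv[0].trim().isEmpty()) {
              throw new IllegalArgumentException("Expected KEY=VALUE but got: " + pair);
            }
            parsed.tableOptions.put(kv[0].trim(), kv[1].trim());
          }
          break;
        default:
          throw new IllegalArgumentException("Unknown option: " + flag);
      }
    }
    if (parsed.table == null) {
      throw new IllegalArgumentException("Missing required option: --table");
    }
    return parsed;
  }

  /** Returns the value following a flag, or fails if the flag is the last argument. */
  private static String value(String[] args, int index, String flag) {
    if (index >= args.length) {
      throw new IllegalArgumentException("Missing value for " + flag);
    }
    return args[index];
  }
}
```

Keeping validation inside the parse step means later stages, such as job creation, can rely on the fields being populated.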
-
parseTableOptions
-
getOptions
Returns the command line options for BulkDataGeneratorTool
-
printUsage
-