Class BulkDataGeneratorTool

java.lang.Object
org.apache.hadoop.hbase.util.bulkdatagenerator.BulkDataGeneratorTool

public class BulkDataGeneratorTool extends Object
A command line utility to generate pre-split HBase tables with large amounts (TBs) of random data, distributed equally among all regions.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private boolean
    deleteTableIfExist
    Flag to delete the table (before creating it) if it already exists
    private static final org.slf4j.Logger
    logger
     
    private int
    mapperCount
    Number of mapper containers to launch for generating HFiles
    private static final String
    OUTPUT_DIRECTORY_PREFIX
    Prefix for the generated HFiles directory
    private long
    rowsPerMapper
    Number of rows to be generated by each mapper
    private int
    splitCount
    Number of splits for the table.
    private String
    table
    Table for which random data needs to be generated
    private final Map<String,String>
    tableOptions
    Additional HBase metadata options to be set for the table
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected org.apache.hadoop.mapreduce.Job
    createSubmittableJob(org.apache.hadoop.conf.Configuration conf)
     
    protected org.apache.hadoop.fs.Path
    generateOutputDirectory()
    Returns a random output directory path where the HFiles will be generated
    protected org.apache.hbase.thirdparty.org.apache.commons.cli.Options
    getOptions()
    Returns the command line options for BulkDataGeneratorTool
    static void
    main(String[] args)
     
    private void
    parseTableOptions(org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line)
     
    protected void
    printUsage()
     
    protected void
    readCommandLineParameters(org.apache.hadoop.conf.Configuration conf, org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line)
    This method parses the command line parameters into instance variables
    boolean
    run(org.apache.hadoop.conf.Configuration conf, String[] args)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • logger

      private static final org.slf4j.Logger logger
    • OUTPUT_DIRECTORY_PREFIX

      private static final String OUTPUT_DIRECTORY_PREFIX
      Prefix for the generated HFiles directory
    • mapperCount

      private int mapperCount
      Number of mapper containers to launch for generating HFiles
    • rowsPerMapper

      private long rowsPerMapper
      Number of rows to be generated by each mapper
    • table

      private String table
      Table for which random data needs to be generated
    • splitCount

      private int splitCount
      Number of splits for the table. Number of regions for the table will be (splitCount + 1).
    • deleteTableIfExist

      private boolean deleteTableIfExist
      Flag to delete the table (before creating) if it already exists
    • tableOptions

      private final Map<String,String> tableOptions
      Additional HBase metadata options to be set for the table
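
      The sizing fields above combine in a simple way, sketched below. The region count follows the documented rule (splitCount + 1). The total row count is an assumption: it presumes every mapper writes exactly rowsPerMapper rows. The class BulkDataSizingSketch and its methods are hypothetical helpers for illustration and are not part of BulkDataGeneratorTool.

```java
public class BulkDataSizingSketch {

    /** Regions for a table pre-split at splitCount points: splitCount + 1. */
    public static int regionCount(int splitCount) {
        return splitCount + 1;
    }

    /**
     * Total rows across the job. Assumption: each mapper writes exactly
     * rowsPerMapper rows. multiplyExact throws on overflow instead of wrapping.
     */
    public static long totalRows(int mapperCount, long rowsPerMapper) {
        return Math.multiplyExact((long) mapperCount, rowsPerMapper);
    }

    public static void main(String[] args) {
        int splitCount = 9;              // illustrative values only
        int mapperCount = 100;
        long rowsPerMapper = 1_000_000L;

        System.out.println("regions = " + regionCount(splitCount));
        System.out.println("total rows = " + totalRows(mapperCount, rowsPerMapper));
    }
}
```

      With the illustrative values shown, the table would have 10 regions and, under the stated assumption, 100,000,000 rows in total.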
  • Constructor Details

  • Method Details

    • main

      public static void main(String[] args) throws Exception
      Throws:
      Exception
    • run

      public boolean run(org.apache.hadoop.conf.Configuration conf, String[] args) throws IOException
      Throws:
      IOException
    • createSubmittableJob

      protected org.apache.hadoop.mapreduce.Job createSubmittableJob(org.apache.hadoop.conf.Configuration conf) throws IOException
      Throws:
      IOException
    • generateOutputDirectory

      protected org.apache.hadoop.fs.Path generateOutputDirectory()
      Returns a random output directory path where the HFiles will be generated
    • readCommandLineParameters

      protected void readCommandLineParameters(org.apache.hadoop.conf.Configuration conf, org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line) throws org.apache.hbase.thirdparty.org.apache.commons.cli.ParseException, IOException
      This method parses the command line parameters into instance variables
      Throws:
      org.apache.hbase.thirdparty.org.apache.commons.cli.ParseException
      IOException
    • parseTableOptions

      private void parseTableOptions(org.apache.hbase.thirdparty.org.apache.commons.cli.CommandLine line)
    • getOptions

      protected org.apache.hbase.thirdparty.org.apache.commons.cli.Options getOptions()
      Returns the command line options for BulkDataGeneratorTool
    • printUsage

      protected void printUsage()