Class HFileInputFormat

java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
org.apache.hadoop.hbase.mapreduce.HFileInputFormat

@Private public class HFileInputFormat extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
Simple MapReduce input format for HFiles. This code was borrowed from the Apache Crunch project and updated for recent versions of HBase.
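As a quick orientation, the sketch below wires HFileInputFormat into a MapReduce driver. The class is audience-private, so treat this as an illustration rather than a supported recipe; the job name, the input-path argument, and CellCountMapper are hypothetical (a mapper sketch appears under createRecordReader below).

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.HFileInputFormat;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class HFileScanDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hfile-scan");
        job.setJarByClass(HFileScanDriver.class);

        // Read raw Cells straight out of the HFiles under the given directory.
        job.setInputFormatClass(HFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));

        // Map-only job; output is discarded here, a real job would pick a real output format.
        job.setMapperClass(CellCountMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }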
  • Nested Class Summary

    Nested Classes

    private static class HFileInputFormat.HFileRecordReader
    Record reader for HFiles.

    Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

    org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
  • Field Summary

    Fields

    (package private) static final org.apache.hadoop.fs.PathFilter HIDDEN_FILE_FILTER
    File filter that removes all "hidden" files.

    private static final org.slf4j.Logger LOG

    Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

    DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
  • Constructor Summary

    Constructors

    HFileInputFormat()
  • Method Summary

    Methods

    private static void addFilesRecursively(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.FileStatus status, List<org.apache.hadoop.fs.FileStatus> result)

    org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,Cell> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)

    protected boolean isSplitable(org.apache.hadoop.mapreduce.JobContext context, org.apache.hadoop.fs.Path filename)

    protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job)

    Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

    addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize, shrinkStatus

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
    • HIDDEN_FILE_FILTER

      static final org.apache.hadoop.fs.PathFilter HIDDEN_FILE_FILTER
      File filter that removes all "hidden" files. This might be something worth removing from a more general purpose utility; it accounts for the presence of metadata files created in the way we're doing exports.
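      For illustration, a filter of this kind typically rejects names that begin with '_' (e.g. _SUCCESS, _logs) or '.' (e.g. .crc side files). The sketch below is a stand-in, not necessarily the exact initializer used by this class.

      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.fs.PathFilter;

      final class HiddenFileFilters {
        // Accept only paths whose final name component does not look "hidden".
        static final PathFilter HIDDEN_FILE_FILTER = new PathFilter() {
          @Override
          public boolean accept(Path p) {
            String name = p.getName();
            return !name.startsWith("_") && !name.startsWith(".");
          }
        };
      }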
  • Constructor Details

    • HFileInputFormat

      public HFileInputFormat()

  • Method Details

    • listStatus

      protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job) throws IOException
      Overrides:
      listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
      Throws:
      IOException
    • addFilesRecursively

      private static void addFilesRecursively(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.FileStatus status, List<org.apache.hadoop.fs.FileStatus> result) throws IOException
      Throws:
      IOException
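      A recursive expansion of this sort usually descends into directories and collects plain files, applying the hidden-file filter along the way. The sketch below takes a FileSystem and PathFilter directly instead of a JobContext, so it is an illustrative stand-in for the private helper, not its actual body.

      import java.io.IOException;
      import java.util.List;
      import org.apache.hadoop.fs.FileStatus;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.PathFilter;

      final class RecursiveHFileLister {
        // Walk a directory tree: recurse into subdirectories (skipping "hidden"
        // entries via the supplied filter) and collect ordinary files.
        static void addFilesRecursively(FileSystem fs, FileStatus status,
            PathFilter hiddenFilter, List<FileStatus> result) throws IOException {
          if (status.isDirectory()) {
            for (FileStatus child : fs.listStatus(status.getPath(), hiddenFilter)) {
              addFilesRecursively(fs, child, hiddenFilter, result);
            }
          } else {
            result.add(status);
          }
        }
      }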
    • createRecordReader

      public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,Cell> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
      Specified by:
      createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.NullWritable,Cell>
      Throws:
      IOException
      InterruptedException
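      The record reader emits NullWritable keys and HBase Cell values, so a consuming mapper declares those as its input types. The CellCountMapper below is a hypothetical example (referenced from the driver sketch above) that tallies cells per column family.

      import java.io.IOException;
      import org.apache.hadoop.hbase.Cell;
      import org.apache.hadoop.hbase.CellUtil;
      import org.apache.hadoop.io.LongWritable;
      import org.apache.hadoop.io.NullWritable;
      import org.apache.hadoop.io.Text;
      import org.apache.hadoop.mapreduce.Mapper;

      public class CellCountMapper extends Mapper<NullWritable, Cell, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text family = new Text();

        @Override
        protected void map(NullWritable key, Cell cell, Context context)
            throws IOException, InterruptedException {
          // Group each cell by its column family and emit a count of one.
          family.set(CellUtil.cloneFamily(cell));
          context.write(family, ONE);
        }
      }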
    • isSplitable

      protected boolean isSplitable(org.apache.hadoop.mapreduce.JobContext context, org.apache.hadoop.fs.Path filename)
      Overrides:
      isSplitable in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>