Class HFileInputFormat
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
org.apache.hadoop.hbase.mapreduce.HFileInputFormat
@Private
public class HFileInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
Simple MapReduce input format for HFiles. This code was borrowed from the Apache Crunch project
and updated to a recent version of HBase.
-
Nested Class Summary
Modifier and Type | Class | Description
private static class | | Record reader for HFiles.
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
-
Field Summary
Modifier and Type | Field | Description
(package private) static final org.apache.hadoop.fs.PathFilter | HIDDEN_FILE_FILTER | File filter that removes all "hidden" files.
private static final org.slf4j.Logger | LOG |
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
private static void | addFilesRecursively(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.FileStatus status, List<org.apache.hadoop.fs.FileStatus> result) |
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,Cell> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) |
protected boolean | isSplitable(org.apache.hadoop.mapreduce.JobContext context, org.apache.hadoop.fs.Path filename) |
protected List<org.apache.hadoop.fs.FileStatus> | listStatus(org.apache.hadoop.mapreduce.JobContext job) |
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize, shrinkStatus
-
Field Details
-
LOG
-
HIDDEN_FILE_FILTER
File filter that removes all "hidden" files. It accounts for the metadata files created by the way exports are performed, so it may be worth removing in a more general-purpose utility.
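As an illustration of what such a filter does, here is a minimal stand-alone sketch. It is not the HBase implementation: the class and method names are hypothetical, and it assumes that "hidden" means a file name starting with `_` or `.`, which is the common Hadoop convention for markers such as `_SUCCESS` and checksum files. It works on plain file names so that it runs without Hadoop on the classpath.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Stand-alone sketch of a hidden-file name filter; not the HBase implementation. */
final class HiddenFileFilterSketch {

    /**
     * Assumption: a file counts as "hidden" when its name starts with '_' or '.',
     * the usual Hadoop convention for markers such as _SUCCESS or .crc files.
     */
    static boolean accept(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    /** Keeps only the names that pass the filter, preserving their order. */
    static List<String> visible(List<String> fileNames) {
        return fileNames.stream()
                .filter(HiddenFileFilterSketch::accept)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("_SUCCESS", ".hfile.crc", "a1b2c3", "_logs");
        System.out.println(visible(names)); // prints [a1b2c3]
    }
}
```

With Hadoop available, the same predicate would typically be written as an `org.apache.hadoop.fs.PathFilter` whose `accept(Path)` method tests `path.getName()`.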
-
-
Constructor Details
-
HFileInputFormat
public HFileInputFormat()
-
-
Method Details
-
listStatus
protected List<org.apache.hadoop.fs.FileStatus> listStatus(org.apache.hadoop.mapreduce.JobContext job) throws IOException
Overrides: listStatus in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
Throws: IOException
-
addFilesRecursively
private static void addFilesRecursively(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.FileStatus status, List<org.apache.hadoop.fs.FileStatus> result) throws IOException
Throws: IOException
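The page gives no description for this method, so the following is only a sketch of the pattern its name and signature suggest: descend into directories, skip hidden entries, and append every regular file to the caller's result list. It uses `java.nio.file` in place of the Hadoop `FileSystem` API so that it runs stand-alone. The class name, the `isVisible` helper, and the `_`/`.` prefix rule are assumptions, not details taken from the HBase source.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Stand-alone sketch of a recursive file listing; not the HBase implementation. */
final class RecursiveListingSketch {

    /** Assumption: entries whose names start with '_' or '.' are hidden. */
    static boolean isVisible(Path p) {
        String name = p.getFileName().toString();
        return !name.startsWith("_") && !name.startsWith(".");
    }

    /** Adds 'current' to 'result' if it is a file; otherwise recurses into its visible children. */
    static void addFilesRecursively(Path current, List<Path> result) throws IOException {
        if (Files.isDirectory(current)) {
            // The stream filter drops hidden children before the recursion sees them.
            try (DirectoryStream<Path> children =
                     Files.newDirectoryStream(current, RecursiveListingSketch::isVisible)) {
                for (Path child : children) {
                    addFilesRecursively(child, result);
                }
            }
        } else {
            result.add(current);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree: <root>/region/cf/hfile-1 plus a hidden marker file.
        Path root = Files.createTempDirectory("hfile-sketch");
        Path family = Files.createDirectories(root.resolve("region").resolve("cf"));
        Files.createFile(family.resolve("hfile-1"));
        Files.createFile(root.resolve("_SUCCESS"));

        List<Path> found = new ArrayList<>();
        addFilesRecursively(root, found);
        System.out.println(found.size()); // prints 1
    }
}
```

Passing the result list as a parameter, as the documented signature does, lets a single list accumulate files across every input directory without merging intermediate lists.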
-
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.NullWritable,Cell> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Specified by: createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.NullWritable,Cell>
Throws: IOException, InterruptedException
-
isSplitable
protected boolean isSplitable(org.apache.hadoop.mapreduce.JobContext context, org.apache.hadoop.fs.Path filename)
Overrides: isSplitable in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.NullWritable,Cell>
-