Class WALInputFormat
Simple InputFormat for WAL files.
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
- (package private) static class: Handler for the non-deprecated WALKey version.
- (package private) static class WALInputFormat.WALRecordReader<K extends WALKey>: RecordReader for a WAL file.
- (package private) static class: InputSplit for WAL files.
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type / Method / Description
- (package private) static void addFile(List<org.apache.hadoop.fs.FileStatus> result, org.apache.hadoop.fs.LocatedFileStatus lfs, long startTime, long endTime)
- org.apache.hadoop.mapreduce.RecordReader<WALKey,WALEdit> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
- private List<org.apache.hadoop.fs.FileStatus> getFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, long startTime, long endTime, org.apache.hadoop.conf.Configuration conf)
- private org.apache.hadoop.fs.Path[] getInputPaths(org.apache.hadoop.conf.Configuration conf)
- List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
- (package private) List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context, String startKey, String endKey): Implementation shared with deprecated HLogInputFormat.
- private static org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> listLocatedFileStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.conf.Configuration conf): Attempts to return the LocatedFileStatus for the given directory.
-
Field Details
-
LOG
-
START_TIME_KEY
-
END_TIME_KEY
-
-
Constructor Details
-
WALInputFormat
public WALInputFormat()
-
-
Method Details
-
getSplits
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException, InterruptedException
Specified by:
getSplits in class org.apache.hadoop.mapreduce.InputFormat<WALKey,WALEdit>
Throws:
IOException
InterruptedException
-
getSplits
List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context, String startKey, String endKey) throws IOException, InterruptedException
Implementation shared with deprecated HLogInputFormat.
Throws:
IOException
InterruptedException
-
getInputPaths
private org.apache.hadoop.fs.Path[] getInputPaths(org.apache.hadoop.conf.Configuration conf)
-
getFiles
private List<org.apache.hadoop.fs.FileStatus> getFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, long startTime, long endTime, org.apache.hadoop.conf.Configuration conf) throws IOException
Parameters:
startTime - If the file name appears to contain a timestamp, the file is kept only if that timestamp is newer than or equal to this value; otherwise the file is filtered out. If the name does not appear to contain a timestamp, the file is returned without filtering.
endTime - If the file name appears to contain a timestamp, the file is kept only if that timestamp is older than or equal to this value; otherwise the file is filtered out. If the name does not appear to contain a timestamp, the file is returned without filtering.
Throws:
IOException
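The startTime and endTime rules above can be sketched in plain Java. This is a minimal illustration rather than the HBase implementation: the class WalTimeFilterSketch, its method names, and the rule that a timestamp is the run of ASCII digits after the last '.' in the file name are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

/** Illustrative sketch only; names and the timestamp rule are hypothetical. */
class WalTimeFilterSketch {

  /**
   * Assumed rule: the digits after the last '.' in the name are a timestamp.
   * Returns empty when the name has no such all-digit suffix.
   */
  static OptionalLong timestampOf(String fileName) {
    int dot = fileName.lastIndexOf('.');
    if (dot < 0 || dot == fileName.length() - 1) {
      return OptionalLong.empty();
    }
    String suffix = fileName.substring(dot + 1);
    for (int i = 0; i < suffix.length(); i++) {
      char c = suffix.charAt(i);
      if (c < '0' || c > '9') {
        return OptionalLong.empty();
      }
    }
    try {
      return OptionalLong.of(Long.parseLong(suffix));
    } catch (NumberFormatException e) {
      return OptionalLong.empty(); // too many digits to fit in a long
    }
  }

  /** Keeps names without a timestamp; otherwise requires startTime <= ts <= endTime. */
  static boolean accept(String fileName, long startTime, long endTime) {
    OptionalLong ts = timestampOf(fileName);
    if (!ts.isPresent()) {
      return true; // no recognizable timestamp: return the file without filtering
    }
    long t = ts.getAsLong();
    return t >= startTime && t <= endTime;
  }

  /** Applies accept() to every name, preserving input order. */
  static List<String> filter(List<String> names, long startTime, long endTime) {
    List<String> kept = new ArrayList<>();
    for (String name : names) {
      if (accept(name, startTime, endTime)) {
        kept.add(name);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("server-a.100", "server-a.200", "server-a.300", "README");
    System.out.println(filter(names, 150L, 250L)); // prints [server-a.200, README]
  }
}
```

Both bounds are inclusive here, matching the "newer or equal" and "older or equal" wording of the parameter descriptions.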
-
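addFile
static void addFile(List<org.apache.hadoop.fs.FileStatus> result, org.apache.hadoop.fs.LocatedFileStatus lfs, long startTime, long endTime)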
-
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<WALKey,WALEdit> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<WALKey,WALEdit>
Throws:
IOException
InterruptedException
-
listLocatedFileStatus
private static org.apache.hadoop.fs.RemoteIterator<org.apache.hadoop.fs.LocatedFileStatus> listLocatedFileStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.conf.Configuration conf) throws IOException
Attempts to return the LocatedFileStatus for the given directory. If the directory does not exist, checks whether the path refers to an archived log file and tries to locate it.
Throws:
IOException
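The fallback described above (list the directory, and if it is missing, look for an archived copy) can be sketched with java.nio.file on a local file system. This is only an illustration of the general pattern, not the HBase code: the class ArchiveFallbackSketch, the archiveRoot parameter, and the assumption that the archived entry keeps the same final path component are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative sketch only; the archive layout is a hypothetical assumption. */
class ArchiveFallbackSketch {

  /**
   * Lists dir if it exists. Otherwise looks under archiveRoot for an entry
   * with the same final name and returns it (or its contents if it is a directory).
   */
  static List<Path> listWithArchiveFallback(Path dir, Path archiveRoot) throws IOException {
    if (Files.isDirectory(dir)) {
      return list(dir);
    }
    Path name = dir.getFileName();
    if (name == null) {
      throw new NoSuchFileException(dir.toString());
    }
    // Assumption: an archived entry keeps the original final path component.
    Path archived = archiveRoot.resolve(name);
    if (Files.isDirectory(archived)) {
      return list(archived);
    }
    if (Files.exists(archived)) {
      return Collections.singletonList(archived);
    }
    throw new NoSuchFileException(dir.toString());
  }

  /** Returns the sorted entries of an existing directory. */
  private static List<Path> list(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.sorted().collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("wal-sketch");
    Path archive = Files.createDirectories(root.resolve("archive"));
    Files.createFile(archive.resolve("wal-1"));
    // "live/wal-1" does not exist, so the lookup falls back to "archive/wal-1".
    List<Path> found = listWithArchiveFallback(root.resolve("live").resolve("wal-1"), archive);
    System.out.println(found.get(0).getFileName()); // prints wal-1
  }
}
```

In this sketch a missing path with no archived counterpart surfaces as NoSuchFileException, so callers can tell "nothing found" apart from an empty directory.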
-