Class WARCOutputFormat

java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.hadoop.io.NullWritable,WARCWritable>
org.apache.hadoop.hbase.test.util.warc.WARCOutputFormat

public class WARCOutputFormat extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.hadoop.io.NullWritable,WARCWritable>
Hadoop OutputFormat for MapReduce jobs (the 'new' `org.apache.hadoop.mapreduce` API) that write data to WARC files.

Usage:

```java
// Job.getInstance replaces the deprecated Job(Configuration) constructor
Job job = Job.getInstance(getConf());
job.setOutputFormatClass(WARCOutputFormat.class);
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(WARCWritable.class);
// Ask FileOutputFormat to compress the job output
FileOutputFormat.setCompressOutput(job, true);
```

The tasks that generate the output (usually the reducers, or the mappers if the job has no reducers) should use `NullWritable.get()` as the output key and a WARCWritable as the output value.
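For orientation on what "WARC files" means at the byte level, the following is a minimal stdlib-only sketch of a single WARC/1.0 record and of wrapping it in its own gzip member. It is not the code of WARCOutputFormat or of the RecordWriter it returns. The class `WarcRecordSketch`, its method names, and the sample URI, record ID, and date are all hypothetical. One gzip member per record is a common convention for compressed WARC files; whether this class writes its output that way is an assumption, not something stated on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration class; not part of HBase or Hadoop. */
public class WarcRecordSketch {

    private static final String CRLF = "\r\n";

    /** Builds one uncompressed WARC/1.0 "resource" record around the payload. */
    public static byte[] buildRecord(String targetUri, String recordId, String date,
                                     String contentType, byte[] payload) throws IOException {
        String header = "WARC/1.0" + CRLF
                + "WARC-Type: resource" + CRLF
                + "WARC-Record-ID: <" + recordId + ">" + CRLF
                + "WARC-Date: " + date + CRLF
                + "WARC-Target-URI: " + targetUri + CRLF
                + "Content-Type: " + contentType + CRLF
                + "Content-Length: " + payload.length + CRLF
                + CRLF; // an empty line ends the header block
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header.getBytes(StandardCharsets.UTF_8));
        out.write(payload);
        // Every record is terminated by two CRLF pairs.
        out.write((CRLF + CRLF).getBytes(StandardCharsets.UTF_8));
        return out.toByteArray();
    }

    /** Compresses one record into a standalone gzip member. */
    public static byte[] gzipMember(byte[] record) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(record);
        }
        return buf.toByteArray();
    }

    /** Decompresses gzip data back into the raw record bytes. */
    public static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        // All argument values below are made-up sample data.
        byte[] record = buildRecord("http://example.com/",
                "urn:uuid:00000000-0000-0000-0000-000000000001",
                "2020-01-01T00:00:00Z", "text/plain", payload);
        byte[] restored = gunzip(gzipMember(record));
        System.out.print(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Running `main` prints the record's header block followed by the payload, which shows the pieces a WARC reader expects: the version line, named header fields, a blank line, the content block, and the terminating blank lines.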
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    private class 
     

    Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
  • Field Summary

    Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    BASE_OUTPUT_NAME, COMPRESS, COMPRESS_CODEC, COMPRESS_TYPE, OUTDIR, PART
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    org.apache.hadoop.mapreduce.RecordWriter<org.apache.hadoop.io.NullWritable,WARCWritable>
    getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
    Creates a new output file in WARC format, and returns a RecordWriter for writing to it.

    Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait