org.apache.hadoop.hbase.mapreduce (Apache HBase 4.0.0-alpha-1-SNAPSHOT API)

package org.apache.hadoop.hbase.mapreduce

Provides HBase MapReduce Input/OutputFormats, a table indexing MapReduce job, and utility methods.

See "HBase and MapReduce" in the HBase Reference Guide for documentation on running MapReduce over HBase.

Related Packages

Package

Description

org.apache.hadoop.hbase

Class

Description

org.apache.hadoop.hbase.mapreduce.CellCounter

A job with a map and reduce phase to count cells in a table.

org.apache.hadoop.hbase.mapreduce.CellCreator

Facade to create Cells for HFileOutputFormat.

org.apache.hadoop.hbase.mapreduce.CellSerialization

org.apache.hadoop.hbase.mapreduce.CellSerialization.CellDeserializer

org.apache.hadoop.hbase.mapreduce.CellSerialization.CellSerializer

org.apache.hadoop.hbase.mapreduce.CellSortReducer

Emits sorted Cells.

org.apache.hadoop.hbase.mapreduce.CopyTable

Tool used to copy a table to another one which can be on a different setup.

org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver

This implementation creates tags by expanding the expression using label ordinals.

org.apache.hadoop.hbase.mapreduce.Driver

Driver for hbase mapreduce jobs.

org.apache.hadoop.hbase.mapreduce.Export

Export an HBase table.

org.apache.hadoop.hbase.mapreduce.ExportUtils

Helper methods used by Export and org.apache.hadoop.hbase.coprocessor.Export (in hbase-endpoint).

org.apache.hadoop.hbase.mapreduce.ExtendedCellSerialization

Similar to CellSerialization, but includes the sequenceId from an ExtendedCell.

org.apache.hadoop.hbase.mapreduce.ExtendedCellSerialization.ExtendedCellDeserializer

org.apache.hadoop.hbase.mapreduce.ExtendedCellSerialization.ExtendedCellSerializer

org.apache.hadoop.hbase.mapreduce.GroupingTableMapper

Extract grouping columns from input record.

org.apache.hadoop.hbase.mapreduce.HashTable

org.apache.hadoop.hbase.mapreduce.HashTable.HashMapper

org.apache.hadoop.hbase.mapreduce.HashTable.TableHash

org.apache.hadoop.hbase.mapreduce.HFileInputFormat

Simple MR input format for HFiles.

org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2

Writes HFiles.

org.apache.hadoop.hbase.mapreduce.HRegionPartitioner<KEY,VALUE>

This is used to partition the output keys into groups of keys.

org.apache.hadoop.hbase.mapreduce.IdentityTableMapper

Pass the given key and record as-is to the reduce phase.

org.apache.hadoop.hbase.mapreduce.IdentityTableReducer

Convenience class that simply writes all values (which must be Put or Delete instances) passed to it out to the configured HBase table.

org.apache.hadoop.hbase.mapreduce.Import

Import data written by Export.

org.apache.hadoop.hbase.mapreduce.Import.CellImporter

A mapper that just writes out KeyValues.

org.apache.hadoop.hbase.mapreduce.Import.CellReducer

org.apache.hadoop.hbase.mapreduce.Import.CellSortImporter

org.apache.hadoop.hbase.mapreduce.Import.CellWritableComparable

org.apache.hadoop.hbase.mapreduce.Import.CellWritableComparable.CellWritableComparator

org.apache.hadoop.hbase.mapreduce.Import.CellWritableComparablePartitioner

org.apache.hadoop.hbase.mapreduce.Import.Importer

Write table content out to files in HDFS.

org.apache.hadoop.hbase.mapreduce.ImportTsv

Tool to import data from a TSV file.

org.apache.hadoop.hbase.mapreduce.ImportTsv.TsvParser

org.apache.hadoop.hbase.mapreduce.ImportTsv.TsvParser.BadTsvLineException

org.apache.hadoop.hbase.mapreduce.IndexBuilder

Example map/reduce job to construct index tables that can be used to quickly find a row based on the value of a column.
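The indexing idea described for IndexBuilder can be illustrated without a cluster: for each row, emit the column value as the index key and the row key as the indexed entry. The sketch below is not the IndexBuilder code. The class ColumnIndexSketch, its method buildIndex, and the use of in-memory maps in place of HBase tables are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: an in-memory stand-in for an index table.
final class ColumnIndexSketch {

  // Inverts (row key -> column value) into (column value -> row keys).
  static Map<String, List<String>> buildIndex(Map<String, String> rowToValue) {
    Map<String, List<String>> index = new TreeMap<>();
    for (Map.Entry<String, String> entry : rowToValue.entrySet()) {
      // The column value becomes the index key; the row key is appended to its entry.
      index.computeIfAbsent(entry.getValue(), v -> new ArrayList<>()).add(entry.getKey());
    }
    return index;
  }

  public static void main(String[] args) {
    Map<String, String> rows = new LinkedHashMap<>();
    rows.put("row1", "blue");
    rows.put("row2", "red");
    rows.put("row3", "blue");
    // Look up all rows whose column value is "blue".
    System.out.println(buildIndex(rows).get("blue"));
  }
}
```

A lookup by value then costs one read of the index instead of a scan over every row.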

org.apache.hadoop.hbase.mapreduce.IndexBuilder.Map

Internal Mapper to be run by Hadoop.

org.apache.hadoop.hbase.mapreduce.JarFinder

Finds the Jar for a class.

org.apache.hadoop.hbase.mapreduce.JobUtil

Utility methods to interact with a job.

org.apache.hadoop.hbase.mapreduce.KeyOnlyCellComparable

org.apache.hadoop.hbase.mapreduce.KeyOnlyCellComparable.KeyOnlyCellComparator

org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat

Creates a three-level directory tree: the table name is the top-level directory, each column family name is a child directory, and all HFiles for a family sit under that family's directory:

-tableName1
    -columnFamilyName1
    -columnFamilyName2
        -HFiles
-tableName2
    -columnFamilyName1
        -HFiles
    -columnFamilyName2

org.apache.hadoop.hbase.mapreduce.MultiTableInputFormat

Convert HBase tabular data from multiple scanners into a format that is consumable by Map/Reduce.

org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase

A base for MultiTableInputFormats.

org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat

Hadoop output format that writes to one or more HBase tables.

org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat.MultiTableRecordWriter

Record writer for outputting to multiple HTables.

org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormat

MultiTableSnapshotInputFormat generalizes TableSnapshotInputFormat allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each.

org.apache.hadoop.hbase.mapreduce.MultiTableSnapshotInputFormatImpl

Shared implementation of mapreduce code over multiple table snapshots.

org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper<K2,V2>

Multithreaded implementation of org.apache.hadoop.hbase.mapreduce.TableMapper.

org.apache.hadoop.hbase.mapreduce.MutationSerialization

org.apache.hadoop.hbase.mapreduce.PreSortedCellsReducer

org.apache.hadoop.hbase.mapreduce.PutCombiner<K>

Combine Puts.

org.apache.hadoop.hbase.mapreduce.PutSortReducer

Emits sorted Puts.

org.apache.hadoop.hbase.mapreduce.RegionSizeCalculator

Computes size of each region for given table and given column families.

org.apache.hadoop.hbase.mapreduce.ResultSerialization

org.apache.hadoop.hbase.mapreduce.RoundRobinTableInputFormat

Process the return from super-class TableInputFormat (TIF) so as to undo any clumping of InputSplits around RegionServers.
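One way to picture the de-clumping described for RoundRobinTableInputFormat is a round-robin interleave: group the splits by host, then take one split from each host in turn. The sketch below is an assumption-laden illustration, not the HBase code. The classes RoundRobinSplitsSketch and Split, and the host and region names, are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: reorders splits so that consecutive
// splits come from different hosts where possible.
final class RoundRobinSplitsSketch {

  // Minimal stand-in for an input split: a region served by a host.
  static final class Split {
    final String host;
    final String region;

    Split(String host, String region) {
      this.host = host;
      this.region = region;
    }

    @Override
    public String toString() {
      return region + "@" + host;
    }
  }

  static List<Split> roundRobin(List<Split> splits) {
    // Group splits by host, keeping hosts in first-seen order.
    Map<String, Deque<Split>> byHost = new LinkedHashMap<>();
    for (Split split : splits) {
      byHost.computeIfAbsent(split.host, h -> new ArrayDeque<>()).add(split);
    }
    // Take one split per host per pass until every queue is drained.
    List<Split> result = new ArrayList<>(splits.size());
    while (result.size() < splits.size()) {
      for (Deque<Split> queue : byHost.values()) {
        Split next = queue.poll();
        if (next != null) {
          result.add(next);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Split> clumped = Arrays.asList(
        new Split("hostA", "r1"), new Split("hostA", "r2"), new Split("hostA", "r3"),
        new Split("hostB", "r4"), new Split("hostC", "r5"));
    System.out.println(roundRobin(clumped));
  }
}
```

With the input in main, the three hostA splits no longer sit at the front of the list, so the first tasks to start are spread over three hosts instead of one.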

org.apache.hadoop.hbase.mapreduce.RowCounter

A job with just a map phase to count rows.

org.apache.hadoop.hbase.mapreduce.SampleUploader

Sample Uploader MapReduce

org.apache.hadoop.hbase.mapreduce.SimpleTotalOrderPartitioner<VALUE>

A partitioner that takes start and end keys and uses BigDecimal to figure out which reducer a key belongs to.
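The BigDecimal approach described for SimpleTotalOrderPartitioner can be sketched as follows: treat the start key, end key, and input key as unsigned numbers, compute how far the key lies between start and end, and scale that fraction by the number of reducers. This is a minimal sketch, not the HBase implementation. The class RangePartitionSketch, its methods, and the zero-padding of keys to a common width are assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.util.Arrays;

// Hypothetical illustration only: maps a key to a reducer by its
// relative position inside the range [start, end].
final class RangePartitionSketch {

  // Reads a key as an unsigned big-endian number, right-padded with zero bytes to `width`.
  static BigDecimal toDecimal(byte[] key, int width) {
    byte[] padded = Arrays.copyOf(key, width);
    return new BigDecimal(new BigInteger(1, padded));
  }

  static int partition(byte[] key, byte[] start, byte[] end, int numReducers) {
    int width = Math.max(key.length, Math.max(start.length, end.length));
    BigDecimal k = toDecimal(key, width);
    BigDecimal lo = toDecimal(start, width);
    BigDecimal hi = toDecimal(end, width);
    // Keys outside the range are clamped to the first or last reducer.
    if (k.compareTo(lo) <= 0) {
      return 0;
    }
    if (k.compareTo(hi) >= 0) {
      return numReducers - 1;
    }
    // Fraction of the range covered by the key, truncated to 20 decimal places.
    BigDecimal fraction = k.subtract(lo).divide(hi.subtract(lo), 20, RoundingMode.DOWN);
    int bucket = fraction.multiply(BigDecimal.valueOf(numReducers)).intValue();
    return Math.min(bucket, numReducers - 1);
  }

  public static void main(String[] args) {
    byte[] start = {0x00};
    byte[] end = {0x64}; // 100
    for (int value : new int[] {10, 25, 50, 75, 99}) {
      int reducer = partition(new byte[] {(byte) value}, start, end, 4);
      System.out.println(value + " -> reducer " + reducer);
    }
  }
}
```

Because the mapping is monotonic in the key, reducer i only ever sees keys that sort before those of reducer i + 1, which is what makes the combined output totally ordered.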

org.apache.hadoop.hbase.mapreduce.SyncTable

org.apache.hadoop.hbase.mapreduce.SyncTable.SyncMapper

org.apache.hadoop.hbase.mapreduce.SyncTable.SyncMapper.Counter

org.apache.hadoop.hbase.mapreduce.TableInputFormat

Convert HBase tabular data into a format that is consumable by Map/Reduce.

org.apache.hadoop.hbase.mapreduce.TableInputFormatBase

A base for TableInputFormats.

org.apache.hadoop.hbase.mapreduce.TableMapper<KEYOUT,VALUEOUT>

Extends the base Mapper class to add the required input key and value classes.

org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil

Utility for TableMapper and TableReducer.

org.apache.hadoop.hbase.mapreduce.TableOutputCommitter

Small committer class that does not do anything.

org.apache.hadoop.hbase.mapreduce.TableOutputFormat<KEY>

Convert Map/Reduce output and write it to an HBase table.

org.apache.hadoop.hbase.mapreduce.TableRecordReader

Iterates over HBase table data and returns (ImmutableBytesWritable, Result) pairs.

org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl

Iterates over HBase table data and returns (ImmutableBytesWritable, Result) pairs.

org.apache.hadoop.hbase.mapreduce.TableReducer<KEYIN,VALUEIN,KEYOUT>

Extends the basic Reducer class to add the required key and value input/output classes.

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat

TableSnapshotInputFormat allows a MapReduce job to run over a table snapshot.

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.TableSnapshotRegionRecordReader

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.TableSnapshotRegionSplit

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl

Hadoop MR API-agnostic implementation for mapreduce over table snapshots.

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl.InputSplit

Implementation class for InputSplit logic common between mapred and mapreduce.

org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl.RecordReader

Implementation class for RecordReader logic common between mapred and mapreduce.

org.apache.hadoop.hbase.mapreduce.TableSplit

A table split corresponds to a key range (low, high) and an optional scanner.
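A key range of the kind a table split covers can be sketched as below. The class KeyRangeSketch is hypothetical and is not the TableSplit API. It assumes an inclusive low key, an exclusive high key, unsigned lexicographic byte ordering, and an empty array meaning "unbounded" on that side. Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: a half-open row key range [low, high).
final class KeyRangeSketch {
  private final byte[] low;  // inclusive; empty means no lower bound (assumption)
  private final byte[] high; // exclusive; empty means no upper bound (assumption)

  KeyRangeSketch(byte[] low, byte[] high) {
    this.low = low.clone();
    this.high = high.clone();
  }

  // True when the row key sorts at or after low and strictly before high.
  boolean contains(byte[] row) {
    boolean atOrAboveLow = low.length == 0 || Arrays.compareUnsigned(row, low) >= 0;
    boolean belowHigh = high.length == 0 || Arrays.compareUnsigned(row, high) < 0;
    return atOrAboveLow && belowHigh;
  }

  static byte[] bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    KeyRangeSketch range = new KeyRangeSketch(bytes("b"), bytes("d"));
    for (String row : new String[] {"a", "b", "bz", "d"}) {
      System.out.println(row + " in range: " + range.contains(bytes(row)));
    }
  }
}
```

Making the high key exclusive lets adjacent ranges share a boundary key without any row belonging to two ranges.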

org.apache.hadoop.hbase.mapreduce.TextSortReducer

Emits sorted KeyValues.

org.apache.hadoop.hbase.mapreduce.TsvImporterMapper

Write table content out to files in HDFS.

org.apache.hadoop.hbase.mapreduce.TsvImporterTextMapper

Write table content out to map output files.

org.apache.hadoop.hbase.mapreduce.VisibilityExpressionResolver

Interface to convert visibility expressions into Tags for storing along with Cells in HFiles.

org.apache.hadoop.hbase.mapreduce.WALInputFormat

Simple InputFormat for WAL files.

org.apache.hadoop.hbase.mapreduce.WALPlayer

A tool to replay WAL files as an M/R job.

org.apache.hadoop.hbase.mapreduce.WALPlayer.Counter

Enum for map metrics.

org.apache.hadoop.hbase.mapreduce.WALPlayer.WALMapper

A mapper that writes out Mutations to be directly applied to a running HBase instance.
