org.apache.hadoop.hbase.mapreduce (Apache HBase 4.0.0-alpha-1-SNAPSHOT API)

package org.apache.hadoop.hbase.mapreduce

Provides HBase MapReduce Input/OutputFormats, a table indexing MapReduce job, and utility methods.

See HBase and MapReduce in the HBase Reference Guide for documentation on running MapReduce over HBase.

Related Packages

Package

Description

org.apache.hadoop.hbase

org.apache.hadoop.hbase.mapreduce.replication

Class

Description

CellCounter

A job with a map and reduce phase to count cells in a table.

CellCounter.CellCounterMapper

Mapper that runs the count.

CellCounter.CellCounterMapper.Counters

Counter enumeration to count the actual rows.

CellCounter.LongSumReducer<Key>

CellCreator

Facade to create Cells for HFileOutputFormat.

CellSerialization

CellSerialization.CellDeserializer

CellSerialization.CellSerializer

CellSortReducer

Emits sorted Cells.

CopyTable

Tool used to copy a table to another one which can be on a different setup.

DefaultVisibilityExpressionResolver

This implementation creates tags by expanding the expression using label ordinals.

Driver

Driver for hbase mapreduce jobs.

Export

Export an HBase table.

ExportUtils

Helper methods used by Export and org.apache.hadoop.hbase.coprocessor.Export (in hbase-endpoint).

ExtendedCellSerialization

Similar to CellSerialization, but includes the sequenceId from an ExtendedCell.

ExtendedCellSerialization.ExtendedCellDeserializer

ExtendedCellSerialization.ExtendedCellSerializer

GroupingTableMapper

Extract grouping columns from input record.

HashTable

HashTable.HashMapper

HashTable.ResultHasher

HashTable.TableHash

HFileInputFormat

Simple MR input format for HFiles.

HFileInputFormat.HFileRecordReader

Record reader for HFiles.

HFileOutputFormat2

Writes HFiles.

HFileOutputFormat2.TableInfo

HFileOutputFormat2.WriterLength

HRegionPartitioner<KEY,VALUE>

This is used to partition the output keys into groups of keys.
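One common way to group output keys, sketched here under the assumption that the groups correspond to contiguous key ranges (such as table regions) identified by sorted start keys, is a binary search for the last start key that is less than or equal to the output key. The class and method names below are hypothetical and use only the JDK; this is not the HRegionPartitioner implementation.

```java
import java.util.Arrays;

// Hypothetical, JDK-only sketch of range-based key grouping.
final class RegionPartitionSketch {

    // Returns the index of the range containing key. startKeys must be sorted
    // ascending (unsigned byte order) and startKeys[0] is assumed to be empty,
    // meaning the first range begins at the start of the key space.
    static int regionIndex(byte[][] startKeys, byte[] key) {
        int lo = 0;
        int hi = startKeys.length - 1;
        int found = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (Arrays.compareUnsigned(startKeys[mid], key) <= 0) {
                found = mid;   // candidate: this range starts at or before key
                lo = mid + 1;  // look for a later range that still qualifies
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    // Folds the range index into the available partitions (an assumption of
    // this sketch: with fewer partitions than ranges, ranges share partitions).
    static int partitionFor(byte[][] startKeys, byte[] key, int numPartitions) {
        return regionIndex(startKeys, key) % numPartitions;
    }
}
```

With start keys "", "g" and "p", a key of "m" falls in the second range; given at least as many partitions as ranges, each range maps to its own partition.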

IdentityTableMapper

Pass the given key and record as-is to the reduce phase.

IdentityTableReducer

Convenience class that simply writes all values (which must be Put or Delete instances) passed to it out to the configured HBase table.

Import

Import data written by Export.

Import.CellImporter

A mapper that just writes out KeyValues.

Import.CellReducer

Import.CellSortImporter

Import.CellWritableComparable

Import.CellWritableComparable.CellWritableComparator

Import.CellWritableComparablePartitioner

Import.Importer

Write table content out to files in HDFS.

ImportTsv

Tool to import data from a TSV file.

ImportTsv.TsvParser

ImportTsv.TsvParser.BadTsvLineException

IndexBuilder

Example map/reduce job to construct index tables that can be used to quickly find a row based on the value of a column.

IndexBuilder.Map

Internal Mapper to be run by Hadoop.

JarFinder

Finds the Jar for a class.

JobUtil

Utility methods to interact with a job.

KeyOnlyCellComparable

KeyOnlyCellComparable.KeyOnlyCellComparator

MultiTableHFileOutputFormat

Creates a 3-level directory tree: the first level uses the table name as the parent directory, the second level uses the family name as the child directory, and all related HFiles for one family are placed under that child directory.
  -tableName1
    -columnFamilyName1
    -columnFamilyName2
      -HFiles
  -tableName2
    -columnFamilyName1
      -HFiles
    -columnFamilyName2

MultiTableInputFormat

Convert HBase tabular data from multiple scanners into a format that is consumable by Map/Reduce.

MultiTableInputFormatBase

A base for MultiTableInputFormats.

MultiTableOutputFormat

Hadoop output format that writes to one or more HBase tables.

MultiTableOutputFormat.MultiTableRecordWriter

Record writer for outputting to multiple HTables.

MultiTableSnapshotInputFormat

MultiTableSnapshotInputFormat generalizes TableSnapshotInputFormat allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each.

MultiTableSnapshotInputFormatImpl

Shared implementation of mapreduce code over multiple table snapshots.

MultithreadedTableMapper<K2,V2>

Multithreaded implementation for org.apache.hadoop.hbase.mapreduce.TableMapper.

MutationSerialization

MutationSerialization.MutationDeserializer

MutationSerialization.MutationSerializer

PreSortedCellsReducer

PutCombiner<K>

Combine Puts.

PutSortReducer

Emits sorted Puts.

RegionSizeCalculator

Computes size of each region for given table and given column families.

ResultSerialization

ResultSerialization.Result94Deserializer

Deserializer used to load files exported from HBase 0.94.

ResultSerialization.ResultDeserializer

ResultSerialization.ResultSerializer

RoundRobinTableInputFormat

Process the return from super-class TableInputFormat (TIF) so as to undo any clumping of InputSplits around RegionServers.
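To illustrate what undoing clumping can look like, the sketch below groups splits by the server hosting them and then takes one split from each server in turn, so consecutive splits hit different servers. The `Split` type and all names are hypothetical stand-ins built on the JDK only; this is an assumption about the general technique, not the RoundRobinTableInputFormat code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, JDK-only sketch of round-robin ordering of splits by server.
final class RoundRobinSplitsSketch {

    // Stand-in for an input split: a label plus the server hosting its data.
    static final class Split {
        final String name;
        final String server;

        Split(String name, String server) {
            this.name = name;
            this.server = server;
        }
    }

    // Reorders splits so that consecutive entries come from different servers
    // whenever possible, preserving each server's internal order.
    static List<Split> roundRobin(List<Split> splits) {
        // One FIFO queue per server, in order of first appearance.
        Map<String, Deque<Split>> byServer = new LinkedHashMap<>();
        for (Split s : splits) {
            byServer.computeIfAbsent(s.server, k -> new ArrayDeque<>()).add(s);
        }
        List<Split> result = new ArrayList<>(splits.size());
        while (result.size() < splits.size()) {
            // One pass takes at most one split from every server.
            for (Deque<Split> queue : byServer.values()) {
                Split next = queue.poll();
                if (next != null) {
                    result.add(next);
                }
            }
        }
        return result;
    }
}
```

Given three splits on server A followed by one each on B and C, the first three entries of the result come from A, B and C rather than all from A.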

RowCounter

A job with just a map phase to count rows.

RowCounter.RowCounterCommandLineParser

RowCounter.RowCounterMapper

Mapper that runs the count.

RowCounter.RowCounterMapper.Counters

Counter enumeration to count the actual rows, cells and delete markers.

SampleUploader

Sample Uploader MapReduce

SampleUploader.Uploader

SimpleTotalOrderPartitioner<VALUE>

A partitioner that takes start and end keys and uses BigDecimal to figure out which reducer a key belongs to.
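The idea of placing a key between a start and an end key with arbitrary-precision arithmetic can be sketched as follows: treat each key as an unsigned big-endian number, compute how far the key lies into the [start, end] range as a fraction, and scale that fraction by the number of reducers. All names here are hypothetical, and padding keys to a common width is an assumption of this sketch rather than a description of SimpleTotalOrderPartitioner itself.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;
import java.util.Arrays;

// Hypothetical, JDK-only sketch of fraction-based range partitioning.
final class RangePartitionSketch {

    // Interprets key as an unsigned big-endian integer, zero-padded on the
    // right to the given width so keys of different lengths are comparable.
    static BigInteger toUnsigned(byte[] key, int width) {
        return new BigInteger(1, Arrays.copyOf(key, width));
    }

    // Maps key to a partition in [0, numPartitions) by its relative position
    // between start and end. Keys outside the range clamp to the ends.
    static int partitionFor(byte[] key, byte[] start, byte[] end, int numPartitions) {
        int width = Math.max(key.length, Math.max(start.length, end.length));
        BigInteger lo = toUnsigned(start, width);
        BigInteger hi = toUnsigned(end, width);
        BigInteger k = toUnsigned(key, width);
        if (k.compareTo(lo) <= 0) {
            return 0;
        }
        if (k.compareTo(hi) >= 0) {
            return numPartitions - 1;
        }
        // Here lo < k < hi, so the divisor is strictly positive.
        BigDecimal fraction = new BigDecimal(k.subtract(lo))
            .divide(new BigDecimal(hi.subtract(lo)), 20, RoundingMode.DOWN);
        int partition = fraction.multiply(BigDecimal.valueOf(numPartitions)).intValue();
        return Math.min(partition, numPartitions - 1);
    }
}
```

For a range from byte 0 to byte 100 split across 4 partitions, a key of byte 50 sits halfway through the range and lands in partition 2 (counting from 0).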

SyncTable

SyncTable.SyncMapper

SyncTable.SyncMapper.CellScanner

SyncTable.SyncMapper.Counter

TableInputFormat

Convert HBase tabular data into a format that is consumable by Map/Reduce.

TableInputFormatBase

A base for TableInputFormats.

TableMapper<KEYOUT,VALUEOUT>

Extends the base Mapper class to add the required input key and value classes.

TableMapReduceUtil

Utility for TableMapper and TableReducer

TableOutputCommitter

Small committer class that does not do anything.

TableOutputFormat<KEY>

Convert Map/Reduce output and write it to an HBase table.

TableRecordReader

Iterates over HBase table data, returning (ImmutableBytesWritable, Result) pairs.

TableRecordReaderImpl

Iterates over HBase table data, returning (ImmutableBytesWritable, Result) pairs.

TableReducer<KEYIN,VALUEIN,KEYOUT>

Extends the basic Reducer class to add the required key and value input/output classes.

TableSnapshotInputFormat

TableSnapshotInputFormat allows a MapReduce job to run over a table snapshot.

TableSnapshotInputFormat.TableSnapshotRegionRecordReader

TableSnapshotInputFormat.TableSnapshotRegionSplit

TableSnapshotInputFormatImpl

Hadoop MR API-agnostic implementation for mapreduce over table snapshots.

TableSnapshotInputFormatImpl.InputSplit

Implementation class for InputSplit logic common between mapred and mapreduce.

TableSnapshotInputFormatImpl.RecordReader

Implementation class for RecordReader logic common between mapred and mapreduce.

TableSplit

A table split corresponds to a key range (low, high) and an optional scanner.

TableSplit.Version

TextSortReducer

Emits sorted KeyValues.

TsvImporterMapper

Write table content out to files in HDFS.

TsvImporterTextMapper

Write table content out to map output files.

VisibilityExpressionResolver

Interface to convert visibility expressions into Tags for storing along with Cells in HFiles.

WALInputFormat

Simple InputFormat for WAL files.

WALInputFormat.WALKeyRecordReader

Handler for the non-deprecated WALKey version.

WALInputFormat.WALRecordReader<K extends WALKey>

RecordReader for a WAL file.

WALInputFormat.WALSplit

InputSplit for WAL files.

WALPlayer

A tool to replay WAL files as a M/R job.

WALPlayer.Counter

Enum for map metrics.

WALPlayer.WALKeyValueMapper

A mapper that just writes out KeyValues.

WALPlayer.WALMapper

A mapper that writes out Mutation to be directly applied to a running HBase instance.
