Class MultithreadedTableMapper<K2,V2>
java.lang.Object
org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,KEYOUT,VALUEOUT>
org.apache.hadoop.hbase.mapreduce.TableMapper<K2,V2>
org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper<K2,V2>
Multithreaded implementation of org.apache.hadoop.hbase.mapreduce.TableMapper.
It can be used in place of a plain TableMapper to improve throughput when the map operation is not CPU bound.
Mapper implementations using this MapRunnable must be thread-safe.
The Map-Reduce job has to be configured with the mapper to use via setMapperClass(org.apache.hadoop.mapreduce.Job, java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.hbase.io.ImmutableBytesWritable, org.apache.hadoop.hbase.client.Result, K2, V2>>)
and with the number of threads the thread pool can use via setNumberOfThreads(org.apache.hadoop.mapreduce.Job, int); getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext) returns the configured value.
The default value is 10 threads.
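The thread-safety requirement can be illustrated without Hadoop or HBase on the classpath. The following is a minimal sketch, not the HBase implementation: `ThreadPoolMapSketch`, `PrefixCountingMapper`, and `runAll` are hypothetical names, plain strings stand in for table rows, and the pool size of 10 simply mirrors the documented default. Several worker threads pull rows from one shared source and call the same map logic concurrently, so that logic keeps its state in concurrent data structures.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a multithreaded map phase; not HBase code. */
final class ThreadPoolMapSketch {

    /** Thread-safe map logic: counts rows by the first character of a non-empty row key. */
    static final class PrefixCountingMapper {
        private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

        void map(String rowKey) {
            String prefix = rowKey.substring(0, 1);
            counts.computeIfAbsent(prefix, k -> new LongAdder()).increment();
        }

        long count(String prefix) {
            LongAdder adder = counts.get(prefix);
            return adder == null ? 0L : adder.sum();
        }
    }

    /** Drains the rows with the given number of workers, all calling one mapper. */
    static PrefixCountingMapper runAll(List<String> rows, int threads) {
        PrefixCountingMapper mapper = new PrefixCountingMapper();
        Iterator<String> source = rows.iterator();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                while (true) {
                    String row;
                    synchronized (source) { // the input is shared, so reads are serialized
                        if (!source.hasNext()) {
                            return;
                        }
                        row = source.next();
                    }
                    mapper.map(row); // may run on several threads at once
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("map workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for map workers", e);
        }
        return mapper;
    }

    public static void main(String[] args) {
        PrefixCountingMapper result = runAll(List.of("a1", "a2", "b1", "a3", "b2"), 10);
        System.out.println("a=" + result.count("a") + " b=" + result.count("b")); // a=3 b=2
    }
}
```

Replacing the `ConcurrentHashMap` and `LongAdder` with a plain `HashMap` and `long` counters would make the sketch racy, which is the kind of defect the thread-safety requirement guards against.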
-
Nested Class Summary
| Modifier and Type | Class | Description |
|---|---|---|
| private class | | |
| private class | | |
| private class | | |
| private class | | |
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
-
Field Summary
| Modifier and Type | Field | Description |
|---|---|---|
| private ExecutorService | executor | |
| private static final org.slf4j.Logger | LOG | |
| private Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> | mapClass | |
| static final String | MAPPER_CLASS | |
| static final String | NUMBER_OF_THREADS | |
| private org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>.Context | outer | |
-
Constructor Summary
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> | getMapperClass(org.apache.hadoop.mapreduce.JobContext job) | Get the application's mapper class. |
| static int | getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job) | The number of threads in the thread pool that will run the map function. |
| void | run(org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>.Context context) | Run the application's maps using a thread pool. |
| static <K2,V2> void | setMapperClass(org.apache.hadoop.mapreduce.Job job, Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls) | Set the application's mapper class. |
| static void | setNumberOfThreads(org.apache.hadoop.mapreduce.Job job, int threads) | Set the number of threads in the pool for running maps. |

Methods inherited from class org.apache.hadoop.mapreduce.Mapper
cleanup, map, setup
-
Field Details
-
LOG
-
mapClass
-
outer
private org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>.Context outer
-
executor
-
NUMBER_OF_THREADS
-
MAPPER_CLASS
-
-
Constructor Details
-
MultithreadedTableMapper
public MultithreadedTableMapper()
-
-
Method Details
-
getNumberOfThreads
public static int getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job)
The number of threads in the thread pool that will run the map function.
Parameters:
job - the job
Returns:
the number of threads
-
setNumberOfThreads
public static void setNumberOfThreads(org.apache.hadoop.mapreduce.Job job, int threads)
Set the number of threads in the pool for running maps.
Parameters:
job - the job to modify
threads - the new number of threads
-
getMapperClass
public static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> getMapperClass(org.apache.hadoop.mapreduce.JobContext job)
Get the application's mapper class.
Type Parameters:
K2 - the map's output key type
V2 - the map's output value type
Parameters:
job - the job
Returns:
the mapper class to run
-
setMapperClass
public static <K2,V2> void setMapperClass(org.apache.hadoop.mapreduce.Job job, Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls)
Set the application's mapper class.
Type Parameters:
K2 - the map output key type
V2 - the map output value type
Parameters:
job - the job to modify
cls - the class to use as the mapper
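The four static accessors above follow a store-then-read pattern against the job. The sketch below models that pattern under stated assumptions and is not the HBase implementation: a plain `Map<String, String>` stands in for the job configuration, `Function<String, String>` stands in for a real Mapper, the key strings are invented for illustration (this page does not show the values of NUMBER_OF_THREADS or MAPPER_CLASS), and recording the class by name is an assumption of the sketch. The only figure taken from this page is the default of 10 threads.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical model of the static accessors; a Map stands in for the job configuration. */
final class JobSettingsSketch {
    // Assumed key names, for illustration only.
    static final String NUMBER_OF_THREADS = "example.multithreaded.threads";
    static final String MAPPER_CLASS = "example.multithreaded.mapclass";
    static final int DEFAULT_THREADS = 10; // default stated in the class description

    /** Example mapper type; Function<String, String> stands in for a real Mapper. */
    static final class UpperCaseMapper implements Function<String, String> {
        @Override
        public String apply(String value) {
            return value.toUpperCase(Locale.ROOT);
        }
    }

    static void setNumberOfThreads(Map<String, String> conf, int threads) {
        conf.put(NUMBER_OF_THREADS, Integer.toString(threads));
    }

    static int getNumberOfThreads(Map<String, String> conf) {
        String raw = conf.get(NUMBER_OF_THREADS);
        return raw == null ? DEFAULT_THREADS : Integer.parseInt(raw);
    }

    static void setMapperClass(Map<String, String> conf,
                               Class<? extends Function<String, String>> cls) {
        conf.put(MAPPER_CLASS, cls.getName()); // record the class by its binary name
    }

    @SuppressWarnings("unchecked")
    static Class<? extends Function<String, String>> getMapperClass(Map<String, String> conf) {
        String name = conf.get(MAPPER_CLASS);
        if (name == null) {
            throw new IllegalStateException("no mapper class configured");
        }
        try {
            Class<?> raw = Class.forName(name);
            if (!Function.class.isAssignableFrom(raw)) {
                throw new IllegalStateException(name + " is not a Function");
            }
            // Unchecked cast: type arguments are erased at runtime.
            return (Class<? extends Function<String, String>>) raw;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("mapper class not found: " + name, e);
        }
    }

    /** Instantiates the configured mapper through its no-argument constructor. */
    static Function<String, String> newMapper(Map<String, String> conf) {
        try {
            return getMapperClass(conf).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate mapper", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(getNumberOfThreads(conf)); // 10, nothing set yet
        setNumberOfThreads(conf, 4);
        setMapperClass(conf, UpperCaseMapper.class);
        System.out.println(getNumberOfThreads(conf)); // 4
        System.out.println(newMapper(conf).apply("row-1")); // ROW-1
    }
}
```

Because the getters only read what the setters stored, both setters need to be called on the job before it is submitted; otherwise the thread count falls back to the default.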
-
run
public void run(org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>.Context context) throws IOException, InterruptedException
Run the application's maps using a thread pool.
Overrides:
run in class org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>
Throws:
IOException
InterruptedException
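Since run executes the map work on pool threads yet declares IOException and InterruptedException itself, failures raised in a worker have to be carried back to the calling thread. The sketch below shows one common way to do that with the standard library; it is an assumption about the general technique, not a description of how this class does it. `PoolRunErrorSketch`, `RecordTask`, and `describeRun` are hypothetical names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of surfacing worker failures from a thread pool; not HBase code. */
final class PoolRunErrorSketch {

    /** Per-record work that may fail with an I/O error. */
    interface RecordTask {
        void process(String record) throws IOException;
    }

    /** Runs every record on a pool and rethrows the first worker failure to the caller. */
    static void run(List<String> records, RecordTask task, int threads)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (String record : records) {
                Callable<Void> job = () -> {
                    task.process(record);
                    return null;
                };
                pending.add(pool.submit(job));
            }
            for (Future<Void> future : pending) {
                try {
                    future.get(); // blocks; throws InterruptedException if the caller is interrupted
                } catch (ExecutionException e) {
                    Throwable cause = e.getCause();
                    if (cause instanceof IOException) {
                        throw (IOException) cause; // preserve the original I/O failure
                    }
                    throw new IOException("map task failed", cause);
                }
            }
        } finally {
            pool.shutdownNow(); // stop remaining workers on success or failure
        }
    }

    /** Convenience wrapper for demos: returns "ok" or a short failure description. */
    static String describeRun(List<String> records, RecordTask task) {
        try {
            run(records, task, 10);
            return "ok";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeRun(List.of("r1", "r2"), record -> { })); // ok
        System.out.println(describeRun(List.of("r1", "bad"), record -> {
            if (record.equals("bad")) {
                throw new IOException("cannot map " + record);
            }
        })); // failed: cannot map bad
    }
}
```

Submitting `Callable` tasks rather than `Runnable` ones is the design choice that matters here: a `Callable` may throw checked exceptions, and `Future.get()` hands them back wrapped in an `ExecutionException`.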
-