public class MultithreadedTableMapper<K2,V2> extends TableMapper<K2,V2>
A multithreaded implementation of TableMapper. It can be used in place of a plain TableMapper to improve throughput when the map operation is not CPU bound.
Mapper implementations run by this class must be thread-safe.
The Map-Reduce job has to be configured with the mapper to use via
setMapperClass(org.apache.hadoop.mapreduce.Job, java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.hbase.io.ImmutableBytesWritable, org.apache.hadoop.hbase.client.Result, K2, V2>>)
and with the number of threads the thread pool can use via the
setNumberOfThreads(org.apache.hadoop.mapreduce.Job, int)
method. The default value is 10 threads.
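
The thread-safety requirement can be illustrated with a minimal, framework-free sketch that uses only the JDK. The class `ThreadSafeMapSketch` and all of its methods are hypothetical and not part of HBase or Hadoop. The sketch assumes that map calls may run concurrently on a fixed pool (10 threads here, matching the default stated above) and shows shared state kept in concurrent structures rather than plain fields.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; not an HBase class.
public class ThreadSafeMapSketch {
    // Shared by every pool thread, so it must be a concurrent structure.
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    // Stand-in for a map() call: counts rows by the first character of the key.
    // Safe to invoke from many threads at once.
    public void map(String rowKey) {
        String prefix = rowKey.substring(0, 1);
        counts.computeIfAbsent(prefix, k -> new LongAdder()).increment();
    }

    public long count(String prefix) {
        LongAdder adder = counts.get(prefix);
        return adder == null ? 0 : adder.sum();
    }

    // Runs map() over all rows on a fixed-size pool and waits for completion.
    public static ThreadSafeMapSketch runAll(String[] rows, int threads) {
        ThreadSafeMapSketch mapper = new ThreadSafeMapSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String row : rows) {
            pool.execute(() -> mapper.map(row));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("map tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return mapper;
    }

    public static void main(String[] args) {
        String[] rows = new String[1000];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = (i % 2 == 0 ? "a" : "b") + i;
        }
        ThreadSafeMapSketch mapper = runAll(rows, 10);
        System.out.println("a=" + mapper.count("a") + " b=" + mapper.count("b"));
    }
}
```

Replacing the `ConcurrentHashMap` and `LongAdder` with a plain `HashMap<String, Long>` would make the counts unreliable under concurrent access, which is the kind of bug the thread-safety requirement is meant to rule out.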
Modifier and Type | Class and Description |
---|---|
private class | MultithreadedTableMapper.MapRunner |
private class | MultithreadedTableMapper.SubMapRecordReader |
private class | MultithreadedTableMapper.SubMapRecordWriter |
private class | MultithreadedTableMapper.SubMapStatusReporter |
Modifier and Type | Field and Description |
---|---|
private ExecutorService | executor |
private static org.apache.commons.logging.Log | LOG |
private Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> | mapClass |
static String | MAPPER_CLASS |
static String | NUMBER_OF_THREADS |
private org.apache.hadoop.mapreduce.Mapper.Context | outer |
Constructor and Description |
---|
MultithreadedTableMapper() |
Modifier and Type | Method and Description |
---|---|
static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> | getMapperClass(org.apache.hadoop.mapreduce.JobContext job) Get the application's mapper class. |
static int | getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job) The number of threads in the thread pool that will run the map function. |
void | run(org.apache.hadoop.mapreduce.Mapper.Context context) Run the application's maps using a thread pool. |
static <K2,V2> void | setMapperClass(org.apache.hadoop.mapreduce.Job job, Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls) Set the application's mapper class. |
static void | setNumberOfThreads(org.apache.hadoop.mapreduce.Job job, int threads) Set the number of threads in the pool for running maps. |
private static final org.apache.commons.logging.Log LOG
private Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> mapClass
private org.apache.hadoop.mapreduce.Mapper.Context outer
private ExecutorService executor
public static final String NUMBER_OF_THREADS
public static final String MAPPER_CLASS
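
The getter and setter pair for the thread count, together with the stated default of 10, can be sketched without any Hadoop dependency. This is a hypothetical illustration: it assumes that `NUMBER_OF_THREADS` acts as a configuration key, uses `java.util.Properties` as a stand-in for the job configuration, and invents the key string `sketch.multithreadedmapper.threads`. None of these details are confirmed by this page.

```java
import java.util.Properties;

// Hypothetical illustration only; not an HBase class.
public class ThreadCountConfigSketch {
    // Invented key name, chosen for this sketch only.
    public static final String NUMBER_OF_THREADS = "sketch.multithreadedmapper.threads";
    // The default stated in the class description.
    public static final int DEFAULT_THREADS = 10;

    // Returns the configured thread count, or the default when the key is unset.
    public static int getNumberOfThreads(Properties conf) {
        String fallback = Integer.toString(DEFAULT_THREADS);
        return Integer.parseInt(conf.getProperty(NUMBER_OF_THREADS, fallback));
    }

    // Stores the thread count under the key.
    public static void setNumberOfThreads(Properties conf, int threads) {
        conf.setProperty(NUMBER_OF_THREADS, Integer.toString(threads));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(getNumberOfThreads(conf)); // prints 10
        setNumberOfThreads(conf, 4);
        System.out.println(getNumberOfThreads(conf)); // prints 4
    }
}
```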
public static int getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job)
job - the job
public static void setNumberOfThreads(org.apache.hadoop.mapreduce.Job job, int threads)
job - the job to modify
threads - the new number of threads
public static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> getMapperClass(org.apache.hadoop.mapreduce.JobContext job)
K2 - the map's output key type
V2 - the map's output value type
job - the job
public static <K2,V2> void setMapperClass(org.apache.hadoop.mapreduce.Job job, Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls)
K2 - the map output key type
V2 - the map output value type
job - the job to modify
cls - the class to use as the mapper
public void run(org.apache.hadoop.mapreduce.Mapper.Context context) throws IOException, InterruptedException
Overrides: run in class org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>
Throws: IOException, InterruptedException
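
The idea of running maps on a thread pool can be sketched with the JDK alone. This is a simplified illustration under stated assumptions, not the HBase implementation: the class `PooledRunSketch` and its methods are hypothetical, input records are assumed to be non-null, and error propagation from worker threads is omitted. Each worker repeatedly pulls the next record from a shared input under a lock, applies the map function, and appends the result to a synchronized output list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical illustration only; not an HBase class.
public class PooledRunSketch {
    // Fetches the next record under a lock so no two workers get the same one.
    // Returns null when the input is exhausted (records must be non-null).
    private static <T> T next(Iterator<T> input) {
        synchronized (input) {
            return input.hasNext() ? input.next() : null;
        }
    }

    // Applies mapFn to every record using a fixed pool of worker threads.
    // The order of the returned results is not deterministic.
    public static <I, O> List<O> run(List<I> records, Function<I, O> mapFn, int threads) {
        Iterator<I> input = records.iterator();
        List<O> output = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            executor.execute(() -> {
                I record;
                while ((record = next(input)) != null) {
                    output.add(mapFn.apply(record));
                }
            });
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> squares = run(Arrays.asList(1, 2, 3, 4, 5), x -> x * x, 3);
        Collections.sort(squares); // sort because completion order varies
        System.out.println(squares); // prints [1, 4, 9, 16, 25]
    }
}
```

Because workers finish in an arbitrary order, callers that need ordered output have to sort or key the results themselves, as `main` does here.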
Copyright © 2007–2019 The Apache Software Foundation. All rights reserved.