hbase-user mailing list archives

From manobal <mano...@gmail.com>
Subject TableMapReduceUtil.initTableMapperJob takes only 1 scan object
Date Thu, 27 Jan 2011 20:01:05 GMT

I am evaluating HBase for some of the data analysis work we are
doing.

HBase would contain our event data. The row key would be eventId + time. We
want to run analysis on a few event types (4-5) within a date range. The total
number of event types is around 1000.

The problem with running a MapReduce job on the HBase table is that
initTableMapperJob (see below) takes only one Scan object. For performance
reasons we want to scan the data for only those 4-5 event types in a given
date range, not for all 1000. With the method below, I guess we don't have
that choice, since a single Scan covers only one contiguous row range.

public static void initTableMapperJob(String table,
        Scan scan,
        Class<? extends TableMapper> mapper,
        Class<? extends org.apache.hadoop.io.WritableComparable> outputKeyClass,
        Class<? extends org.apache.hadoop.io.Writable> outputValueClass,
        org.apache.hadoop.mapreduce.Job job)
    throws IOException

Is it possible to run a MapReduce job over a list of Scan objects? Is there
any workaround?
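
To make the question concrete, here is a rough sketch of the row ranges I
would want to scan, one per event type. It is plain Java with no HBase
dependency. The key layout (a 4-byte big-endian eventId followed by an 8-byte
big-endian timestamp in milliseconds, both non-negative) is only an assumption
for illustration, and the names EventScanRanges, RowRange, rowKey and
rangesFor are made up for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class EventScanRanges {

    /** One [startRow, stopRow) pair; the stop row is exclusive. */
    static final class RowRange {
        final byte[] startRow;
        final byte[] stopRow;

        RowRange(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /**
     * Assumed key layout: 4-byte eventId + 8-byte timestamp, big-endian.
     * For non-negative values this sorts by eventId first, then by time,
     * when the keys are compared byte by byte.
     */
    static byte[] rowKey(int eventId, long timestampMillis) {
        return ByteBuffer.allocate(12)
                .putInt(eventId)
                .putLong(timestampMillis)
                .array();
    }

    /** Builds one contiguous row range per event type for the time window. */
    static List<RowRange> rangesFor(int[] eventIds, long fromMillis, long toMillisExclusive) {
        if (fromMillis >= toMillisExclusive) {
            throw new IllegalArgumentException("empty time window");
        }
        List<RowRange> ranges = new ArrayList<RowRange>();
        for (int eventId : eventIds) {
            ranges.add(new RowRange(
                    rowKey(eventId, fromMillis),
                    rowKey(eventId, toMillisExclusive)));
        }
        return ranges;
    }
}
```

Each pair would become the start and stop row of one Scan. With 4-5 event
types that gives 4-5 separate ranges, which is exactly what a single Scan
argument cannot express.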

Thanks
-- 
View this message in context: http://old.nabble.com/TableMapReduceUtil.initTableMapperJob-takes-only-1-scan-object-tp30778208p30778208.html
Sent from the HBase User mailing list archive at Nabble.com.

