hadoop-general mailing list archives

From "Kilbride, James P." <James.Kilbr...@gd-ais.com>
Subject RE: MapReduce HBASE examples
Date Tue, 06 Jul 2010 17:02:02 GMT
So, if that's the case (and your argument makes sense once I understand how scan versus get works),
I'd have to write a custom InputFormat class that looks like the TableInputFormat class, but
uses a get (or a series of gets) rather than a Scan object, as the current table mapper does?
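
To make the idea concrete, here is a rough sketch of what such an input format would boil down to: cut the list of wanted row keys into chunks, hand one chunk to each map task, and do a point lookup per key. This is a plain-Java model under stated assumptions, not HBase code. A TreeMap stands in for the table, and GetSplitsModel, splitKeys and readSplit are made-up names for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Plain-Java model only; none of these names are HBase classes.
public class GetSplitsModel {
    // Cuts the list of wanted row keys into chunks of at most chunkSize keys,
    // one chunk per map task, in the role an InputFormat's splits would play.
    static List<List<String>> splitKeys(List<String> keys, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> splits = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, keys.size());
            splits.add(new ArrayList<>(keys.subList(start, end)));
        }
        return splits;
    }

    // Looks up each key of one chunk directly (one point lookup per key)
    // and skips keys that are not present in the table.
    static List<String> readSplit(TreeMap<String, String> table, List<String> split) {
        List<String> values = new ArrayList<>();
        for (String key : split) {
            String value = table.get(key);
            if (value != null) {
                values.add(value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("alice", "row-a");
        table.put("bob", "row-b");
        table.put("carol", "row-c");

        List<String> wanted = Arrays.asList("alice", "carol", "dave");
        for (List<String> split : splitKeys(wanted, 2)) {
            System.out.println(split + " -> " + readSplit(table, split));
        }
    }
}
```

The work done here grows with the number of requested keys, not with the size of the table, which is the whole point of using gets instead of a scan.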


James Kilbride

-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Tuesday, July 06, 2010 12:53 PM
To: general@hadoop.apache.org
Subject: Re: MapReduce HBASE examples

>
>
> Does this make any sense?
>
>
Not in a MapReduce context. What you want to do is a LIKE with a bunch of
values, right? Since a mapper will always read all the input that it's given
(minus some filters, like you can do with HBase), whatever you do will always
end up being a full table scan. You "could" solve your problem by
configuring your Scan object with a RowFilter that knows about the names you
are looking for, but that still ends up being a full scan on the region
server side, so it will be slow and will generate a lot of IO.
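
A small illustration of why the filter does not save any reads: the filter is checked against every row as the scan walks the table, so the number of rows read equals the table size no matter how few names match. This is a plain-Java model under stated assumptions, not the HBase API. A TreeMap stands in for the table, and FilteredScanModel, filteredScan and rowsRead are made-up names for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java model only; none of these names are HBase classes.
public class FilteredScanModel {
    // Number of rows examined by the most recent call to filteredScan.
    static int rowsRead = 0;

    // Walks every row in key order and keeps the values whose row key is wanted,
    // the way a row filter is evaluated against each row a scan visits.
    static List<String> filteredScan(TreeMap<String, String> table, Set<String> wanted) {
        rowsRead = 0;
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            rowsRead++;
            if (wanted.contains(row.getKey())) {
                hits.add(row.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            table.put(String.format("row%04d", i), "value" + i);
        }
        Set<String> wanted = new HashSet<>(Arrays.asList("row0007", "row0420"));
        List<String> hits = filteredScan(table, wanted);
        // prints "2 matches, 1000 rows read"
        System.out.println(hits.size() + " matches, " + rowsRead + " rows read");
    }
}
```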

WRT examples, HBase ships with a couple of utility classes that can also be
used as examples. The Export class has the Scan configuration stuff:
http://github.com/apache/hbase/blob/0.20/src/java/org/apache/hadoop/hbase/mapreduce/Export.java

J-D
