hbase-user mailing list archives

From Dan Washusen <...@reactive.org>
Subject Re: how to calculate top-xxx rowkeys
Date Mon, 15 Feb 2010 03:58:30 GMT
The javadoc
<http://hadoop.apache.org/hbase/docs/r0.20.3/api/org/apache/hadoop/hbase/mapreduce/package-summary.html>
includes a fair amount of information.  There are also some tests
<http://svn.apache.org/repos/asf/hadoop/hbase/branches/0.20/src/test/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java>
in the HBase codebase...

If you haven't tried map reduce before then I'd suggest starting at:
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
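
For what it's worth, here is a rough sketch of the counting logic such a job
would need to perform. It is plain Java rather than the HBase MapReduce API,
so it only illustrates the "extract user id, count, keep the top N" steps.
The rowkey layout is an assumption (the thread only says "userid + timestamp"):
I've assumed the timestamp is a fixed-width 13-digit suffix. The class and
method names (TopUsers, userIdOf, countByUser, topN) are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class TopUsers {
    // ASSUMPTION: the timestamp is a fixed-width suffix of the rowkey
    // (13 digits, e.g. epoch milliseconds). Adjust to the real layout.
    static final int TIMESTAMP_WIDTH = 13;

    // "Map" step: strip the timestamp suffix to recover the user id.
    static String userIdOf(String rowKey) {
        return rowKey.substring(0, rowKey.length() - TIMESTAMP_WIDTH);
    }

    // "Reduce" step: count how many rows each user has.
    static Map<String, Long> countByUser(Iterable<String> rowKeys) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : rowKeys) {
            counts.merge(userIdOf(key), 1L, Long::sum);
        }
        return counts;
    }

    // Final step: keep the n largest counts with a bounded min-heap,
    // then return them sorted from highest count to lowest.
    static List<Map.Entry<String, Long>> topN(Map<String, Long> counts, int n) {
        PriorityQueue<Map.Entry<String, Long>> heap =
            new PriorityQueue<>((a, b) -> Long.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > n) {
                heap.poll(); // drop the current smallest count
            }
        }
        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return result;
    }
}
```

In a real job the per-user counts would come out of the reducers rather than
one in-memory map, and only the final top-N selection would need to see all
of the (user, count) pairs.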

Cheers,
Dan

On 15 February 2010 14:44, Sujee Maniyam <sujee@sujee.net> wrote:

> A few hundred million rows for now, and will be more in the future.
>
> map-reduce proposal sounds very interesting.  Any pointers on running
> MR jobs on data stored in Hbase?
>
> thanks very much
> sujee
>
>
>
> On Sun, Feb 14, 2010 at 2:29 PM, Dan Washusen <dan@reactive.org> wrote:
> > Hi Sujee,
> > How much data do you have in your table?  Keeping a count in memory has its
> > obvious problems but if it's a small table then I guess it would work...
> >
> > How fast do you need to get this information?  Maybe a map reduce job would
> > be a better way of doing it?
> >
> > Cheers,
> > Dan
> >
> >
> > On 14 February 2010 19:56, Sujee Maniyam <sujee@sujee.net> wrote:
> >
> >> Hi
> >>
> >> I have a table whose rowkey is composed of userid + timestamp. I need
> >> to figure out 'top-100' users.
> >>
> >> One approach is running a scanner and keeping a hashmap of user-count in
> >> memory.
> >>
> >> Wondering if there is an hbase-trick I could use?
> >>
> >> thanks
> >> Sujee
> >>
> >
>
