Date: Mon, 15 Feb 2010 14:58:30 +1100
Subject: Re: how to calculate top-xxx rowkeys
From: Dan Washusen
To: hbase-user@hadoop.apache.org

The javadoc includes a fair amount of information. There are also some tests
in the HBase codebase...

If you haven't tried map reduce before then I'd suggest starting at:
http://hadoop.apache.org/common/docs/current/mapred_tutorial.html

Cheers,
Dan

On 15 February 2010 14:44, Sujee Maniyam wrote:

> A few hundred million rows for now, and will be more in the future.
>
> map-reduce proposal sounds very interesting. Any pointers on running
> MR jobs on data stored in HBase?
>
> thanks very much
> sujee
>
> On Sun, Feb 14, 2010 at 2:29 PM, Dan Washusen wrote:
> > Hi Sujee,
> > How much data do you have in your table? Keeping a count in memory has
> > its obvious problems but if it's a small table then I guess it would
> > work...
> >
> > How fast do you need to get this information? Maybe a map reduce job
> > would be a better way of doing it?
> >
> > Cheers,
> > Dan
> >
> > On 14 February 2010 19:56, Sujee Maniyam wrote:
> >
> >> Hi
> >>
> >> I have a table whose rowkey is composed of userid + timestamp. I need
> >> to figure out the 'top-100' users.
> >>
> >> One approach is running a scanner and keeping a hashmap of user-count
> >> in memory.
> >>
> >> Wondering if there is an hbase-trick I could use?
> >>
> >> thanks
> >> Sujee
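
For reference, here is a minimal sketch of the scanner-plus-hashmap approach described in the original question. It is written in Python for brevity and does not use any HBase client API: the `rowkeys` iterable is a stand-in for whatever a table scan returns. The thread only says the rowkey is "userid + timestamp", so the fixed-width 13-digit timestamp suffix (`TS_WIDTH`) and the helper names `user_of` and `top_users` are assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Assumption: the timestamp is a fixed-width suffix (13 digits, epoch millis).
# The thread does not specify the rowkey layout, so adjust this as needed.
TS_WIDTH = 13


def user_of(rowkey: str) -> str:
    """Strip the assumed fixed-width timestamp suffix to recover the user id."""
    return rowkey[:-TS_WIDTH]


def top_users(
    rowkeys: Iterable[str],
    n: int = 100,
    extract: Callable[[str], str] = user_of,
) -> List[Tuple[str, int]]:
    """Count rows per user in one pass and return the n most frequent users."""
    counts: Counter = Counter()
    for key in rowkeys:
        # One counter entry per distinct user, not per row.
        counts[extract(key)] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    keys = ["alice1266206310446", "bob1266206310447", "alice1266206310448"]
    print(top_users(keys, n=1))  # [('alice', 2)]
```

Memory use grows with the number of distinct users rather than the number of rows, which is the limitation raised earlier in the thread. A map reduce job would follow the same shape, with the mapper emitting one count per user id and the reducer summing them, but that is not shown here.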