hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: HBase aggregate query
Date Tue, 11 Sep 2012 17:59:16 GMT
That's when you aggregate along a sorted dimension (prefix of the key), though. Right?
Not sure how smart Hive is here, but if it needs to sort the data it will probably be slower
than SQL Server for such a small data set.

----- Original Message -----
From: James Taylor <jtaylor@salesforce.com>
To: user@hbase.apache.org
Sent: Monday, September 10, 2012 5:49 PM
Subject: Re: HBase aggregate query

iwannaplay games <funnlearnforkids@...> writes:
> Hi,
> I want to run a query like
> select month(eventdate), scene, count(1), sum(timespent) from eventlog
> group by month(eventdate), scene
> in HBase. Through Hive it is taking a lot of time for 40 million
> records. Does HBase have any syntax to get this result? In SQL
> Server it takes around 9 minutes. How long might it take in HBase?
> Regards
> Prabhjot

In our internal testing using server-side coprocessors for aggregation, we've
found HBase can process these types of queries very quickly: ~10-12 seconds
using a four node cluster. You need to chunk up and parallelize the work on the
client side to get this kind of performance, though.
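
To make the "chunk up and parallelize" idea concrete, here is a rough sketch in Python. It is not HBase code and not the implementation used in the testing mentioned above: the table is a hypothetical in-memory list standing in for eventlog, and each call to aggregate_chunk stands in for whatever server-side aggregation would run over one key range. The names ROWS, split_ranges, aggregate_chunk, merge and parallel_group_by are all made up for illustration. The point is only the shape of the work: split the key space, compute partial (count, sum) per group in parallel, then merge the partials on the client.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the eventlog table, sorted by row key:
# (row_key, month, scene, timespent)
ROWS = [
    ("r01", 1, "intro", 30),
    ("r02", 1, "intro", 20),
    ("r03", 1, "battle", 45),
    ("r04", 2, "intro", 10),
    ("r05", 2, "battle", 60),
    ("r06", 2, "battle", 15),
]


def split_ranges(n_rows, n_chunks):
    """Split [0, n_rows) into contiguous index ranges of roughly equal size."""
    step = max(1, -(-n_rows // n_chunks))  # ceiling division, at least 1
    return [(start, min(start + step, n_rows)) for start in range(0, n_rows, step)]


def aggregate_chunk(bounds):
    """Partial aggregate for one range: {(month, scene): [count, sum]}.

    In a real deployment this is where a per-range, server-side
    aggregation call would go (assumption, simulated here in memory).
    """
    start, stop = bounds
    partial = defaultdict(lambda: [0, 0])
    for _key, month, scene, timespent in ROWS[start:stop]:
        acc = partial[(month, scene)]
        acc[0] += 1
        acc[1] += timespent
    return dict(partial)


def merge(partials):
    """Combine partial results; count and sum are both additive."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for group, (count, time_sum) in partial.items():
            total[group][0] += count
            total[group][1] += time_sum
    return {group: tuple(acc) for group, acc in total.items()}


def parallel_group_by(n_chunks=4):
    """Run the per-range aggregations concurrently and merge on the client."""
    ranges = split_ranges(len(ROWS), n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        return merge(pool.map(aggregate_chunk, ranges))


if __name__ == "__main__":
    for group, (count, time_sum) in sorted(parallel_group_by().items()):
        print(group, count, time_sum)
```

This works because count and sum can be merged by simple addition, so the result does not depend on how the key space is chunked. An aggregate such as a distinct count would need a different merge step.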

