hbase-user mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: HBase => replication => Hive
Date Fri, 11 Mar 2011 16:40:26 GMT
Hi, 

I found the opposite. It depends on the queries, but if you are not doing a full table scan,
the direct HBase handler approach is actually faster, as it is more fine-grained than the usual
Hive partition granularity of a day or so.

The scan can make use of row range selection and column families, reducing the scanned data
tremendously. Add time range and Bloom filters if applicable and the result is awesome.
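
To illustrate why a row range plus a single column family cuts the scanned data so much, here
is a toy model in plain Python. It is only a sketch and does not use the HBase client API: the
row key layout ("userNNN|date-hour"), the family names "metrics" and "raw", and the helpers
build_table and range_scan are all made up for this example. Bloom filters are not modeled.

```python
import bisect

def build_table(users=100, hours=24):
    """Hypothetical table: one sorted list of (row_key, timestamp, value) per column family."""
    table = {"metrics": [], "raw": []}
    for user in range(users):
        for hour in range(hours):
            row = f"user{user:03d}|2011-03-10T{hour:02d}"
            table["metrics"].append((row, hour, 1))
            table["raw"].append((row, hour, "payload"))
    for cells in table.values():
        cells.sort()  # HBase keeps rows sorted by key; the model relies on that
    return table

def range_scan(table, family, start_row, stop_row, min_ts=None, max_ts=None):
    """Read one family only, between start_row (inclusive) and stop_row (exclusive)."""
    cells = table[family]
    keys = [cell[0] for cell in cells]
    lo = bisect.bisect_left(keys, start_row)
    hi = bisect.bisect_left(keys, stop_row)
    hits = cells[lo:hi]
    if min_ts is not None and max_ts is not None:
        # Half-open time range: min_ts inclusive, max_ts exclusive
        hits = [cell for cell in hits if min_ts <= cell[1] < max_ts]
    return hits

if __name__ == "__main__":
    table = build_table()
    total = sum(len(cells) for cells in table.values())
    one_user = range_scan(table, "metrics", "user042|", "user043|")
    morning = range_scan(table, "metrics", "user042|", "user043|", min_ts=6, max_ts=12)
    print(f"whole day, all families: {total} cells")
    print(f"one user, one family:    {len(one_user)} cells")
    print(f"plus a time range:       {len(morning)} cells")
```

In this model a query for one user reads a small slice of one family, while a day-level
partition would have to read every cell written that day.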

Lars

On Mar 11, 2011, at 9:52, Andrew Purtell <apurtell@apache.org> wrote:

> Pardon, I'm not as familiar with this area as I should be, but
> 
>> apparently Hive queries run about x5
>> slower than queries that go against normal Hive tables.
> 
> Is this not a reasonable place to start? Why is this?
> 
>> I was wondering if people think it would be possible to
>> implement HBase=>Hive replication? 
> 
> This strikes me as non-trivial. If doing this level of effort, why not look into the
> Hive/HBase integration? Maybe there is something HBase can do to make it faster?
> 
> Best regards,
> 
>    - Andy
> 
> Problems worthy of attack prove their worth by hitting back.
>  - Piet Hein (via Tom White)
> 
> 
> --- On Thu, 3/10/11, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> 
>> From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
>> Subject: HBase => replication => Hive
>> To: user@hbase.apache.org
>> Date: Thursday, March 10, 2011, 10:43 PM
>> Hi,
>> 
>> Since HBase has a mechanism to replicate edit logs to
>> another HBase cluster, I was wondering if people think it
>> would be possible to implement HBase=>Hive 
>> replication? (and really make the destination pluggable
>> later on)
>> 
>> I'm asking because while one can integrate Hive and HBase
>> by creating external tables in Hive that actually point to
>> tables in HBase, apparently Hive queries run about x5
>> slower than queries that go against normal Hive tables.
>> 
>> And because all HBase export options are for 1 table at a
>> time and not point in time snapshots of the whole table,
>> exporting data from HBase and importing into Hive doesn't
>> sound like a viable option.
>> 
>> Thanks,
>> Otis
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop
>> Hadoop ecosystem search :: http://search-hadoop.com/
>> 
>> 
> 
> 
> 
