hbase-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: HBase => replication => Hive
Date Fri, 11 Mar 2011 19:13:48 GMT
Hi,


----- Original Message ----

> From: Andrew Purtell <apurtell@apache.org>
> 
> Pardon, I'm not as familiar with this area as I should be, but
> 
> > apparently Hive queries run about x5
> > slower than queries that go against normal Hive tables.
> 
> Is this not a reasonable place to start? Why is this?

Reasonable? I don't know. :) That's really the first thing I was hoping to
find out. J-D's reaction makes it sound like this is not unreasonable.

> > I was wondering if people think it would be possible to
> > implement HBase=>Hive replication?
> 
> This strikes me as non-trivial. If doing this level of effort, why not look
> into the Hive/HBase integration? Maybe there is something HBase can do to make
> it faster?


I don't know yet how trivial or non-trivial it is. But I thought
that if John Sichi, who strikes me as a pretty smart fellow, says he's seeing a 5x
performance loss and he's the one who worked on the integration, getting from 5x
to 4x or lower may be non-trivial. HBase => Hive is terra incognita so, who
knows, maybe it's easy to do. :)

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


> Best regards,
> 
>     - Andy
> 
> Problems worthy of attack prove their worth by hitting back.
>   - Piet Hein (via Tom White)
> 
> 
> --- On Thu, 3/10/11, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
> 
> > From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
> > Subject: HBase => replication => Hive
> > To: user@hbase.apache.org
> > Date: Thursday, March 10, 2011, 10:43 PM
> > Hi,
> > 
> > Since HBase has a mechanism to replicate edit logs to
> > another HBase cluster, I was wondering if people think it
> > would be possible to implement HBase=>Hive
> > replication? (and really make the destination pluggable
> > later on)
> > 
> > I'm asking because while one can integrate Hive and HBase
> > by creating external tables in Hive that actually point to
> > tables in HBase, apparently Hive queries run about x5
> > slower than queries that go against normal Hive tables.
> > 
> > And because all HBase export options are for 1 table at a
> > time and not point-in-time snapshots of the whole table,
> > exporting data from HBase and importing into Hive doesn't
> > sound like a viable option.
> > 
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop
> > Hadoop ecosystem search :: http://search-hadoop.com/
