hive-user mailing list archives

From: John Sichi
Subject: RE: How to use Hive for HBase
Date: Wed, 19 May 2010 20:11:29 GMT

Currently you need to tell Hive about the column information (what names to use in Hive, and
how they map into colfamily:colname in HBase) as part of your CREATE EXTERNAL TABLE statement.

We could support some kind of default mapping in Hive for CREATE EXTERNAL TABLE, but a
default might not produce the mapping you actually want.  Instead, you can write a Java
utility to read HBase metadata and construct a CREATE EXTERNAL TABLE string exactly the way
you want it.
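
A minimal sketch of such a utility's string-building half, under several assumptions:
the class name HBaseHiveDdl, the method buildCreateTable, and the sample table and column
names are all hypothetical, and the "family:qualifier" list is passed in rather than read
from HBase. HBase table metadata lists column families only, so a real utility would
likely have to discover qualifiers some other way (for example by sampling rows). The
storage handler class and the "hbase.columns.mapping" / "hbase.table.name" properties
follow the Hive HBase integration documentation; the ":key" entry for the row key may not
be accepted by every Hive version, so check it against yours.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds Hive DDL for an existing HBase table.
final class HBaseHiveDdl {

    /**
     * @param hiveTable  name to give the table in Hive
     * @param hbaseTable name of the existing HBase table
     * @param columns    "family:qualifier" -> Hive type, in the desired column order
     */
    static String buildCreateTable(String hiveTable, String hbaseTable,
                                   Map<String, String> columns) {
        List<String> hiveCols = new ArrayList<>();
        List<String> mapping = new ArrayList<>();

        // Assumption: expose the HBase row key as a STRING column named "rowkey".
        hiveCols.add("rowkey STRING");
        mapping.add(":key");

        for (Map.Entry<String, String> e : columns.entrySet()) {
            String hbaseCol = e.getKey();
            // Derive a Hive-friendly name, e.g. "info:name" -> "info_name".
            String hiveName = hbaseCol.replaceAll("[^A-Za-z0-9_]", "_")
                                      .toLowerCase(Locale.ROOT);
            hiveCols.add(hiveName + " " + e.getValue());
            mapping.add(hbaseCol);
        }

        return "CREATE EXTERNAL TABLE " + hiveTable
             + " (" + String.join(", ", hiveCols) + ")\n"
             + "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
             + "WITH SERDEPROPERTIES (\"hbase.columns.mapping\" = \""
             + String.join(",", mapping) + "\")\n"
             + "TBLPROPERTIES (\"hbase.table.name\" = \"" + hbaseTable + "\")";
    }

    public static void main(String[] args) {
        // Sample input; in practice this map would come from inspecting HBase.
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("info:name", "STRING");
        cols.put("stats:visits", "INT");
        System.out.println(buildCreateTable("users_hive", "users", cols));
    }
}
```

The generated statement can then be pasted into the Hive CLI or submitted by whatever
means you normally run DDL. The sketch does no quoting or validation of names, so it is
only suitable for identifiers you control.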

From: Ray Duong
Sent: Wednesday, May 19, 2010 1:02 PM
Subject: Re: How to use Hive for HBase

Hi John,

Is there an easy way to dump the HBase data into Hive (via HBase export) and have Hive read
it without knowing all the column qualifiers?


On Wed, May 19, 2010 at 11:10 AM, John Sichi wrote:
It's the usual tradeoff.

One approach is ETL (pump the data from HBase into Hive and then analyze it there).  The benefit
is that once the data is in Hive, queries against it will typically run faster (since Hive
is optimized for warehousing).  The drawback is staleness:  you won't be querying the very
latest data.

The other approach is direct queries against the latest data in HBase:  up-to-date data, but
slower query performance (and adding load to your HBase cluster).

You may consider using both approaches:  do ETL, and for most queries, run against the Hive
data, but when you need the latest, hit HBase.


From: SingoWong
Sent: Wednesday, May 19, 2010 2:27 AM
Subject: How to use Hive for HBase


I am confused about Hive and HBase.
HBase is a database and Hive is a warehouse. If I want to run statistics and analysis on
warehouse data, but my source data is stored in HBase, should I move my data from
HBase to Hive?

Thanks & Regards,
