hadoop-hive-dev mailing list archives

From John Sichi <jsi...@facebook.com>
Subject Re: Question regarding region scans in HBase integration
Date Sun, 12 Sep 2010 02:09:57 GMT
Hi Daniel,

I'm almost done with this for HIVE-1226; the remaining step I need to finish is to get the
filter passed down during getSplits, since the HBase getSplits implementation takes care of
figuring out which regions contain the row in question.


On Sep 11, 2010, at 7:00 PM, Daniel Einspanjer wrote:

> I was trying to spend a little time this weekend catching up with the current state of
> HBase integration for Hive.  One thing that I haven't seen mentioned is how exactly Hive scans
> an HBase table during a SELECT.
> Does Hive have logic that allows it to intelligently scan only the participating regions
> during a SELECT query that uses the rowkey?  If not, I recently wrote some code that allows
> a MapReduce job to effectively select the regions based on a list of start/end rowkey ranges.
> If this might be useful to the Hive integration, I could create a Jira and take a look at
> trying to set up a patch.
> Daniel Einspanjer
> Metrics Architect
> Mozilla Corporation
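
The region selection discussed in this thread can be illustrated with a minimal, self-contained sketch. It is neither the HIVE-1226 patch nor the MapReduce code Daniel mentions. It only assumes that each region covers a half-open rowkey interval [startKey, endKey), that an empty key means "unbounded" on that side, and that keys compare lexicographically as unsigned bytes. The names RegionPruning, KeyRange, range, and selectRegions are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Hive or HBase code.
final class RegionPruning {

    /** Half-open rowkey interval [start, end); an empty array means "unbounded". */
    static final class KeyRange {
        final byte[] start;
        final byte[] end;

        KeyRange(byte[] start, byte[] end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Convenience factory so the example can use readable string keys. */
    static KeyRange range(String start, String end) {
        return new KeyRange(start.getBytes(StandardCharsets.UTF_8),
                            end.getBytes(StandardCharsets.UTF_8));
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True if 'start' lies strictly before 'end'; an empty 'end' is unbounded. */
    static boolean startsBefore(byte[] start, byte[] end) {
        return end.length == 0 || compareKeys(start, end) < 0;
    }

    /** Two half-open intervals overlap when each starts before the other ends. */
    static boolean overlaps(KeyRange a, KeyRange b) {
        return startsBefore(a.start, b.end) && startsBefore(b.start, a.end);
    }

    /** Returns the indices of the regions that intersect at least one wanted range. */
    static List<Integer> selectRegions(List<KeyRange> regions, List<KeyRange> wanted) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < regions.size(); i++) {
            for (KeyRange w : wanted) {
                if (overlaps(regions.get(i), w)) {
                    selected.add(i);
                    break; // one match is enough to keep this region
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Four made-up regions covering the whole key space.
        List<KeyRange> regions = Arrays.asList(
                range("", "g"), range("g", "n"), range("n", "t"), range("t", ""));
        // Two made-up rowkey ranges a query is interested in.
        List<KeyRange> wanted = Arrays.asList(range("a", "c"), range("p", "q"));
        System.out.println(selectRegions(regions, wanted)); // prints [0, 2]
    }
}
```

Only the regions at indices 0 and 2 would need a split in this example; the other two cannot contain any requested row and could be skipped.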
