hive-dev mailing list archives

From Daniel Einspanjer <>
Subject Question regarding region scans in HBase integration
Date Sun, 12 Sep 2010 02:00:39 GMT
  I was trying to spend a little time this weekend catching up with the 
current state of HBase integration for Hive.  One thing that I haven't 
seen mentioned is how exactly Hive scans an HBase table during a SELECT.

Does Hive have logic that allows it to intelligently scan only the 
participating regions during a SELECT query that uses the rowkey?  If 
not, I recently wrote some code that allows a MapReduce job to 
effectively select the regions based on a list of start/end rowkey 
ranges.  If this might be useful to the Hive integration, I could create 
a Jira and take a look at trying to set up a patch.
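
To make the idea concrete, here is a minimal sketch of selecting regions from a list of start/end rowkey ranges. It is an illustration only, not the MapReduce code mentioned above and not an HBase or Hive API: the names `KeyRange`, `overlaps`, and `select_regions` are hypothetical, regions are modeled as plain (start, end) byte pairs, and an empty key is assumed to mean "unbounded", as HBase does for the first and last region of a table.

```python
from typing import List, Tuple

# Hypothetical model: a region or a requested scan range is a
# (start_key, end_key) pair of bytes, start inclusive, end exclusive.
# An empty key b"" is assumed to mean "unbounded" on that side.
KeyRange = Tuple[bytes, bytes]


def overlaps(region: KeyRange, wanted: KeyRange) -> bool:
    """Return True if the region shares at least one rowkey with the range."""
    r_start, r_end = region
    w_start, w_end = wanted
    # The region must start before the wanted range ends...
    starts_before_wanted_ends = w_end == b"" or r_start < w_end
    # ...and must end after the wanted range starts.
    ends_after_wanted_starts = r_end == b"" or w_start < r_end
    return starts_before_wanted_ends and ends_after_wanted_starts


def select_regions(regions: List[KeyRange],
                   wanted_ranges: List[KeyRange]) -> List[KeyRange]:
    """Keep only the regions that intersect at least one wanted range."""
    return [r for r in regions
            if any(overlaps(r, w) for w in wanted_ranges)]


# Example: a table split into four regions, queried for two narrow ranges.
regions = [(b"", b"g"), (b"g", b"n"), (b"n", b"t"), (b"t", b"")]
wanted = [(b"a", b"c"), (b"p", b"q")]
participating = select_regions(regions, wanted)
```

In this example only the first and third regions participate, so a job built on such a filter would create input splits for two regions instead of four.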

Daniel Einspanjer
Metrics Architect
Mozilla Corporation
