hbase-user mailing list archives

From "Ding, Hui" <hui.d...@sap.com>
Subject RE: [LIKELY JUNK]Conditional scan by multiple columns
Date Tue, 04 Nov 2008 18:53:13 GMT
If you have an index, you probably don't need MapReduce.
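
To illustrate the point about indexes, here is a minimal sketch in plain Python. It does not use the HBase API. The in-memory `rows` dict, the helper names `build_index` and `lookup`, and the value 'Y' for the second column are all assumptions made up for the example. The idea is that if each (column, value) pair already maps to the row keys that hold it, a two-column condition becomes a set intersection rather than a pass over every row.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a table: row key -> {"family:qualifier": value}.
rows = {
    "row1": {"columnFamilyA:columnM": "X", "columnFamilyB:columnN": "Y"},
    "row2": {"columnFamilyA:columnM": "X", "columnFamilyB:columnN": "Z"},
    "row3": {"columnFamilyA:columnM": "W", "columnFamilyB:columnN": "Y"},
}

def build_index(table):
    """Map each (column, value) pair to the set of row keys that contain it."""
    index = defaultdict(set)
    for key, cells in table.items():
        for column, value in cells.items():
            index[(column, value)].add(key)
    return index

def lookup(index, conditions):
    """Return the sorted row keys that satisfy every (column, value) condition."""
    matches = [index.get(cond, set()) for cond in conditions]
    return sorted(set.intersection(*matches)) if matches else []

index = build_index(rows)
# 'Y' is an assumed value; the original query does not state one.
result = lookup(index, [("columnFamilyA:columnM", "X"),
                        ("columnFamilyB:columnN", "Y")])
print(result)
```

The trade-off is the one raised in the question: the index holds one entry per distinct (column, value) pair, so indexing every column costs extra storage and extra writes on every update.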

-----Original Message-----
From: Mekin Maheshwari [mailto:mekin.m@gmail.com] 
Sent: Tuesday, November 04, 2008 3:49 AM
To: hbase-user@hadoop.apache.org
Subject: [LIKELY JUNK]Conditional scan by multiple columns

I am a newbie, just got HBase installed and started playing with it.

I want to perform something akin to :

select rows where columnFamilyA:columnM = 'X' and columnFamilyB:columnN

From what I have read, I would probably need to write a MapReduce task
for this, possibly using GroupingTableMap.

Before I embark on doing this, I wanted to understand if:
1. Is this the right way to proceed, or am I missing other, simpler
ways of achieving this?
2. What would be the performance implications of having queries on
different columns? Would I need to ensure that I have an index on all
columns? What could be the size implications?

To give you an idea of the eventual setup I want to be running this on:
Approximate number of rows: 3 million
Number of column families: 20
Number of columns: ranges from 50 to 30,000

Thanks a ton,

Product I help build - http://weRead.com
Blog - http://mekin.livejournal.com/
Linkedin - http://www.linkedin.com/in/mekin
