hbase-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Feedback on my implementation.
Date Fri, 09 May 2008 04:27:02 GMT
First, I don't think it's safe to use multiple colons in a column  
name. That will probably mess up some of the internals of HBase and  
get you inconsistent results.

Second, you talk about how the scanner will have to walk every row in  
the table. That's true no matter what. Scanners always traverse an  
entire row range, which by default is the entire table, but of course  
it can be constrained to a specific range.
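
To make the row-range point concrete, here is a minimal sketch in plain Java. It does not use the HBase client API: a TreeMap stands in for a table whose rows are kept sorted by row key, and the class name, method names, and sample rows are all made up for illustration. The only point is that a scan visits every row inside its range, and narrowing the range is what reduces the work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only: a TreeMap stands in for a table whose rows are
// kept sorted by row key. None of these names are part of the HBase API.
class RangeScanSketch {

    // Builds a small hypothetical table: row key -> a single cell value.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("row-a", "1");
        table.put("row-b", "2");
        table.put("row-c", "3");
        table.put("row-d", "4");
        return table;
    }

    // Visits every row from startRow (inclusive) up to stopRow (exclusive).
    // A null bound means "unbounded" on that side, so (null, null) walks
    // the whole table.
    static List<String> scan(NavigableMap<String, String> table,
                             String startRow, String stopRow) {
        NavigableMap<String, String> range = table;
        if (startRow != null) {
            range = range.tailMap(startRow, true);
        }
        if (stopRow != null) {
            range = range.headMap(stopRow, false);
        }
        return new ArrayList<>(range.keySet());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        System.out.println("full scan:  " + scan(table, null, null));
        System.out.println("range scan: " + scan(table, "row-b", "row-d"));
    }
}
```

The full scan touches all four rows; the constrained scan touches only the rows from "row-b" up to, but not including, "row-d".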

Finally, I'm not exactly sure what you're trying to accomplish here.  
Are you trying to select rows by cell values? If not, then what?  
Can you describe your use case a little more clearly, perhaps with a  
concrete example?


On May 8, 2008, at 3:35 PM, Josh wrote:

> Greetings,
> I am looking for some feedback on my use of HBase.
> To allow matching on column values, I have put data into the column
> family attribute name, for example:
> colum-fam:attribute1:value1
> colum-fam:attribute2:value2
> This allows one to match values in the following way:
> select colum-fam:attribute1:value1,colum-fam:attribute2:value2 from  
> MyTable;
> Programmatically, when I use:
> table.obtainScanner(new Text[] { new
> Text("colum-fam:attribute1:value1"), new
> Text("colum-fam:attribute2:value2") }, new Text(""));
> I get rows matching either value1 || value2, so I have logic that
> looks for both columns in each row to ensure an exact match.
> I am thinking this isn't an ideal implementation, as the scanner above
> must walk every row in the table.
> Any idea how this might scale?  Would adding RegionServers cut down on
> the time it takes to walk the whole table?
> Thanks for your input!
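
For reference, the client-side check described in the quoted message (the scanner returns rows holding either column, and the caller keeps only rows holding both) might look roughly like the sketch below. This is plain Java with no HBase calls. The class name, method name, and sample rows are hypothetical, and each row is modeled simply as the set of column names present in it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only: each row is represented by the set of column
// names it contains. None of these names are part of the HBase API.
class MatchAllColumnsSketch {

    // Returns the keys of rows that contain every wanted column.
    // Rows holding only some of the wanted columns are skipped.
    static List<String> rowsWithAllColumns(Map<String, Set<String>> rows,
                                           Set<String> wanted) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> row : rows.entrySet()) {
            if (row.getValue().containsAll(wanted)) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String c1 = "colum-fam:attribute1:value1";
        String c2 = "colum-fam:attribute2:value2";

        // TreeMap keeps the rows in sorted key order, as a scan would.
        Map<String, Set<String>> rows = new TreeMap<>();
        rows.put("row1", Set.of(c1, c2)); // has both columns
        rows.put("row2", Set.of(c1));     // has only the first
        rows.put("row3", Set.of(c2));     // has only the second

        System.out.println(rowsWithAllColumns(rows, Set.of(c1, c2)));
    }
}
```

Every row the scanner returns still has to be inspected, so this check does not reduce the number of rows walked.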
