hbase-user mailing list archives

From Luke Forehand <luke.foreh...@networkedinsights.com>
Subject Re: Secondary Index versus Full Table Scan
Date Tue, 03 Aug 2010 21:24:52 GMT
Hegner, Travis <THegner@...> writes:

> 
> Going out on a limb, I think it will perform MUCH faster with multiple copies, as the data is already sitting
> in each mappers memory, ready to be accessed locally. The time to process per mapper should be very
> dramatically reduced. With that in mind, you only have to scale up as disk space requires it, and disk space
> is cheap.
>
> With your current method, adding three more identical data nodes, is only going to cut your time in half. So
> unless you have the budget to get the number of machines required, it's at least worth a try to have multiple
> copies, at least that only costs your time.
> 
> HTH,
> 
> Travis Hegner
> http://www.travishegner.com/
> 

Thanks Travis!  I am in the process of making a copy of our master table with a
composite rowKey of '<columnToIndex> <masterRowKey>'.

I'll be testing out range scans using the composite key shortly.
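
A minimal sketch of the composite-key idea, using a sorted in-memory list as a stand-in for an HBase table (which keeps rows sorted by row key bytes). This is not the actual table code: the helper names, the sample rows, and the single-space separator are assumptions for illustration, and the sketch assumes indexed values never contain the separator.

```python
import bisect

SEP = b" "  # assumed separator; indexed values must not contain this byte


def composite_key(index_value: bytes, master_row_key: bytes) -> bytes:
    """Build '<columnToIndex> <masterRowKey>' as a byte string."""
    return index_value + SEP + master_row_key


def build_index(master_rows: dict, column: bytes) -> list:
    """Return the sorted composite keys for every row that has `column`."""
    keys = [
        composite_key(cols[column], row_key)
        for row_key, cols in master_rows.items()
        if column in cols
    ]
    keys.sort()  # mimics lexicographic byte ordering of row keys
    return keys


def range_scan(index_keys: list, index_value: bytes) -> list:
    """Return master row keys whose indexed column equals `index_value`.

    The scan covers [start, stop): start is the value plus the separator,
    stop is the value plus the next byte after the separator.
    """
    start = index_value + SEP
    stop = index_value + bytes([SEP[0] + 1])
    lo = bisect.bisect_left(index_keys, start)
    hi = bisect.bisect_left(index_keys, stop)
    return [key[len(start):] for key in index_keys[lo:hi]]


# Hypothetical master table: row key -> {column: value}
master = {
    b"row1": {b"color": b"red"},
    b"row2": {b"color": b"blue"},
    b"row3": {b"color": b"red"},
    b"row4": {b"color": b"reddish"},
}

index = build_index(master, b"color")
matches = range_scan(index, b"red")
print(matches)
```

Including the separator in the start key keeps values that merely share a prefix (such as "reddish") out of the scan for "red", so only the contiguous block of matching keys is read instead of the whole table.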

-Luke

