hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: How to read a subset of records based on a column value in a M/R job?
Date Thu, 18 Dec 2008 22:15:44 GMT
tigertail wrote:
...
> 		RowResult rowResult = this.table.getRow(msgid);
>
> With this revision, the job runs very stable now and takes 110 minutes to
> read 10M records.
> So for Q1, I can read 1M records in about 11 minutes, this looks ok.
>
>   
Good.  If you are interested in a particular column only, fetching just
that column should run faster.  getRow is slow because it has to check
all resources to be sure it has picked up every column that could be on
the row, whereas a get with an explicit column knows it can stop as soon
as row+column+timestamp matches.  That said, all of this will be faster
once we get 0.19.0 out the door (in 0.19.0 it might help to sort the keys
you get, since the next value might then come out of the server-side
blockcache) ... and faster again in 0.20.0.
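
A minimal sketch of the key-sorting idea, assuming the keys are UTF-8
strings and that rows are ordered by unsigned lexicographic byte
comparison (my understanding of how HBase orders row keys).  The class
and method names here are made up for illustration, and the actual fetch
is left as a comment because it depends on your client version.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of HBase.
class SortedKeys {
    // Unsigned lexicographic byte order: compare byte by byte as 0..255,
    // and let the shorter key sort first when one is a prefix of the other.
    static final Comparator<byte[]> ROW_ORDER = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    };

    // Encode the ids as bytes and sort them so that consecutive lookups
    // touch neighbouring rows.
    static List<byte[]> sortForLookup(List<String> ids) {
        List<byte[]> keys = new ArrayList<>();
        for (String id : ids) {
            keys.add(id.getBytes(StandardCharsets.UTF_8));
        }
        keys.sort(ROW_ORDER);
        return keys;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("msg-30", "msg-4", "msg-100");
        for (byte[] key : sortForLookup(ids)) {
            // A real job would issue the single-column get for this key here.
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
    }
}
```

Note that this is byte order, not numeric order, so "msg-100" sorts
before "msg-4".  That is fine here: the point is only to visit keys in
the same order the server stores them.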

St.Ack
