hbase-user mailing list archives

From Tatsuya Kawano <tatsuya6...@gmail.com>
Subject Re: How to improve the speed of HTable scan
Date Wed, 26 Jan 2011 04:50:35 GMT
Hi, 

This is because of the server-side block cache. The region server (RS) reads a block of rows
from HDFS and keeps the block in its cache for a while. The first next() takes longer because
the RS serves the row from HDFS; subsequent next() calls are faster because they are served
from the RS's cache.
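
To make the pattern concrete, here is a minimal sketch of a read-through block cache. It is a
toy model, not HBase code: the class name BlockCacheSketch, the cost constants, and the block
size of four rows are all assumptions chosen for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of a read-through block cache. Hypothetical; not the HBase implementation. */
public class BlockCacheSketch {
    // Made-up costs and block size, for illustration only.
    public static final long DISK_COST_MS = 100;
    public static final long CACHE_COST_MS = 1;
    public static final int ROWS_PER_BLOCK = 4;

    // Ids of the blocks that have already been loaded into the cache.
    private final Set<Integer> cachedBlocks = new HashSet<>();

    /** Returns the simulated cost, in milliseconds, of reading one row. */
    public long readRow(int row) {
        int block = row / ROWS_PER_BLOCK;
        // add() returns true only when the block was not cached yet,
        // which models the slow read from disk on a cache miss.
        if (cachedBlocks.add(block)) {
            return DISK_COST_MS;
        }
        return CACHE_COST_MS;
    }

    public static void main(String[] args) {
        BlockCacheSketch rs = new BlockCacheSketch();
        for (int row = 0; row < 6; row++) {
            System.out.println("next " + rs.readRow(row) + "ms");
        }
    }
}
```

In this model only the first read of each block pays the disk cost; every other row in the same
block is served at the cache cost.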

Thanks, 

--
Tatsuya Kawano (Mr.)
Tokyo, Japan


On Jan 26, 2011, at 1:23 PM, 陈加俊 <cjjvictory@gmail.com> wrote:

> final Scan scan = new Scan();
> scan.setCaching(1);
> scan.addColumn(family);
> ResultScanner rs = table.getScanner(scan);
> 
> the speed is:
> 
> getTableScanner 2.28
> next 27s832.12ms
> next 0.99ms
> next 0.94ms
> next 0.82ms
> next 0.94ms
> next 0.88ms
> next 0.95ms
> next 0.94ms
> next 1.37ms
> next 0.90ms
> next 0.89ms
> next 0.90ms
> next 0.91ms
> next 0.86ms
> next 1.23ms
> next 0.87ms
> next 0.87ms
> next 0.83ms
> next 0.87ms
> next 0.90ms
> next 0.91ms
> next 1.73ms
> next 0.98ms
> next 0.89ms
> next 0.86ms
> next 0.92ms
> next 1.33ms
> next 0.87ms
> next 0.89ms
> next 0.82ms
> next 0.87ms
> next 0.84ms
> next 0.94ms
> next 0.96ms
> next 0.93ms
> next 0.79ms
> next 0.82ms
> next 0.84ms
> next 0.84ms
> next 0.87ms
> next 1.17ms
> next 0.80ms
> next 1.25ms
> next 1.08ms
> next 1.08ms
> next 1.95ms
> next 1.66ms
> 
> ....
> 
> getTableScanner 0.98ms
> next 16s258.33ms
> next 0.95ms
> next 1.10ms
> next 1.06ms
> next 0.90ms
> next 2.13ms
> next 2.31ms
> next 1.02ms
> next 1.38ms
> next 0.97ms
> next 0.90ms
> next 0.85ms
> next 2.03ms
> next 1.01ms
> next 1.35ms
> next 1.05ms
> next 1.06ms
> next 1.02ms
> next 1.28ms
> next 0.94ms
> next 1.35ms
> next 0.86ms
> next 0.86ms
> next 0.88ms
> next 0.83ms
> next 0.92ms
> next 0.92ms
> next 1.09ms
> next 0.91ms
> ...
> 
> Why is the first next() so slow?
> 
> HBase-0.20.6
> 
> 
> 
> On Wed, Jan 26, 2011 at 2:09 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> 
>> Caching is the number of rows that will be fetched per RPC. Depending
>> on how big your rows are, you might want to set it larger or smaller.
>> Try 10, then do some experiments.
>> 
>> There aren't many more parameters; speed of reading is always improved
>> with caching. Make sure your data can fit in the block cache and that it
>> stays there.
>> 
>> J-D
>> 
>> On Tue, Jan 25, 2011 at 2:35 AM, 陈加俊 <cjjvictory@gmail.com> wrote:
>>> final Scan scan = new Scan();
>>> scan.setCaching(scannerCaching);
>>> scan.addColumn(family);
>>> 
>>> table.getScanner(scan);
>>> 
>>> How should I adjust the parameters to improve the speed of the scan?
>>> Are there any other parameters or methods that I don't know about?
>>> 
>> 
> 
> 
> 
> -- 
> Thanks & Best regards
> jiajun
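
J-D's point in the quoted message, that caching is the number of rows fetched per RPC, can be
turned into a rough estimate of round trips. The sketch below is only arithmetic under
assumptions: the class name ScanRpcEstimate and the row count are hypothetical, and the
estimate ignores the extra calls a real scanner makes to open and close itself.

```java
/** Rough estimate of scan round trips. Hypothetical helper; not part of the HBase API. */
public class ScanRpcEstimate {

    /** Number of RPCs needed to deliver the given rows at `caching` rows per RPC. */
    public static long rpcCount(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        // Ceiling division: a partially filled last batch still costs one RPC.
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 100_000; // assumed table size, for illustration only
        for (int caching : new int[] {1, 10, 100}) {
            System.out.println("caching=" + caching + " -> " + rpcCount(rows, caching) + " RPCs");
        }
    }
}
```

With setCaching(1), as in the test above, every next() is its own round trip, so the per-call
network overhead is paid once per row. Larger values trade client and server memory for fewer
round trips, which is why the right setting depends on row size.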
