hbase-user mailing list archives

From 陈加俊 <cjjvict...@gmail.com>
Subject Re: How to improve the speed of HTable scan
Date Sat, 29 Jan 2011 05:32:16 GMT
final Pair<byte[][], byte[][]> ranges = table.getStartEndKeys();
final byte[][] startKeys = ranges.getFirst();
final byte[][] endKeys = ranges.getSecond();

I scan the first range and delete its rows, then get the second range and
delete its rows. But I find that the ranges have not changed after about 5 minutes.
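
For reference, the two arrays returned by getStartEndKeys() line up index by
index: startKeys[i] and endKeys[i] bound region i, and an empty byte array
marks the open start of the first region or the open end of the last one.
Below is a minimal plain-Java sketch of that pairing. RegionRanges and Range
are hypothetical names used only for illustration; they are not HBase classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: pairs up the start and end key
// arrays so that each region's range can be handled as one object.
final class RegionRanges {

    /** One region's key range: start is inclusive, end is exclusive. */
    static final class Range {
        final byte[] start;
        final byte[] end;

        Range(byte[] start, byte[] end) {
            this.start = start;
            this.end = end;
        }

        /** An empty start key means "from the beginning of the table". */
        boolean openStart() {
            return start.length == 0;
        }

        /** An empty end key means "to the end of the table". */
        boolean openEnd() {
            return end.length == 0;
        }
    }

    /** Pairs startKeys[i] with endKeys[i]; both arrays must have equal length. */
    static List<Range> pair(byte[][] startKeys, byte[][] endKeys) {
        if (startKeys.length != endKeys.length) {
            throw new IllegalArgumentException("start/end key arrays differ in length");
        }
        List<Range> ranges = new ArrayList<Range>();
        for (int i = 0; i < startKeys.length; i++) {
            ranges.add(new Range(startKeys[i], endKeys[i]));
        }
        return ranges;
    }
}
```

With the arrays from the snippet above, RegionRanges.pair(startKeys, endKeys)
would give one Range per region, and each Range could then feed a Scan bounded
by its start and end rows.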

2011/1/26 Stack <stack@duboce.net>

> We do an RPC to open a Scanner.  The open returns a Scanner id to use
> subsequently.  We then call next.  If caching is not enabled, we'll do an
> RPC per next invocation; otherwise one every N invocations.  Perhaps
> the slowdown on the first next is because it's doing the open RPC and then
> a new RPC that is pre-fetching a bunch of rows?
>
> St.Ack
>
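
To put rough numbers on the RPC pattern described above, here is a simplified
model in plain Java: one RPC to open the scanner, then one RPC per batch of
"caching" rows. ScanRpcModel is a hypothetical name, and the model assumes
the scan reads exactly the given number of rows; it ignores the close call
and any final empty fetch, so treat it as a sketch rather than the client's
actual accounting.

```java
// Hypothetical model, not HBase code: counts round trips for a scan that
// returns `rows` rows with a scanner caching value of `caching`.
final class ScanRpcModel {

    /** Number of next RPCs: one per batch of `caching` rows, rounded up. */
    static long nextRpcs(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching;
    }

    /** Open RPC plus the next RPCs. */
    static long totalRpcs(long rows, int caching) {
        return 1 + nextRpcs(rows, caching);
    }
}
```

Under this model, scanning 1000 rows costs 1001 RPCs with caching 1, 101 with
caching 10, and 11 with caching 100.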
> On Tue, Jan 25, 2011 at 9:36 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> > Thank you !
> >
> > But why is the first next after the second (and subsequent) getScanner
> > also so slow? I think the second (and subsequent ones) should be faster.
> >
> >
> > On Wed, Jan 26, 2011 at 12:50 PM, Tatsuya Kawano <tatsuya6502@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> This is because of the server-side block cache. The RS reads a block of
> >> rows from HDFS and keeps the block in its cache for a while. The first
> >> next() takes longer because the RS serves the row from HDFS, and the other
> >> next() calls are faster because they are served from the RS's cache.
> >>
> >> Thanks,
> >>
> >> --
> >> Tatsuya Kawano (Mr.)
> >> Tokyo, Japan
> >>
> >>
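
If the block-cache explanation above applies, the effect can be pictured with
a toy least-recently-used cache: the first read of a block misses and goes to
storage, and later reads of the same block are served from memory.
BlockCacheModel and loadFromStorage are hypothetical names, and the sketch is
a simplified stand-in, not the region server's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model, not HBase code: an LRU cache of blocks keyed by id.
final class BlockCacheModel {
    private final int capacity;
    private final LinkedHashMap<Long, byte[]> blocks;
    private int hits;
    private int misses;

    BlockCacheModel(int capacity) {
        this.capacity = capacity;
        // accessOrder = true makes iteration order least-recently-used first.
        this.blocks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > BlockCacheModel.this.capacity;
            }
        };
    }

    /** Returns the block, loading it on a miss and caching it afterwards. */
    byte[] read(long blockId) {
        byte[] block = blocks.get(blockId);
        if (block != null) {
            hits++;
            return block;
        }
        misses++;
        block = loadFromStorage(blockId);
        blocks.put(blockId, block);
        return block;
    }

    /** Stand-in for the slow path (a read from HDFS in the thread above). */
    private static byte[] loadFromStorage(long blockId) {
        return new byte[0];
    }

    int hits() {
        return hits;
    }

    int misses() {
        return misses;
    }
}
```

Reading the same block three times gives one miss followed by two hits; once
more blocks are read than the capacity allows, the least recently used block
is evicted and its next read misses again.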
> >> On Jan 26, 2011, at 1:23 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> >>
> >> > final Scan scan = new Scan();
> >> > scan.setCaching(1);
> >> > scan.addColumn(family);
> >> > ResultScanner  rs=table.getScanner(scan);
> >> >
> >> > the speed is:
> >> >
> >> > getTableScanner 2.28
> >> > next 27s832.12ms
> >> > next 0.99ms
> >> > next 0.94ms
> >> > next 0.82ms
> >> > next 0.94ms
> >> > next 0.88ms
> >> > next 0.95ms
> >> > next 0.94ms
> >> > next 1.37ms
> >> > next 0.90ms
> >> > next 0.89ms
> >> > next 0.90ms
> >> > next 0.91ms
> >> > next 0.86ms
> >> > next 1.23ms
> >> > next 0.87ms
> >> > next 0.87ms
> >> > next 0.83ms
> >> > next 0.87ms
> >> > next 0.90ms
> >> > next 0.91ms
> >> > next 1.73ms
> >> > next 0.98ms
> >> > next 0.89ms
> >> > next 0.86ms
> >> > next 0.92ms
> >> > next 1.33ms
> >> > next 0.87ms
> >> > next 0.89ms
> >> > next 0.82ms
> >> > next 0.87ms
> >> > next 0.84ms
> >> > next 0.94ms
> >> > next 0.96ms
> >> > next 0.93ms
> >> > next 0.79ms
> >> > next 0.82ms
> >> > next 0.84ms
> >> > next 0.84ms
> >> > next 0.87ms
> >> > next 1.17ms
> >> > next 0.80ms
> >> > next 1.25ms
> >> > next 1.08ms
> >> > next 1.08ms
> >> > next 1.95ms
> >> > next 1.66ms
> >> >
> >> > ....
> >> >
> >> > getTableScanner 0.98ms
> >> > next 16s258.33ms
> >> > next 0.95ms
> >> > next 1.10ms
> >> > next 1.06ms
> >> > next 0.90ms
> >> > next 2.13ms
> >> > next 2.31ms
> >> > next 1.02ms
> >> > next 1.38ms
> >> > next 0.97ms
> >> > next 0.90ms
> >> > next 0.85ms
> >> > next 2.03ms
> >> > next 1.01ms
> >> > next 1.35ms
> >> > next 1.05ms
> >> > next 1.06ms
> >> > next 1.02ms
> >> > next 1.28ms
> >> > next 0.94ms
> >> > next 1.35ms
> >> > next 0.86ms
> >> > next 0.86ms
> >> > next 0.88ms
> >> > next 0.83ms
> >> > next 0.92ms
> >> > next 0.92ms
> >> > next 1.09ms
> >> > next 0.91ms
> >> > ...
> >> >
> >> > Why is the first next so slow?
> >> >
> >> > HBase-0.20.6
> >> >
> >> >
> >> >
> >> > On Wed, Jan 26, 2011 at 2:09 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >> >
> >> >> Caching is the number of rows that will be fetched per RPC. Depending
> >> >> on how big your rows are, you might want to set it larger or smaller.
> >> >> Try 10, then do some experiments.
> >> >>
> >> >> There aren't many other parameters; read speed is always improved with
> >> >> caching. Make sure your data can fit in the block cache and that it
> >> >> stays there.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Tue, Jan 25, 2011 at 2:35 AM, 陈加俊 <cjjvictory@gmail.com> wrote:
> >> >>> final Scan scan = new Scan();
> >> >>> scan.setCaching(scannerCaching);
> >> >>> scan.addColumn(family);
> >> >>>
> >> >>> table.getScanner(scan);
> >> >>>
> >> >>> I want to improve the speed of the scan.
> >> >>> How should I adjust the parameters? Are there any other parameters or
> >> >>> methods that I don't know about?
> >> >>>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks & Best regards
> >> > jiajun
> >>
> >
> >
> >
> > --
> > Thanks & Best regards
> > jiajun
> >
>



-- 
Thanks & Best regards
jiajun
