hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Open Scanner Latency
Date Mon, 31 Jan 2011 21:54:51 GMT
On Mon, Jan 31, 2011 at 1:38 PM, Wayne <wav100@gmail.com> wrote:
> After doing many tests (10k serialized scans) we see that on average opening
> the scanner takes 2/3 of the read time if the read is fresh
> (scannerOpenWithStop=~35ms, scannerGetList=~10ms).

I saw this over the w/e too.  The getScanner takes all the time.  From
tracing, the time did not seem to go to locating regions in the cluster;
the suspect would seem to be down in StoreScanner, when we seek all the
StoreFiles.  I didn't look beyond that (this w/e, that is).

What if you do a full table scan of the data that you want hot on a
regular period (making sure you Scan with the skip-cache option OFF, so
the blocks read land in the block cache)?  Is the data you want cached
all in one table?  Does marking the table in-memory help?

> A read's latency for our type of usage pattern should be based
> primarily on disk i/o latency and not looking around for where the data is
> located in the cluster. Adding SSD disks wouldn't help us much at all to
> lower read latency given what we are seeing.

You think that it's locating data in the cluster?  Long-lived clients
shouldn't be doing lookups; they should have cached all seen region
locations, unless a region moved.  Do you think that is what is
happening, Wayne?
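
To make the caching point concrete, here is a minimal sketch of a
client-side location cache.  It is NOT the HBase client code; the class
name RegionLocationCache, the Region holder, and the metaLookup function
are all hypothetical stand-ins, and row keys are plain Strings for
simplicity.  The idea it illustrates: a row is resolved against cached
regions first, and the expensive catalog lookup only happens on a miss
or after the entry is invalidated because the region moved.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical illustration of a long-lived client caching region locations.
class RegionLocationCache {

    // One region: [startKey, endKey) hosted on a server. Empty endKey = last region.
    static final class Region {
        final String startKey;
        final String endKey;
        final String server;

        Region(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }

        // Caller guarantees startKey <= row (via floorEntry), so only check the end.
        boolean containsRow(String row) {
            return endKey.isEmpty() || row.compareTo(endKey) < 0;
        }
    }

    // Seen regions, sorted by start key so floorEntry finds the candidate region.
    private final TreeMap<String, Region> byStartKey = new TreeMap<>();
    // Stand-in for the expensive catalog lookup (assumption, not a real API).
    private final Function<String, Region> metaLookup;
    private int metaLookups = 0;

    RegionLocationCache(Function<String, Region> metaLookup) {
        this.metaLookup = metaLookup;
    }

    // Returns the server hosting the row, consulting the catalog only on a miss.
    String locate(String row) {
        Map.Entry<String, Region> hit = byStartKey.floorEntry(row);
        if (hit != null && hit.getValue().containsRow(row)) {
            return hit.getValue().server;
        }
        metaLookups++;
        Region found = metaLookup.apply(row);
        byStartKey.put(found.startKey, found);
        return found.server;
    }

    // Drop the cached region covering the row, e.g. after learning it moved.
    void invalidate(String row) {
        Map.Entry<String, Region> hit = byStartKey.floorEntry(row);
        if (hit != null && hit.getValue().containsRow(row)) {
            byStartKey.remove(hit.getKey());
        }
    }

    int metaLookups() {
        return metaLookups;
    }
}
```

If a client behaves like this, repeated scans against already-seen
regions should add nothing to the lookup count, so a steady ~35ms open
cost would have to come from somewhere else.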

Here is an interesting article on SSDs and Cassandra:
http://blog.kosmix.com/?p=1445  Speculation is that SSDs don't really
improve latency given the size of reads done by cass (and hbase) but
rather, they help keep latency about constant when there are lots of
contending clients; i.e. maybe we could have one cluster at SU only if we used

