incubator-cassandra-user mailing list archives

From Weijun Li <weiju...@gmail.com>
Subject Re: Cassandra benchmark shows OK throughput but high read latency (> 100ms)?
Date Wed, 17 Feb 2010 00:15:43 GMT
Still seeing high read latency with 50 million records in the 2-node cluster
(replication factor 2). I restarted both nodes, but read latency is still above 60ms and
disk I/O saturation is high. I tried compact and repair, but they don't help much.
When I reduce the client threads from 15 to 5 things look a lot better, but
throughput is rather low. I also changed the number of flush threads from the
default 8 to 16; could that cause the disk saturation issue?

For benchmarks that show decent throughput and latency, how many client threads
are used? Can anyone share the storage-conf.xml from a well-tuned, high-volume
cluster?
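Not a tuned production config, but for reference, the 0.5-era storage-conf.xml knobs that usually matter most for read latency look roughly like this (paths and values are illustrative placeholders, not recommendations):

```xml
<!-- Sketch of latency-related settings in a 0.5-era storage-conf.xml.
     Paths and values are examples only; tune for your own hardware. -->

<!-- Keep the commit log on a different physical disk than the data files
     so sequential log writes don't compete with random reads. -->
<CommitLogDirectory>/disk1/cassandra/commitlog</CommitLogDirectory>

<!-- Common rule of thumb: ConcurrentReads around 2x the number of data
     disks; too many concurrent readers just deepens the I/O queue. -->
<ConcurrentReads>8</ConcurrentReads>
<ConcurrentWrites>32</ConcurrentWrites>
```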

-Weijun

On Tue, Feb 16, 2010 at 10:31 AM, Stu Hood <stu.hood@rackspace.com> wrote:

> > After I ran "nodeprobe compact" on node B its read latency went up to
> 150ms.
> The compaction process can take a while to finish... in 0.5 you need to
> watch the logs to figure out when it has actually finished, and then you
> should start seeing the improvement in read latency.
>
> > Is there any way to utilize all of the heap space to decrease the read
> latency?
> In 0.5 you can adjust the number of keys that are cached by changing the
> 'KeysCachedFraction' parameter in your config file. In 0.6 you can
> additionally cache rows. You don't want to use up all of the memory on your
> box for those caches though: you'll want to leave at least 50% for your OS's
> disk cache, which will store the full row content.
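As a concrete sketch of the 0.5 syntax Stu mentions (element name per the 0.5 sample config; the value here is illustrative, and a higher fraction trades heap for fewer index seeks):

```xml
<!-- Cache this fraction of row keys from each SSTable index in memory.
     0.01 is the usual default; raise cautiously and watch heap usage. -->
<KeysCachedFraction>0.01</KeysCachedFraction>
```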
>
>
> -----Original Message-----
> From: "Weijun Li" <weijunli@gmail.com>
> Sent: Tuesday, February 16, 2010 12:16pm
> To: cassandra-user@incubator.apache.org
> Subject: Re: Cassandra benchmark shows OK throughput but high read latency
> (> 100ms)?
>
> Thanks for the DataFileDirectory trick; I'll give it a try.
>
> Just noticed the impact of number of data files: node A has 13 data files
> with read latency of 20ms and node B has 27 files with read latency of
> 60ms.
> After I ran "nodeprobe compact" on node B its read latency went up to
> 150ms.
> The read latency of node A became as low as 10ms. Is this normal behavior?
> I'm using random partitioner and the hardware/JVM settings are exactly the
> same for these two nodes.
>
> Another problem: Java heap usage always stays around 900MB out of 6GB. Is
> there
> any way to utilize all of the heap space to decrease the read latency?
>
> -Weijun
>
> On Tue, Feb 16, 2010 at 10:01 AM, Brandon Williams <driftx@gmail.com>
> wrote:
>
> > On Tue, Feb 16, 2010 at 11:56 AM, Weijun Li <weijunli@gmail.com> wrote:
> >
> >> One more thoughts about Martin's suggestion: is it possible to put the
> >> data files into multiple directories that are located in different
> physical
> >> disks? This should help to improve the i/o bottleneck issue.
> >>
> >>
> > Yes, you can already do this, just add more <DataFileDirectory>
> directives
> > pointed at multiple drives.
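For instance, with two physical drives, the relevant storage-conf.xml section might look like this (mount points are made-up examples):

```xml
<!-- Each DataFileDirectory should sit on a separate physical disk so
     SSTable reads can be spread across spindles. Paths are examples. -->
<DataFileDirectories>
    <DataFileDirectory>/disk1/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/disk2/cassandra/data</DataFileDirectory>
</DataFileDirectories>
```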
> >
> >
> >> Has anybody tested the row-caching feature in trunk (shoot for 0.6?)?
> >
> >
> > Row cache and key cache both help tremendously if your read pattern has a
> > decent repeat rate.  Completely random io can only be so fast, however.
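In 0.6 the caches become per-column-family attributes; a hedged sketch of that syntax (attribute names per the 0.6 sample config, the CF name and values are illustrative):

```xml
<!-- 0.6-style per-CF cache settings: values may be absolute counts or
     percentages. These numbers are examples, not recommendations. -->
<ColumnFamily Name="Standard1"
              KeysCached="100%"
              RowsCached="1%"/>
```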
> >
> > -Brandon
> >
>
>
>
