From: Jonathan Mischo <jmischo@quagility.com>
To: cassandra-user@incubator.apache.org
Subject: Re: Storage proxy write latency is too high
Date: Tue, 6 Oct 2009 17:31:14 -0500
In-Reply-To: <23b1e84e0910060914v1f1a8865i65244324e0715b56@mail.gmail.com>

Back in the day, I was involved in performance testing JVMs, Solaris on Intel, and SWS at SunSoft, and our lab actually ran not just our own numbers, but competitors' numbers as well, for comparison. One thing we discovered, when analyzing some performance issues we'd seen with Windows clients, was that Windows' network stacks don't behave as you'd expect: they very regularly transmitted packets out of sequence.

I'm not sure if this is still the case, as this was the late 90s, but we discovered it while working on SPECweb numbers for SWS (we were a SPEC lab), looking at the JVM and other system settings to understand why we were seeing unexpectedly large TCP buffers. When we started sniffing packets directly from the Windows clients, we discovered the packets were being emitted out of sequence, which was forcing the server to keep larger per-connection buffers and was pushing TCP window size boundaries.

Now, Cassandra uses UDP primarily, but we never tested to find out whether this was a TCP stack issue, an IP stack issue, or an Ethernet stack issue, so it may be a similar case. Since reassembly still has to happen, if the packets are being transmitted out of order, it makes sense that your write latency would be significantly higher.

I'd be interested to see the results if you dig deeper into this.
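The numbers below were read off the JMX console, but the same attributes can be polled from code. A minimal sketch, reading the JVM's own memory MBean locally as a stand-in (against Cassandra you would open a remote JMX connector and read the WriteLatency attribute from the StorageProxy bean; that ObjectName is an assumption, not something confirmed in this thread):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JmxPoll {
    public static long heapUsed() throws Exception {
        // The local platform MBean server stands in for a remote connection.
        // Against Cassandra you would connect with JMXConnectorFactory and
        // read "WriteLatency" from an ObjectName such as
        // "org.apache.cassandra.service:type=StorageProxy" (name assumed).
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("java.lang:type=Memory");
        CompositeData heap =
                (CompositeData) server.getAttribute(name, "HeapMemoryUsage");
        return (Long) heap.get("used");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("heap used: " + heapUsed() + " bytes");
    }
}
```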
-Jon

On Oct 6, 2009, at 11:14 AM, Igor Katkov wrote:

> I think I finally found what. It's the implementation of Java NIO on
> Windows (JVM 1.6.0.16, 64-bit, on Windows 2003).
> The very same code and same network, but CentOS Linux gives almost 4x
> the performance (in a Cassandra@linux -> Cassandra@Windows setup).
> I don't have another Linux box to test Cassandra@linux ->
> Cassandra@linux performance, but I expect it to be even better.
>
> A lesson learnt: don't use Windows.
>
> P.S.
> Here at Viigo we also learnt the hard way that async IO is also
> broken in .NET (C#). Now I start to wonder if there is some
> fundamental flaw in async IO on Windows...
>
> On Mon, Oct 5, 2009 at 3:23 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> On Mon, Oct 5, 2009 at 2:17 PM, Igor Katkov <ikatkov@gmail.com> wrote:
> > measured via JMX console, i.e. does not include client-cassandra-client
> > latency
> >
> > 20 client threads, 176975b value, StorageProxy.WriteLatency ~660ms
> > 10 client threads, 176975b value, StorageProxy.WriteLatency ~350ms
> > 05 client threads, 176975b value, StorageProxy.WriteLatency ~156ms
>
> this is going up basically linearly with amount of (data x clients),
> so clearly something is getting saturated.
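The 4x gap Igor reports is hard to attribute without isolating NIO from Cassandra entirely. One way to do that is a self-contained loopback throughput test run on each OS. A minimal sketch, assuming nothing from the thread beyond the 176975-byte value size (the total volume and buffer sizes are illustrative; a real harness would register for OP_WRITE on a Selector instead of spinning):

```java
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class NioLoopback {
    static final long TOTAL = 64L * 1024 * 1024; // 64 MB, arbitrary volume
    static final int CHUNK = 176975;             // value size from the thread

    // Push TOTAL bytes through a non-blocking SocketChannel over loopback
    // and time it; comparing the printed number across OSes exercises the
    // NIO write path with Cassandra taken out of the picture entirely.
    public static long run() throws Exception {
        ServerSocket server = new ServerSocket(0);
        Thread sink = new Thread(() -> {
            try (Socket s = server.accept(); InputStream in = s.getInputStream()) {
                byte[] buf = new byte[65536];
                while (in.read(buf) != -1) { /* drain */ }
            } catch (Exception e) { throw new RuntimeException(e); }
        });
        sink.start();

        SocketChannel ch = SocketChannel.open(
                new InetSocketAddress("127.0.0.1", server.getLocalPort()));
        ch.configureBlocking(false);
        ByteBuffer chunk = ByteBuffer.allocateDirect(CHUNK);

        long start = System.nanoTime();
        long sent = 0;
        while (sent < TOTAL) {
            chunk.clear();
            chunk.limit((int) Math.min(CHUNK, TOTAL - sent));
            while (chunk.hasRemaining()) {
                // write() may accept 0 bytes when the kernel buffer is full;
                // a real harness would select() on OP_WRITE here.
                if (ch.write(chunk) == 0) Thread.yield();
            }
            sent += chunk.limit();
        }
        ch.close();
        sink.join();
        server.close();
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(sent + " bytes in " + ms + " ms");
        return sent;
    }

    public static void main(String[] args) throws Exception {
        run();
    }
}
```

If the same binary shows a large loopback gap between Windows and Linux, the NIO layer is implicated; if not, the slowdown is more likely elsewhere in the Cassandra-on-Windows path.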