From: Ikeda Anthony
Subject: Re: Local Quorum Performance...
Date: Sat, 17 Sep 2011 21:23:12 -0700
To: user@cassandra.apache.org

I'm not sure if it's significant, but at first glance the IP addresses all have the same octets under the PropertyFileSnitch, yet under the EC2Snitch all the octets are different. That is:

PropertyFileSnitch reports that the nodes are all in the same data centre [168] and the same rack [2].
EC2Snitch reports that the nodes are in 3 different data centres [20, 73, 236].

I'm still new at this too and may not have the full answer, as we are still prepping our prod environment with the PropertyFileSnitch (2 DCs, 3 nodes per DC). Our QA environment is configured much the same way, only it's 3 nodes in a single DC:

consistency: LOCAL_QUORUM
strategy: NetworkTopologyStrategy
strategy_options: datacenter1:3

With that, the data is distributed equally, about 33% per node.
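For reference, a keyspace matching that setup would be created along these lines in cassandra-cli. This is just a sketch: the keyspace name is made up, and the exact strategy_options syntax differs a little between Cassandra versions:

create keyspace QAKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = {datacenter1:3};

For a two-DC cluster the same statement would carry one replica count per data centre, e.g. strategy_options = {us-east:2, us-west:2}.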
Just reading the docs on the DataStax website, I'm starting to wonder how the PropertyFileSnitch distributes the data across the DCs:

  "For NetworkTopologyStrategy, it specifies the number of replicas per data center in a comma separated list of datacenter_name:number_of_replicas."

I'm wondering if you need to increase your replication factor to 3 to see the data replicate across the DCs.
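For what it's worth, my understanding (happy to be corrected) is that the PropertyFileSnitch itself doesn't place any data; it only tells Cassandra which data centre and rack each node belongs to via conf/cassandra-topology.properties, and NetworkTopologyStrategy then combines those labels with the strategy_options counts to place replicas. Using the addresses from your ring output below, that file would look something like this (illustrative only):

# conf/cassandra-topology.properties, read by PropertyFileSnitch
192.168.2.1=us-east:1b
192.168.2.2=us-east:1b
192.168.2.6=us-west:1c
192.168.2.7=us-west:1c
# fallback for any node not listed above
default=us-east:1b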
Anthony

On 17/09/2011, at 8:36 PM, Chris Marino wrote:

> Anthony, we used the Ec2Snitch for one set of runs, but for another set we're using the PropertyFileSnitch.
>
> With the PropertyFileSnitch we see:
>
> Address          DC       Rack  Status  State   Load      Owns    Token
>                                                                   85070591730234615865843651857942052865
> 192.168.2.1      us-east  1b    Up      Normal  60.59 MB  50.00%  0
> 192.168.2.6      us-west  1c    Up      Normal  26.5 MB   0.00%   1
> 192.168.2.2      us-east  1b    Up      Normal  29.86 MB  50.00%  85070591730234615865843651857942052864
> 192.168.2.7      us-west  1c    Up      Normal  60.63 MB  0.00%   85070591730234615865843651857942052865
>
> While with the EC2Snitch we see:
>
> Address          DC       Rack  Status  State   Load      Owns    Token
>                                                                   85070591730234615865843651857942052865
> 107.20.68.176    us-east  1b    Up      Normal  59.95 MB  50.00%  0
> 204.236.179.193  us-west  1c    Up      Normal  53.67 MB  0.00%   1
> 184.73.133.171   us-east  1b    Up      Normal  60.65 MB  50.00%  85070591730234615865843651857942052864
> 204.236.166.4    us-west  1c    Up      Normal  26.33 MB  0.00%   85070591730234615865843651857942052865
>
> What's also strange is that the load on the nodes changes as well. For example, node 204.236.166.4 is sometimes very low (~26 KB), other times it's closer to 30 MB. We see the same kind of variability in both clusters.
>
> For both clusters, we're running stress tests with the following options:
>
> --consistency-level=LOCAL_QUORUM --threads=4 --replication-strategy=NetworkTopologyStrategy --strategy-properties=us-east:2,us-west:2 --column-size=128 --keep-going --num-keys=100000 -r
>
> Any clues to what is going on here are greatly appreciated.
>
> Thanks
> CM
>
> On Sat, Sep 17, 2011 at 12:15 PM, Ikeda Anthony wrote:
> What snitch do you have configured? We typically see a proper spread of data across all our nodes equally.
>
> Anthony
>
> On 17/09/2011, at 10:06 AM, Chris Marino wrote:
>
>> Hi, I have a question about what to expect when running a cluster across data centers with Local Quorum consistency.
>>
>> My simplistic assumption is that an 8 node cluster split across 2 data centers and running with local quorum would perform roughly the same as a 4 node cluster in one data center.
>>
>> I'm 95% certain we've set up the keyspace so that the entire range is in one data center and the client is local. I see the keyspace split across all the local nodes, with the remote nodes owning 0%. Yet when I run the stress tests against this configuration with local quorum, I see dramatically different results from when I ran the same tests against a 4 node cluster. I'm still 5% unsure of this because the documentation on how to configure this is pretty thin.
>>
>> My understanding of Local Quorum was that once the data was written to a local quorum, the commit would complete. I also believed that this would eliminate any WAN latency required for replication to the other DC.
>>
>> It's not just that the split cluster runs slower; there is also enormous variability in identical tests, sometimes by a factor of 2 or more. It seems as though the WAN latency is not only impacting performance but also introducing a wide variation in overall performance.
>>
>> Should WAN latency be completely hidden with local quorum? Or are there second order issues involved that will impact performance?
>>
>> I'm running in EC2 across the us-east/us-west regions. I already know how unpredictable EC2 performance can be, but what I'm seeing here is far beyond normal performance variability for EC2.
>>
>> Is there something obvious that I'm missing that would explain why the results are so different?
>>
>> Here's the config when we run a 2x2 cluster:
>>
>> Address       DC       Rack  Status  State   Load      Owns    Token
>>                                                                 85070591730234615865843651857942052865
>> 192.168.2.1   us-east  1b    Up      Normal  25.26 MB  50.00%  0
>> 192.168.2.6   us-west  1c    Up      Normal  12.68 MB  0.00%   1
>> 192.168.2.2   us-east  1b    Up      Normal  12.56 MB  50.00%  85070591730234615865843651857942052864
>> 192.168.2.7   us-west  1c    Up      Normal  25.48 MB  0.00%   85070591730234615865843651857942052865
>>
>> Thanks in advance.
>> CM
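P.S. On the LOCAL_QUORUM question above, my understanding (worth double-checking against the docs) is that the coordinator only waits for a quorum of replicas in its own data centre; the writes for the remote DC are still sent, but asynchronously, so they shouldn't add WAN latency to the request itself. The quorum arithmetic, roughly:

local quorum = floor(replicas in local DC / 2) + 1
us-east:2  ->  floor(2 / 2) + 1 = 2   (both local replicas must ack each write)
us-east:3  ->  floor(3 / 2) + 1 = 2   (any 2 of the 3 local replicas)

If that's right, then with us-east:2 every write waits on both local replicas, so a single slow node in the local DC shows up directly in the numbers even before any WAN effects.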