Subject: Re: Multiple data center performance
From: Daning Wang <daning@netseer.com>
To: user@cassandra.apache.org, comomore@gmail.com
Date: Tue, 11 Jun 2013 14:34:16 -0700

It was the counters that caused the problem: counter writes are replicated
to all replicas regardless of the consistency level.

In our case we don't need to sync the counters across data centers, so
moving the counters to a new keyspace whose replicas all live in one data
center solved the problem.

There is also a replicate_on_write option on the table. Turning it off for
counters might give better performance, but you run a high risk of losing
data and creating inconsistencies. I did not try this option. A rough
sketch of both changes is below.
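For illustration, a minimal cassandra-cli sketch of that kind of setup,
assuming a 1.1/1.2-era cluster; the keyspace and column family names
(counters_local, page_hits) are hypothetical, not from our cluster:

    -- keyspace whose replicas all live in dc1, so counter writes
    -- never fan out across the WAN link to the other data center
    create keyspace counters_local
      with placement_strategy = 'NetworkTopologyStrategy'
      and strategy_options = {dc1: 3};

    use counters_local;

    -- counter column family; replicate_on_write = false skips the
    -- write-time fan-out to every replica, at the risk of lost
    -- increments and diverging replicas if a node dies
    create column family page_hits
      with comparator = UTF8Type
      and key_validation_class = UTF8Type
      and default_validation_class = CounterColumnType
      and replicate_on_write = false;

Leaving replicate_on_write at its default (true) is the safer choice for
counters; for us, the single-DC keyspace alone was what fixed it.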
Daning


On Sat, Jun 8, 2013 at 6:53 AM, srmore <comomore@gmail.com> wrote:

> I am seeing similar behavior. In my case I have 2 nodes in each
> datacenter, and one node always has high latency (equal to the latency
> between the two datacenters). When one of the datacenters is shut down,
> the latency drops.
>
> I am curious to know whether anyone else has had these issues and, if
> so, how you got around them.
>
> Thanks!
>
>
> On Fri, Jun 7, 2013 at 11:49 PM, Daning Wang <daning@netseer.com> wrote:
>
>> We have deployed multiple data centers but have hit a performance
>> issue. When the nodes in the other data center are up, the read
>> response time seen by clients is 4 or 5 times higher; when we take
>> those nodes down, the response time returns to normal (comparable to
>> the time before we moved to multiple data centers).
>>
>> We have high volume on the cluster, and the consistency level is ONE
>> for reads, so my understanding is that most of the traffic between
>> data centers should be read repair. But it seems that should not
>> create much delay.
>>
>> What could cause the problem? How can we debug this?
>>
>> Here is the keyspace:
>>
>> [default@dsat] describe dsat;
>> Keyspace: dsat:
>>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>>   Durable Writes: true
>>     Options: [dc2:1, dc1:3]
>>   Column Families:
>>     ColumnFamily: categorization_cache
>>
>> Ring:
>>
>> Datacenter: dc1
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
>> UN  xx.xx.xx..111  59.2 GB   256     37.5%             4d6ed8d6-870d-4963-8844-08268607757e  rac1
>> DN  xx.xx.xx..121  99.63 GB  256     37.5%             9d0d56ce-baf6-4440-a233-ad6f1d564602  rac1
>> UN  xx.xx.xx..120  66.32 GB  256     37.5%             0fd912fb-3187-462b-8c8a-7d223751b649  rac1
>> UN  xx.xx.xx..118  63.61 GB  256     37.5%             3c6e6862-ab14-4a8c-9593-49631645349d  rac1
>> UN  xx.xx.xx..117  68.16 GB  256     37.5%             ee6cdf23-d5e4-4998-a2db-f6c0ce41035a  rac1
>> UN  xx.xx.xx..116  32.41 GB  256     37.5%             f783eeef-1c51-4f91-ab7c-a60669816770  rac1
>> UN  xx.xx.xx..115  64.24 GB  256     37.5%             e75105fb-b330-4f40-aa4f-8e6e11838e37  rac1
>> UN  xx.xx.xx..112  61.32 GB  256     37.5%             2547ee54-88dd-4994-a1ad-d9ba367ed11f  rac1
>>
>> Datacenter: dc2
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
>> DN  xx.xx.xx.199   58.39 GB  256     50.0%             6954754a-e9df-4b3c-aca7-146b938515d8  rac1
>> DN  xx.xx.xx..61   33.79 GB  256     50.0%             91b8d510-966a-4f2d-a666-d7edbe986a1c  rac1
>>
>> Thank you in advance,
>>
>> Daning
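P.S. On the read-repair theory in the quoted question: if cross-DC read
repair does turn out to be a factor, one knob worth knowing about
(Cassandra 1.1+) is the per-column-family read repair chance. A
hypothetical cassandra-cli sketch against the dsat keyspace above:

    use dsat;

    -- keep read repair, but only among replicas in the local data
    -- center; the global (cross-DC) read repair chance drops to zero
    update column family categorization_cache
      with read_repair_chance = 0.0
      and dclocal_read_repair_chance = 0.1;

This only tunes background repair traffic; it does not change which
replicas the coordinator contacts for CL.ONE reads.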