From: Yuan Fang <yuan@kryptoncloud.com>
Date: Tue, 12 Jul 2016 11:18:00 -0700
Subject: Re: Is my cluster normal?
To: user@cassandra.apache.org

Hi Jonathan,

The IOs are like below. I am not sure why one node always has a much bigger kB_read/s than the other nodes. That does not seem good.

==============
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          54.78   24.48    9.35    0.96    0.08   10.35

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              2.31        14.64        17.95    1415348    1734856
xvdf            252.68     11789.51      6394.15 1139459318  617996710

=============
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.71    6.57    3.96    0.50    0.19   66.07

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              1.12         3.63        10.59    3993540   11648848
xvdf             68.20       923.51      2526.86 1016095212 2780187819

===============
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.31    8.08    3.70    0.26    0.23   65.42

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              1.07         2.87        10.89    3153996   11976704
xvdf             34.48       498.21      2293.70  547844196 2522227746

================
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.75    8.13    3.82    0.36    0.21   64.73

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
xvda              1.10         3.20        11.33    3515752   12442344
xvdf             44.45       474.30      2511.71  520758840 2757732583


On Thu, Jul 7, 2016 at 6:54 PM, Jonathan Haddad <jon@jonhaddad.com> wrote:

> What's your CPU looking like? If it's low, check your IO with iostat or
> dstat. I know some people have used EBS and say it's fine, but I've been
> burned too many times.
>
> On Thu, Jul 7, 2016 at 6:12 PM Yuan Fang <yuan@kryptoncloud.com> wrote:
>
>> Hi Riccardo,
>>
>> Very low IO-wait, about 0.3%.
>> No stolen CPU. It is a Cassandra-only instance. I did not see any
>> dropped messages.
>>
>> ubuntu@cassandra1:/mnt/data$ nodetool tpstats
>> Pool Name                        Active   Pending      Completed   Blocked  All time blocked
>> MutationStage                         1         1      929509244         0                 0
>> ViewMutationStage                     0         0              0         0                 0
>> ReadStage                             4         0        4021570         0                 0
>> RequestResponseStage                  0         0      731477999         0                 0
>> ReadRepairStage                       0         0         165603         0                 0
>> CounterMutationStage                  0         0              0         0                 0
>> MiscStage                             0         0              0         0                 0
>> CompactionExecutor                    2        55          92022         0                 0
>> MemtableReclaimMemory                 0         0           1736         0                 0
>> PendingRangeCalculator                0         0              6         0                 0
>> GossipStage                           0         0         345474         0                 0
>> SecondaryIndexManagement              0         0              0         0                 0
>> HintsDispatcher                       0         0              4         0                 0
>> MigrationStage                        0         0             35         0                 0
>> MemtablePostFlush                     0         0           1973         0                 0
>> ValidationExecutor                    0         0              0         0                 0
>> Sampler                               0         0              0         0                 0
>> MemtableFlushWriter                   0         0           1736         0                 0
>> InternalResponseStage                 0         0           5311         0                 0
>> AntiEntropyStage                      0         0              0         0                 0
>> CacheCleanupExecutor                  0         0              0         0                 0
>> Native-Transport-Requests           128       128      347508531         2          15891862
>>
>> Message type           Dropped
>> READ                         0
>> RANGE_SLICE                  0
>> _TRACE                       0
>> HINT                         0
>> MUTATION                     0
>> COUNTER_MUTATION             0
>> BATCH_STORE                  0
>> BATCH_REMOVE                 0
>> REQUEST_RESPONSE             0
>> PAGED_RANGE                  0
>> READ_REPAIR                  0
>>
>>
>> On Thu, Jul 7, 2016 at 5:24 PM, Riccardo Ferrari <ferrarir@gmail.com> wrote:
>>
>>> Hi Yuan,
>>>
>>> Your machine instance has 4 vCPUs, which is 4 threads (not cores!). Aside
>>> from any Cassandra-specific discussion, a system load of 10 on a 4-thread
>>> machine is way too much in my opinion. If that is the running average
>>> system load, I would look deeper into the system details. Is that IO wait?
>>> Is that stolen CPU? Is that a Cassandra-only instance, or are there other
>>> processes pushing the load?
>>> What does your "nodetool tpstats" say? How many dropped messages do you
>>> have?
>>>
>>> Best,
>>>
>>> On Fri, Jul 8, 2016 at 12:34 AM, Yuan Fang <yuan@kryptoncloud.com> wrote:
>>>
>>>> Thanks Ben! From the post, it seems they got a slightly better but
>>>> similar result to mine. Good to know.
>>>> I am not sure whether a little fine tuning of the heap memory would help.
>>>>
>>>>
>>>> On Thu, Jul 7, 2016 at 2:58 PM, Ben Slater <ben.slater@instaclustr.com> wrote:
>>>>
>>>>> Hi Yuan,
>>>>>
>>>>> You might find this blog post a useful comparison:
>>>>>
>>>>> https://www.instaclustr.com/blog/2016/01/07/multi-data-center-apache-spark-and-apache-cassandra-benchmark/
>>>>>
>>>>> Although the focus is on Spark and Cassandra and multi-DC, there are
>>>>> also some single-DC benchmarks of m4.xl clusters, plus some discussion of
>>>>> how we went about benchmarking.
>>>>>
>>>>> Cheers
>>>>> Ben
>>>>>
>>>>>
>>>>> On Fri, 8 Jul 2016 at 07:52 Yuan Fang <yuan@kryptoncloud.com> wrote:
>>>>>
>>>>>> Yes, here is my stress test result:
>>>>>> Results:
>>>>>> op rate                   : 12200 [WRITE:12200]
>>>>>> partition rate            : 12200 [WRITE:12200]
>>>>>> row rate                  : 12200 [WRITE:12200]
>>>>>> latency mean              : 16.4 [WRITE:16.4]
>>>>>> latency median            : 7.1 [WRITE:7.1]
>>>>>> latency 95th percentile   : 38.1 [WRITE:38.1]
>>>>>> latency 99th percentile   : 204.3 [WRITE:204.3]
>>>>>> latency 99.9th percentile : 465.9 [WRITE:465.9]
>>>>>> latency max               : 1408.4 [WRITE:1408.4]
>>>>>> Total partitions          : 1000000 [WRITE:1000000]
>>>>>> Total errors              : 0 [WRITE:0]
>>>>>> total gc count            : 0
>>>>>> total gc mb               : 0
>>>>>> total gc time (s)         : 0
>>>>>> avg gc time(ms)           : NaN
>>>>>> stdev gc time(ms)         : 0
>>>>>> Total operation time      : 00:01:21
>>>>>> END
>>>>>>
>>>>>> On Thu, Jul 7, 2016 at 2:49 PM, Ryan Svihla <rs@foundev.pro> wrote:
>>>>>>
>>>>>>> Lots of variables you're leaving out.
>>>>>>>
>>>>>>> It depends on the write size, whether you're using logged batches or
>>>>>>> not, what consistency level, what RF, and whether the writes come in
>>>>>>> bursts, etc. However, that's all somewhat moot for determining "normal";
>>>>>>> really you need a baseline, as all those variables end up mattering a
>>>>>>> huge amount.
>>>>>>>
>>>>>>> I would suggest using cassandra-stress as a baseline and going from
>>>>>>> there depending on what those numbers say (just pick the defaults).
>>>>>>>
>>>>>>> Sent from my iPhone
>>>>>>>
>>>>>>> On Jul 7, 2016, at 4:39 PM, Yuan Fang <yuan@kryptoncloud.com> wrote:
>>>>>>>
>>>>>>> Yes, it is about 8k writes per node.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 7, 2016 at 2:18 PM, daemeon reiydelle <daemeonr@gmail.com> wrote:
>>>>>>>
>>>>>>>> Are you saying 7k writes per node, or 30k writes per node?
>>>>>>>>
>>>>>>>> .......
>>>>>>>> Daemeon C.M. Reiydelle
>>>>>>>> USA (+1) 415.501.0198
>>>>>>>> London (+44) (0) 20 8144 9872
>>>>>>>>
>>>>>>>> On Thu, Jul 7, 2016 at 2:05 PM, Yuan Fang <yuan@kryptoncloud.com> wrote:
>>>>>>>>
>>>>>>>>> Writes at 30k/second are the main thing.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Jul 7, 2016 at 1:51 PM, daemeon reiydelle <daemeonr@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Assuming you meant 100k, that is likely for something with 16 MB of
>>>>>>>>>> storage (probably way small) where the data is more than 64k and
>>>>>>>>>> hence will not fit into the row cache.
>>>>>>>>>>
>>>>>>>>>> .......
>>>>>>>>>> Daemeon C.M. Reiydelle
>>>>>>>>>> USA (+1) 415.501.0198
>>>>>>>>>> London (+44) (0) 20 8144 9872
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 7, 2016 at 1:25 PM, Yuan Fang <yuan@kryptoncloud.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I have a cluster of 4 m4.xlarge nodes (4 CPUs, 16 GB memory, and
>>>>>>>>>>> 600 GB SSD EBS each).
>>>>>>>>>>> I can reach cluster-wide write requests of about 30k/second and
>>>>>>>>>>> read requests of about 100/second. The cluster OS load is
>>>>>>>>>>> constantly above 10. Is that normal?
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Yuan
>>>>>>>>>>>
>>>>>>
>>>>> --
>>>>> ————————
>>>>> Ben Slater
>>>>> Chief Product Officer
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>> +61 437 929 798
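
As a concrete illustration of Jonathan's suggestion above to check IO with iostat or dstat, a minimal invocation might look like the following; the 5-second interval, the sample count, and the vmstat addition are my own choices rather than values from this thread:

    # extended per-device statistics every 5 seconds, 10 samples
    iostat -x -d 5 10

    # CPU-side view: the "wa" (IO wait) and "st" (steal) columns speak to
    # Riccardo's questions about IO wait and stolen CPU
    vmstat 5 10

Comparing the xvdf line across nodes over the same window is what exposes the kB_read/s imbalance Yuan mentions above.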
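
Likewise, Ryan's advice to establish a baseline maps to a plain cassandra-stress run with the defaults. A sketch, where the contact point 10.0.0.1 is a placeholder rather than an address from this thread:

    # default write workload: 1,000,000 partitions against one contact point
    cassandra-stress write n=1000000 -node 10.0.0.1

    # optional follow-up read pass over the same data
    cassandra-stress read n=1000000 -node 10.0.0.1

The op rate and latency percentiles it prints are directly comparable to the stress output Yuan posted earlier in the thread.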