From: Kurtis Norwood
Date: Mon, 13 Nov 2017 14:51:00 -0800
Subject: High IO Util using TimeWindowCompaction
To: user@cassandra.apache.org
archived-at: Mon, 13 Nov 2017 22:51:07 -0000

I've been testing out Cassandra 3.11 (currently using 3.7) and have been observing occasional very high IO util, which sometimes flatlines at 100% for an extended period. I think my use case is pretty simple, and I'm currently only testing part of it on this new version, so I'm looking for advice on what might be going wrong.

Use Case: I am using Cassandra as basically a large "set"; my table schema is incredibly simple, just a primary key. Records are all written with the same TTL (7 days). The only queries are inserting a key (which we expect to happen only once) and checking whether that key exists in the table. In my 3.7 cluster I am using DateTieredCompaction and running on c3.4xlarge (x30) in AWS. I've been experimenting with i3.4xlarge and also wanted to try TimeWindowCompaction to see if we could get better performance when adding machines to the cluster. That was always a really painful experience in 3.7 with DateTieredCompaction, and the docs say TimeWindowCompaction is ideal for my use case.

Right now I am running a new cluster with 3.11 and TimeWindowCompaction alongside the old cluster and doing writes to both. Reads go only to the old cluster while I go through this preliminary testing, so the 3.11 cluster receives between 90K and 150K writes/second and no reads. This morning the cluster was at 100% IO util for about 30 minutes and eventually recovered. At that time it was only receiving ~100K writes/second. I don't see anything interesting in the logs that indicates what is going on, and I don't think a sudden compaction is the issue since I have limits on compaction throughput.

Staying on 3.7 would be a major bummer, so I'm looking for advice.

Some information that might be useful:

compaction throughput - 16MB/s
concurrent compactors - 4
machine type - i3.4xlarge (x20)
disk - RAID0 across 2 NVMe SSDs

Table Schema looks like this:

CREATE TABLE prod_dedupe.event_hashes (
    app int,
    hash_value blob,
    PRIMARY KEY ((app, hash_value))
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'For deduping'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '4', 'compaction_window_unit': 'HOURS', 'max_threshold': '64', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '4', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 3600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'NONE';

Thanks,
Kurt
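
For reference, the figures quoted in the message can be turned into two rough numbers: how many 4-hour compaction windows fit inside one 7-day TTL, and how many client writes each node sees per second. This is only a back-of-envelope sketch. The function and constant names are hypothetical, the per-node figure assumes writes are spread evenly across the 20 nodes, and it ignores replication because the replication factor is not stated in the message.

```python
# Back-of-envelope arithmetic from the figures in the message.
# All names are illustrative; nothing here talks to a real cluster.

TTL_DAYS = 7                       # TTL applied to every record
WINDOW_HOURS = 4                   # compaction_window_size '4', unit 'HOURS'
NODES = 20                         # i3.4xlarge (x20)
CLUSTER_WRITES_PER_SEC = 100_000   # approximate rate during the 100% IO util period


def windows_per_ttl(ttl_days: int, window_hours: int) -> int:
    """Number of whole compaction windows spanned by one TTL period."""
    return (ttl_days * 24) // window_hours


def client_writes_per_node(cluster_writes: int, nodes: int) -> float:
    """Client writes per node per second, assuming an even spread
    and ignoring replication (the replication factor is not given)."""
    return cluster_writes / nodes


if __name__ == "__main__":
    print(windows_per_ttl(TTL_DAYS, WINDOW_HOURS))                  # 42
    print(client_writes_per_node(CLUSTER_WRITES_PER_SEC, NODES))    # 5000.0
```

So unexpired data spans roughly 42 time windows at any moment, and each node takes on the order of 5,000 client writes per second before replication is counted.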
