Date: Sun, 27 Aug 2017 21:01:00 +0000 (UTC)
From: "Jeff Jirsa (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-13780) ADD Node streaming throughput performance

     [ https://issues.apache.org/jira/browse/CASSANDRA-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-13780:
-----------------------------------
    Priority: Major  (was: Blocker)

> ADD Node streaming throughput performance
> -----------------------------------------
>
>                 Key: CASSANDRA-13780
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13780
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Linux 2.6.32-696.3.2.el6.x86_64 #1 SMP Mon Jun 19 11:55:55 PDT 2017 x86_64 x86_64 x86_64 GNU/Linux
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                40
> On-line CPU(s) list:   0-39
> Thread(s) per core:    2
> Core(s) per socket:    10
> Socket(s):             2
> NUMA node(s):          2
> Vendor ID:             GenuineIntel
> CPU family:            6
> Model:                 79
> Model name:            Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
> Stepping:              1
> CPU MHz:               2199.869
> BogoMIPS:              4399.36
> Virtualization:        VT-x
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              25600K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>                     total   used   free   shared   buffers   cached
> Mem:                 252G   217G    34G     708K      308M     149G
> -/+ buffers/cache:           67G   185G
> Swap:                 16G     0B    16G
>            Reporter: Kevin Rivait
>             Fix For: 3.0.9
>
>
> Problem: Adding a new node to a large cluster runs at least 1000x slower than what the network and node hardware capacity can support, taking several days per new node. Adjusting stream throughput and other YAML parameters seems to have no effect on performance. Essentially, it appears that Cassandra has an architectural scalability problem when adding new nodes to a cluster with moderate to high data ingestion, because Cassandra cannot add new node capacity fast enough to keep up with increasing data ingestion volumes and growth.
> Initial Configuration:
> Running 3.0.9 and have implemented TWCS on one of our largest tables.
> Largest table partitioned on (ID, YYYYMM) using 1 day buckets with a TTL of 60 days.
> Next release will change partitioning to (ID, YYYYMMDD) so that partitions are aligned with daily TWCS buckets.
> Each node is currently creating roughly a 30GB SSTable per day.
> TWCS is working as expected; daily SSTables drop off after 70 days (60 + 10 day grace).
> Current deployment is a 28 node, 2 datacenter cluster, 14 nodes in each DC, replication factor 3.
> Data directories are backed by four 2TB SSDs on each node, plus one 800GB SSD for commit logs.
> Requirement is to double cluster size, capacity, and ingestion volume within a few weeks.
> Observed Behavior:
> 1. Streaming throughput during add node – we observed a maximum of 6 Mb/s streaming from each of the 14 nodes on a 20Gb/s switched network, taking at least 106 hours for each node to join the cluster, and each node is only about 2.2 TB in size.
> 2. Compaction on the newly added node - compaction has fallen behind, with anywhere from 4,000 to 10,000 SSTables at any given time. It took 3 weeks for compaction to finish on each newly added node. Increasing the number of compaction threads to match the number of CPUs (40) and increasing compaction throughput to 32MB/s seemed to be the sweet spot.
> 3. TWCS buckets on the new node - data was streamed to this node over 4 1/2 days. Compaction correctly placed the data in daily files, but the problem is that the file dates reflect when compaction created the file and not the date of the last record written in the TWCS bucket, which will cause the files to remain around much longer than necessary.
> Two Questions:
> 1. What can be done to substantially improve the performance of adding a new node?
> 2. Can compaction on TWCS partitions for newly added nodes change the file creation date to match the highest date record in the file -or- add another piece of metadata to the TWCS files that reflects the file drop date, so that TWCS partitions can be dropped consistently?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
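
For anyone who wants to sanity-check the figures in the report, the arithmetic behind "2.2 TB in 106 hours on a 20Gb/s network" and "60 + 10 day grace" can be sketched as below. This is only a rough back-of-envelope aid, not anything taken from Cassandra itself: the function names are hypothetical, decimal units are assumed (1 TB = 1e12 bytes, 1 Gb = 1e9 bits), and the drop-date helper simply adds the TTL and grace period to the date of the newest record, which is the behavior the reporter asks for in question 2 rather than a description of what TWCS does.

```python
from datetime import date, timedelta

# Figures quoted in the report. Decimal units are an assumption.
NODE_SIZE_BYTES = 2.2e12      # "about 2.2 TB" per node
OBSERVED_HOURS = 106          # "at least 106 hours" to join
LINK_BITS_PER_S = 20e9        # "20Gb/s switched network"


def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to move size_bytes at a sustained byte rate."""
    return size_bytes / rate_bytes_per_s / 3600.0


def implied_rate_mb_s(size_bytes: float, hours: float) -> float:
    """Aggregate rate (MB/s, decimal) implied by moving size_bytes in `hours`."""
    return size_bytes / (hours * 3600.0) / 1e6


def earliest_drop_date(newest_record: date,
                       ttl_days: int = 60,
                       grace_days: int = 10) -> date:
    """Date on which a bucket could be dropped if expiry were keyed to the
    newest record it holds (hypothetical rule: newest record + TTL + grace)."""
    return newest_record + timedelta(days=ttl_days + grace_days)


if __name__ == "__main__":
    rate = implied_rate_mb_s(NODE_SIZE_BYTES, OBSERVED_HOURS)
    ideal = transfer_hours(NODE_SIZE_BYTES, LINK_BITS_PER_S / 8)
    print(f"aggregate rate implied by the join time: {rate:.2f} MB/s")
    print(f"same data at raw link rate: {ideal * 60:.1f} minutes")
    print(f"ratio of observed to link-rate time: {OBSERVED_HOURS / ideal:.0f}x")
    print("bucket written 2017-08-01 droppable on:",
          earliest_drop_date(date(2017, 8, 1)))
```

The link-rate figure ignores disk, CPU, and protocol overhead, so it is an upper bound on what the hardware could do, not a target.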