Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 51D9E10E66 for ; Wed, 20 Nov 2013 04:41:40 +0000 (UTC) Received: (qmail 23155 invoked by uid 500); 20 Nov 2013 04:41:36 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 22733 invoked by uid 500); 20 Nov 2013 04:41:30 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 22702 invoked by uid 99); 20 Nov 2013 04:41:27 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 20 Nov 2013 04:41:27 +0000 Date: Wed, 20 Nov 2013 04:41:27 +0000 (UTC) From: "Jonathan Ellis (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Comment Edited] (CASSANDRA-6345) Endpoint cache invalidation causes CPU spike (on vnode rings?) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827329#comment-13827329 ] Jonathan Ellis edited comment on CASSANDRA-6345 at 11/20/13 4:40 AM: --------------------------------------------------------------------- v4 attached that uses a versioning approach like yours. I dropped the readLock acquire on version read since it's not necessary to block callers during the update. (A few extra over-broad replica set operations won't hurt.) was (Author: jbellis): v4 attached that uses a versioning approach like yours. I dropped the readLock acquire on version read since it's not necessary to block callers during the update. (A few extra over-broad replica set won't hurt.) > Endpoint cache invalidation causes CPU spike (on vnode rings?) > -------------------------------------------------------------- > > Key: CASSANDRA-6345 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6345 > Project: Cassandra > Issue Type: Bug > Environment: 30 nodes total, 2 DCs > Cassandra 1.2.11 > vnodes enabled (256 per node) > Reporter: Rick Branson > Assignee: Jonathan Ellis > Fix For: 1.2.12, 2.0.3 > > Attachments: 6345-rbranson-v2.txt, 6345-rbranson.txt, 6345-v2.txt, 6345-v3.txt, 6345-v4.txt, 6345.txt, half-way-thru-6345-rbranson-patch-applied.png > > > We've observed that events which cause invalidation of the endpoint cache (update keyspace, add/remove nodes, etc) in AbstractReplicationStrategy result in several seconds of thundering herd behavior on the entire cluster. > A thread dump shows over a hundred threads (I stopped counting at that point) with a backtrace like this: > at java.net.Inet4Address.getAddress(Inet4Address.java:288) > at org.apache.cassandra.locator.TokenMetadata$1.compare(TokenMetadata.java:106) > at org.apache.cassandra.locator.TokenMetadata$1.compare(TokenMetadata.java:103) > at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351) > at java.util.TreeMap.getEntry(TreeMap.java:322) > at java.util.TreeMap.get(TreeMap.java:255) > at com.google.common.collect.AbstractMultimap.put(AbstractMultimap.java:200) > at com.google.common.collect.AbstractSetMultimap.put(AbstractSetMultimap.java:117) > at com.google.common.collect.TreeMultimap.put(TreeMultimap.java:74) > at com.google.common.collect.AbstractMultimap.putAll(AbstractMultimap.java:273) > at com.google.common.collect.TreeMultimap.putAll(TreeMultimap.java:74) > at org.apache.cassandra.utils.SortedBiMultiValMap.create(SortedBiMultiValMap.java:60) > at org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:598) > at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:104) > at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2671) > at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375) > It looks like there's a large amount of cost in the TokenMetadata.cloneOnlyTokenMap that AbstractReplicationStrategy.getNaturalEndpoints is calling each time there is a cache miss for an endpoint. It seems as if this would only impact clusters with large numbers of tokens, so it's probably a vnodes-only issue. > Proposal: In AbstractReplicationStrategy.getNaturalEndpoints(), cache the cloned TokenMetadata instance returned by TokenMetadata.cloneOnlyTokenMap(), wrapping it with a lock to prevent stampedes, and clearing it in clearEndpointCache(). Thoughts? -- This message was sent by Atlassian JIRA (v6.1#6144)