Date: Tue, 15 Dec 2015 12:15:46 +0000 (UTC)
From: "Rafael Harutyunyan (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-10871) MemtableFlushWriter blocks and no flushing happens

    [ https://issues.apache.org/jira/browse/CASSANDRA-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057948#comment-15057948 ]

Rafael Harutyunyan edited comment on CASSANDRA-10871 at 12/15/15 12:14 PM:
---------------------------------------------------------------------------

Yes, I have 14 servers. 12 with 64 vnodes, 2 with 256 vnodes (they are on proportionally bigger hardware). The two nodes with 256 vnodes are the ones where I see this issue.
The rest are fine.

was (Author: rahar):
Yes, I have 14 servers. 12 with 64 vnodes, 2 with 256 vnodes (they are on proportionally bigger hardware)

> MemtableFlushWriter blocks and no flushing happens
> --------------------------------------------------
>
>                 Key: CASSANDRA-10871
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10871
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Local Write-Read Paths
>         Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux; Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
>            Reporter: Rafael Harutyunyan
>            Priority: Critical
>         Attachments: full_thread_dump.txt
>
>
> After some time the MemtableFlushWriter thread blocks, which first fills up the FlushWriterQueue and then the MutationStage queue. After this, one of two things might happen: Cassandra drops the queued mutations and everything returns to normal, or it shuts down with insufficient heap space.
> Here is the thread dump.
> {noformat}
> "MemtableFlushWriter:3" - Thread t@2610
>    java.lang.Thread.State: BLOCKED
>     at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>     - waiting to lock (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>     at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>     at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>     at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>     at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>     at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>     - locked <7ef8cd1b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
>
> "MemtableFlushWriter:4" - Thread t@2616
>    java.lang.Thread.State: BLOCKED
>     at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>     - waiting to lock (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>     at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>     at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>     at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>     at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>     at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>     at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>     at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>     - locked <2f842d9b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> and here are the tpstats
> {noformat}
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> CounterMutationStage              0         0              0         0                 0
> ReadStage                         0         0             28         0                 0
> RequestResponseStage              0         0        2020253         0                 0
> MutationStage                    32     63221       27858588         0                 0
> ReadRepairStage                   0         0              0         0                 0
> GossipStage                       0         0          16430         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> AntiEntropyStage                  0         0           3008         0                 0
> MigrationStage                    0         0              0         0                 0
> Sampler                           0         0              0         0                 0
> ValidationExecutor                0         0           1500         0                 0
> CommitLogArchiver                 0         0              0         0                 0
> MiscStage                         0         0              0         0                 0
> MemtableFlushWriter               2       220           3531         0                 0
> MemtableReclaimMemory             0         0           4277         0                 0
> PendingRangeCalculator            0         0             22         0                 0
> MemtablePostFlush                 1       306           5186         0                 0
> CompactionExecutor               36       142           5326         0                 0
> InternalResponseStage             0         0              0         0                 0
> HintedHandoff                     0         0             13         0                 0
>
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> PAGED_RANGE                  0
> BINARY                       0
> READ                         0
> MUTATION                220352
> _TRACE                       0
> REQUEST_RESPONSE             0
> COUNTER_MUTATION             0
> {noformat}
> cfstats reports more than 12k sstables.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)