Date: Fri, 6 Jun 2014 17:04:02 +0000 (UTC)
From: "Jacek Furmankiewicz (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7361) Cassandra locks up in full GC when you assign the entire heap to row cache

    [ https://issues.apache.org/jira/browse/CASSANDRA-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14020052#comment-14020052 ]

Jacek Furmankiewicz commented on CASSANDRA-7361:
------------------------------------------------

Well, we definitely have some sort of issue. I have JNA installed:

{quote}
sudo apt-get install libjna-java
Reading package lists... Done
Building dependency tree
Reading state information... Done
libjna-java is already the newest version.
libjna-java set to manually installed.
{quote}

But when I start 2.0.8 manually from the bin folder, it says:

{quote}
INFO 12:00:08,299 JNA not found. Native methods will be disabled.
{quote}

Considering that JNA is such a critical requirement and the row cache is 2x the size of the heap, it seems wrong to me that Cassandra just starts and then gets itself all locked up within a few hours.

> Cassandra locks up in full GC when you assign the entire heap to row cache
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7361
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7361
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Ubuntu, RedHat, JDK 1.7
>            Reporter: Jacek Furmankiewicz
>            Priority: Minor
>
> We have a long running batch load process, which runs for many hours.
> Massive amount of writes, in large mutation batches (we increase the thrift frame size to 45 MB).
> Everything goes well, but after about 3 hrs of processing everything locks up. We start getting NoHostsAvailable exceptions on the Java application side (with Astyanax as our driver), eventually socket timeouts.
> Looking at Cassandra, we can see that it is using nearly the full 8GB of heap and unable to free it. It spends most of its time in full GC, but the amount of memory does not go down.
> Here is a long sample from jstat to show this over an extended time period
> e.g.
> http://aep.appspot.com/display/NqqEagzGRLO_pCP2q8hZtitnuVU/
> This continues even after we shut down our app. Nothing is connected to Cassandra any more, yet it is still stuck in full GC and cannot free up memory.
> Running nodetool tpstats shows that nothing is pending, all seems OK:
> {quote}
> Pool Name                   Active   Pending      Completed   Blocked  All time blocked
> ReadStage                        0         0       69555935         0                 0
> RequestResponseStage             0         0              0         0                 0
> MutationStage                    0         0       73123690         0                 0
> ReadRepairStage                  0         0              0         0                 0
> ReplicateOnWriteStage            0         0              0         0                 0
> GossipStage                      0         0              0         0                 0
> CacheCleanupExecutor             0         0              0         0                 0
> MigrationStage                   0         0             46         0                 0
> MemoryMeter                      0         0           1125         0                 0
> FlushWriter                      0         0            824         0                30
> ValidationExecutor               0         0              0         0                 0
> InternalResponseStage            0         0             23         0                 0
> AntiEntropyStage                 0         0              0         0                 0
> MemtablePostFlusher              0         0           1783         0                 0
> MiscStage                        0         0              0         0                 0
> PendingRangeCalculator           0         0              1         0                 0
> CompactionExecutor               0         0          74330         0                 0
> commitlog_archiver               0         0              0         0                 0
> HintedHandoff                    0         0              0         0                 0
>
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> PAGED_RANGE                  0
> BINARY                       0
> READ                       585
> MUTATION                 75775
> _TRACE                       0
> REQUEST_RESPONSE             0
> COUNTER_MUTATION             0
> {quote}
> We had this happen on 2 separate boxes, one with 2.0.6, the other with 2.0.8.
> Right now this is a total blocker for us. We are unable to process the customer data and have to abort in the middle of large processing.
> This is a new customer, so we did not have a chance to see if this occurred with 1.1 or 1.2 in the past (we moved to 2.0 recently).
> We have the Cassandra process still running, pls let us know if there is anything else we could run to give you more insight.

--
This message was sent by Atlassian JIRA
(v6.2#6252)