Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Thu, 2 Apr 2015 14:45:55 +0000 (UTC)
From: "Sergey Maznichenko (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12787468.1427922296000.106868.1427985955177@Atlassian.JIRA>
In-Reply-To: <JIRA.12787468.1427922296000@Atlassian.JIRA>
References: <JIRA.12787468.1427922296000@Atlassian.JIRA>
 <JIRA.12787468.1427922296833@arcas>
Subject: [jira] [Updated] (CASSANDRA-9092) Nodes in DC2 die during and after
 huge write workload
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/CASSANDRA-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Maznichenko updated CASSANDRA-9092:
------------------------------------------
    Description: 
Hello,

We run Cassandra 2.1.2 on 8 nodes, 4 in DC1 and 4 in DC2.
Each node is a VM with 8 CPUs and 32GB RAM.
During a heavy write workload (loading several million blobs of ~3.5MB each), 1 node in DC2 stopped, and after some time 2 more nodes in DC2 also stopped.
Now 2 of the nodes in DC2 do not work: each stops 5-10 minutes after start. I see many files in the system.hints table, and the error appears 2-3 minutes after system.hints auto compaction starts.

"Stops" means "ERROR [CompactionExecutor:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.OutOfMemoryError: Java heap space"

Full errors listing attached in 

The problem exists only in DC2. We have 1GbE between DC1 and DC2.


  was:
Hello,

We run Cassandra 2.1.2 on 8 nodes, 4 in DC1 and 4 in DC2.
Each node is a VM with 8 CPUs and 32GB RAM.
During a heavy write workload (loading several million blobs of ~3.5MB each), 1 node in DC2 stopped, and after some time 2 more nodes in DC2 also stopped.
Now 2 of the nodes in DC2 do not work: each stops 5-10 minutes after start. I see many files in the system.hints table, and the error appears 2-3 minutes after system.hints auto compaction starts.

"Stops" means "ERROR [CompactionExecutor:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.OutOfMemoryError: Java heap space"

The problem exists only in DC2. We have 1GbE between DC1 and DC2.


> Nodes in DC2 die during and after huge write workload
> -----------------------------------------------------
>
>                 Key: CASSANDRA-9092
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9092
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: CentOS 6.2 64-bit, Cassandra 2.1.2, 
> java version "1.7.0_71"
> Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
>            Reporter: Sergey Maznichenko
>             Fix For: 2.1.5
>
>         Attachments: cassandra_crash1.txt
>
>
> Hello,
> We run Cassandra 2.1.2 on 8 nodes, 4 in DC1 and 4 in DC2.
> Each node is a VM with 8 CPUs and 32GB RAM.
> During a heavy write workload (loading several million blobs of ~3.5MB each), 1 node in DC2 stopped, and after some time 2 more nodes in DC2 also stopped.
> Now 2 of the nodes in DC2 do not work: each stops 5-10 minutes after start. I see many files in the system.hints table, and the error appears 2-3 minutes after system.hints auto compaction starts.
> "Stops" means "ERROR [CompactionExecutor:1] 2015-04-01 23:33:44,456 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:1,1,main]
> java.lang.OutOfMemoryError: Java heap space"
> Full errors listing attached in 
> The problem exists only in DC2. We have 1GbE between DC1 and DC2.

