cassandra-commits mailing list archives

From "Peter Kovgan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10937) OOM on multiple nodes during test of insert load
Date Thu, 24 Dec 2015 07:59:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070702#comment-15070702
] 

Peter Kovgan edited comment on CASSANDRA-10937 at 12/24/15 7:59 AM:
--------------------------------------------------------------------

Important:
Found 224 GB of hints on one node.
The hints directory is on a small disk (the same disk used for logs).
That disk is full.
This may be part of the problem (other nodes are OK).
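
A minimal sketch for comparing the size of a hints directory with the disk it sits on. The path passed to `hints_report` is a placeholder assumption, not the layout of the affected node; in Cassandra 3.0 the real location is whatever `hints_directory` in cassandra.yaml points at.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if os.path.isfile(full) and not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # A file may be removed while the directory is being walked.
                continue
    return total


def hints_report(hints_dir):
    """Return the hints size next to total and free space of the hosting disk."""
    used = dir_size_bytes(hints_dir)
    usage = shutil.disk_usage(hints_dir)
    return {
        "hints_bytes": used,
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
        "hints_share_of_disk": used / usage.total,
    }


if __name__ == "__main__":
    # Placeholder path (assumption); substitute the node's configured hints directory.
    print(hints_report("."))
```

Running this on every node would show whether the node with the full disk is an outlier or whether hints are accumulating everywhere.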


was (Author: tierhetze):
Important:
Found 224 GB of hints on one node.
The hints directory is on a small disk.
That disk is full.
This may be part of the problem (other nodes are OK).

> OOM on multiple nodes during test of insert load
> ------------------------------------------------
>
>                 Key: CASSANDRA-10937
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10937
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra : 3.0.0
> Installed as open archive, no connection to any OS specific installer.
> Java:
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> OS :
> Linux version 2.6.32-431.el6.x86_64 (mockbuild@x86-023.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Sun Nov 10 22:19:54 EST 2013
> We have:
> 8 guests (Linux OS as above) on 2 VMware-managed physical IBM M6 hosts. Each physical host runs 4 guests.
> Each guest is assigned:
> 1 disk, 300 GB, for seq. log (NOT SSD)
> 1 disk, 4 TB, for data (NOT SSD)
> 11 CPU cores
> Disks are local, not shared.
> Memory on each host - 24 GB total.
> 8 GB - Cassandra heap
> (lshw and cpuinfo attached in file)
>            Reporter: Peter Kovgan
>         Attachments: more-logs.rar, some-heap-stats.rar, test2.rar
>
>
> 8 cassandra nodes.
> Load test started with 4 clients (different, non-identical machines), each running 1000 threads.
> Each thread is assigned, round-robin, to run one of 4 different inserts.
> Consistency level: ONE.
> I attach the full CQL schema of tables and the query of insert.
> Replication factor - 2:
> create keyspace OBLREPOSITORY_NY with replication = {'class':'NetworkTopologyStrategy','NY':2};
> Initial throughput is:
> 215,000 inserts/sec
> or
> 54 MB/sec, assuming a single insert is slightly larger than 256 bytes.
> Data:
> All fields (5-6) are short strings, except one, which is a 256-byte BLOB.
> After about 2-3 hours of work, I was forced to increase the timeout from 2000 to 5000 ms, because some requests failed with the shorter timeout.
> Later on (after approx. 12 hours of work), OOM happens on multiple nodes.
> (all failed nodes logs attached)
> I also attach the Java load client and instructions on how to set it up and use it.
> The test is important for our strategic project and we hope it is curable.
> Attachments:
> test2.rar - contains most of the material
> more-logs.rar - contains logs from additional nodes
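
The load shape in the quoted description can be sketched as a quick back-of-envelope check. The figures (4 clients, 1000 threads each, 215,000 inserts/sec, a 256-byte BLOB, replication factor 2) come from the description; the insert labels are hypothetical stand-ins for the four statements in the attached schema, and the calculation counts only the BLOB payload, not the short string columns or any storage overhead.

```python
from collections import Counter
from itertools import cycle, islice

# Figures taken from the quoted description.
CLIENTS = 4
THREADS_PER_CLIENT = 1000
INSERTS_PER_SEC = 215_000
BLOB_BYTES = 256
REPLICATION_FACTOR = 2

# Hypothetical labels; the real statements are in the attached schema.
INSERT_KINDS = ["insert_1", "insert_2", "insert_3", "insert_4"]


def assign_round_robin(num_threads, kinds):
    """Give thread i the statement kinds[i % len(kinds)]."""
    return list(islice(cycle(kinds), num_threads))


def payload_rate(inserts_per_sec, payload_bytes):
    """Bytes per second of payload for a given insert rate."""
    return inserts_per_sec * payload_bytes


per_client = Counter(assign_round_robin(THREADS_PER_CLIENT, INSERT_KINDS))
client_rate = payload_rate(INSERTS_PER_SEC, BLOB_BYTES)
replica_rate = client_rate * REPLICATION_FACTOR

print(per_client)
print(f"{client_rate / 1e6:.2f} MB/s from clients (BLOB payload only)")
print(f"{replica_rate / 1e6:.2f} MB/s across replicas at RF={REPLICATION_FACTOR}")
```

The BLOB payload alone works out to about 55 MB/s in decimal units, in the same range as the reported 54 MB/sec. With two replicas per row, the cluster as a whole has to persist roughly twice that.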



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
