incubator-cassandra-user mailing list archives

From Ran Tavory <ran...@gmail.com>
Subject Re: cassandra out of heap space crash
Date Thu, 10 Jun 2010 21:38:53 GMT
I can't say exactly how much memory is the right amount, but 1G is surely
very little.
By replicating 3 times, your cluster now does 3 times as much work as it used
to, both on reads and on writes, while the readers/writers continue
hammering it at the same pace.
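
To put rough numbers on that, here is a back-of-the-envelope sketch using the figures from the quoted message below. It assumes "100K bytes" means 100,000 bytes and that rows are spread evenly; the function name is made up for illustration. It only shows how the volume each node has to absorb grows with the replication factor, not how much heap Cassandra actually needs.

```python
# Rough sizing from the figures in the quoted message (assumptions noted inline).
NODES = 8
ROWS_PER_NODE = 100_000      # "about 100,000 rows of data on each node"
ROW_BYTES = 100 * 1000       # "about 100K bytes"; assumed to mean 100,000 bytes
HEAP_BYTES = 1174339584      # "max is 1174339584" from the GC inspection lines


def bytes_stored_per_node(replication_factor: int) -> int:
    """Bytes each node ends up holding, assuming perfectly even distribution."""
    unique_bytes = NODES * ROWS_PER_NODE * ROW_BYTES
    return unique_bytes * replication_factor // NODES


for rf in (1, 3):
    per_node = bytes_stored_per_node(rf)
    print(f"RF={rf}: {per_node / 1e9:.0f} GB written per node, "
          f"{per_node / HEAP_BYTES:.1f}x the configured heap")
```

Under these assumptions each node goes from taking about 10 GB of writes to about 30 GB, with the same 1G heap in front of it.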

So once you've upped your memory (try 4G; if that's not enough, 8G, etc.), if this
still doesn't help, you'll want to look at either adding capacity or slowing
down your writes.
Which consistency level are you writing with? You could try ALL; this should
slow down your writes just as much as the cluster needs to catch its breath
(or so I hope, I've never actually tried that...)
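
If you'd rather slow the writes down on the client side, a minimal pacing loop could look like the sketch below. This is only an illustration: `write_batch` is a hypothetical stand-in for whatever call your app uses to send 100 rows to Cassandra, and the rate is something you would have to tune by experiment.

```python
import time


def write_throttled(batches, write_batch, max_batches_per_sec,
                    sleep=time.sleep, clock=time.monotonic):
    """Send each batch via write_batch, never faster than max_batches_per_sec.

    write_batch is a placeholder for the application's own write call.
    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    min_interval = 1.0 / max_batches_per_sec
    next_allowed = clock()
    sent = 0
    for batch in batches:
        wait = next_allowed - clock()
        if wait > 0:
            sleep(wait)                      # hold back until the next slot opens
        next_allowed = clock() + min_interval
        write_batch(batch)
        sent += 1
    return sent
```

For example, `write_throttled(batches, my_client_write, max_batches_per_sec=5)` would cap one client at five 100-row batches per second (`my_client_write` being your own function).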

On Fri, Jun 11, 2010 at 12:26 AM, Julie <julie.sugar@nextcentury.com> wrote:

> I am running an 8 node cassandra cluster with each node on its own
> dedicated VM.
>
> My app very quickly populates the database with about 100,000 rows of data
> (each row is about 100K bytes) times the number of nodes in my cluster so
> there's about 100,000 rows of data on each node (seems very evenly
> distributed).
>
> I have been running my app fairly successfully but today changed the
> replication factor from 1 to 3. (I first took down the servers, nuked their
> data directories, copied over the new storage-conf.xml to each node, then
> restarted the servers.)  My app begins by populating the database with fresh
> data.  During the writing phase, all the cassandra servers, one by one,
> started getting an out-of-memory exception.  Here's the output from the
> first to die:
>
> INFO [COMMIT-LOG-WRITER] 2010-06-10 14:18:54,609 CommitLog.java (line 407) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1276193883235.log)
> INFO [ROW-MUTATION-STAGE:5] 2010-06-10 14:18:55,499 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(Standard1)@19571399
>
> INFO [GMFD:1] 2010-06-10 14:19:01,556 Gossiper.java (line 568) InetAddress /10.210.69.221 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:35,136 Gossiper.java (line 568) InetAddress /10.254.242.228 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:35,137 Gossiper.java (line 568) InetAddress /10.201.207.129 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:36,922 Gossiper.java (line 568) InetAddress /10.198.37.241 is now UP
>
> INFO [GC inspection] 2010-06-10 14:19:03,722 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048 used; max is 1174339584
> INFO [GC inspection] 2010-06-10 14:21:09,068 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used; max is 1174339584
> INFO [Timer-1] 2010-06-10 14:21:09,068 Gossiper.java (line 179) InetAddress /10.198.37.241 is now dead.
> INFO [Timer-1] 2010-06-10 14:21:12,045 Gossiper.java (line 179) InetAddress /10.210.69.221 is now dead.
> INFO [GMFD:1] 2010-06-10 14:21:12,046 Gossiper.java (line 568) InetAddress /10.210.203.210 is now UP
> INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.210.69.221 is now UP
> INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.192.218.117 is now UP
> INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) InetAddress /10.198.37.241 is now UP
> INFO [GMFD:1] 2010-06-10 14:21:12,307 Gossiper.java (line 568) InetAddress /10.254.138.226 is now UP
> ERROR [ROW-MUTATION-STAGE:25] 2010-06-10 14:21:15,127 CassandraDaemon.java (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:25,5,main]
> java.lang.OutOfMemoryError: Java heap space
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:84)
>        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:29)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:117)
>        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:108)
>        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:359)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:369)
>        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:322)
>        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:45)
>        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:619)
> ERROR [ROW-MUTATION-STAGE:18] 2010-06-10 14:21:15,129 CassandraDaemon.java (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:18,5,main]
>
>
>
> Within 15 minutes, all 8 nodes died while my app continued trying to
> populate the database.  Is there something I am doing wrong?  I am
> populating the database very quickly by writing 100 rows at once in each of
> 8 clients, until each client has written 100,000 rows.   All of my cassandra
> servers are started up with 1GB of heap space:
> /usr/bin/java -ea -Xms128M -Xmx1G …
>
> Thank you for your help!
> Julie
>
>
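
The two GC inspection lines in the quoted log already show how tight the heap was before the crash. As a small sketch (the regular expression is written against those two lines only, not against any documented GCInspector format), the occupancy after each collection can be computed like this:

```python
import re

# Pattern fitted to the two GC inspection lines quoted above; an assumption,
# not an official log format.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms, (?P<reclaimed>\d+) reclaimed "
    r"leaving (?P<used>\d+) used; max is (?P<max>\d+)"
)


def heap_occupancy(line: str) -> float:
    """Fraction of the max heap still in use after the collection on this line."""
    m = GC_LINE.search(line)
    if m is None:
        raise ValueError("not a GC inspection line")
    return int(m["used"]) / int(m["max"])


lines = [
    "GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048 used; max is 1174339584",
    "GC for ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used; max is 1174339584",
]
for line in lines:
    print(f"{heap_occupancy(line):.1%} of the heap still in use after the collection")
```

Both collections took over two seconds and left the heap roughly 85 to 91 percent full, which fits the OutOfMemoryError that follows a few seconds later.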
