cassandra-user mailing list archives

From James Cipar <jci...@cmu.edu>
Subject Re: Crash when uploading large data sets
Date Thu, 12 May 2011 23:52:03 GMT
It looks like MAX_HEAP_SIZE is set in cassandra-env.sh to be half of my physical memory.  These
are 15GB VMs, so that's 7.5GB for Cassandra.  I would have expected that to work, but I will
override to 13 GB just to see what happens.
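
For anyone checking the same thing on their own nodes, here is a rough shell sketch of the "half of physical memory" arithmetic. It is not the actual logic from cassandra-env.sh; the helper name half_mem_mb is made up for illustration, and the /proc/meminfo lookup assumes Linux.

```shell
# Hypothetical helper: halve a memory size given in kB and report it in MB.
half_mem_mb() {
    echo $(( $1 / 1024 / 2 ))
}

# MemTotal in /proc/meminfo is reported in kB (Linux only, so guard the lookup).
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "physical: $(( total_kb / 1024 )) MB, half: $(half_mem_mb "$total_kb") MB"
fi
```

On a VM with exactly 15 GB (15728640 kB), half_mem_mb gives 7680 MB, i.e. the 7.5 GB mentioned above.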

I've also set up JNA now.  Do you think the missing JNA would have caused the crashes, or is
it just a performance improvement?



On May 12, 2011, at 7:27 PM, Sameer Farooqui wrote:

> The key JVM options for Cassandra are in cassandra.in.sh.
> 
> What are your min and max heap sizes?
> 
> The default setting of max heap size is 1GB. How much RAM do your nodes have? You may
> want to increase this setting. You can also set the -Xmx and -Xms options to the same value
> to keep Java from having to manage heap growth. On a 32-bit machine, you can get a max of
> about 1.6 GB of heap; you can get a lot more on 64-bit.
> 
> Try messing with some of the other settings in the cassandra.in.sh file.
> 
> You may not have DEBUG logging turned on for Cassandra and therefore may not be getting
> the full details of what's going on when the server crashes. In the
> <cassandra-home>/conf/log4j-server.properties file, change this line from the default of
> INFO to DEBUG:
> 
> log4j.rootLogger=INFO,stdout,R
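
A minimal sketch of making that change from the shell, assuming the file contains the line exactly as shown above. The helper name set_root_logger_debug is hypothetical; it writes to a temporary file and renames it instead of relying on `sed -i`, whose syntax differs between GNU and BSD sed.

```shell
# Hypothetical helper: switch the root logger in a log4j properties file
# from INFO to DEBUG, leaving every other line untouched.
set_root_logger_debug() {
    props="$1"
    sed 's/^log4j\.rootLogger=INFO,/log4j.rootLogger=DEBUG,/' "$props" > "$props.tmp" &&
        mv "$props.tmp" "$props"
}

# Usage (path is a placeholder for your own install):
#   set_root_logger_debug <cassandra-home>/conf/log4j-server.properties
```

Cassandra would need a restart to pick up the new level if it only reads this file at startup.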
> 
> 
> Also, you haven't configured JNA on this server. Here's some info about it and how to
> configure it:
> 
> JNA provides Java programs easy access to native shared libraries without writing anything
> but Java code.
> 
> Note from Cassandra developers for why JNA is needed:
> "Linux aggressively swaps out infrequently used memory to make more room for its file
> system buffer cache. Unfortunately, modern generational garbage collectors like the JVM's
> leave parts of its heap un-touched for relatively large amounts of time, leading Linux to
> swap it out. When the JVM finally goes to use or GC that memory, swap hell ensues.
> 
> Setting swappiness to zero can mitigate this behavior but does not eliminate it entirely.
> Turning off swap entirely is effective. But to avoid surprising people who don't know about
> this behavior, the best solution is to tell Linux not to swap out the JVM, and that is what
> we do now with mlockall via JNA.
> 
> Because of licensing issues, we can't distribute JNA with Cassandra, so you must manually
> add it to the Cassandra lib/ directory or otherwise place it on the classpath. If the JNA
> jar is not present, Cassandra will continue as before."
> 
> Get JNA with: 
> cd ~
> wget http://debian.riptano.com/debian/pool/libjna-java_3.2.7-0~nmu.2_amd64.deb
> 
> To install: 
> techlabs@cassandraN1:~$ sudo dpkg -i libjna-java_3.2.7-0~nmu.2_amd64.deb
> (Reading database ... 44334 files and directories currently installed.)
> Preparing to replace libjna-java 3.2.4-2 (using libjna-java_3.2.7-0~nmu.2_amd64.deb) ...
> Unpacking replacement libjna-java ...
> Setting up libjna-java (3.2.7-0~nmu.2) ...
> 
> 
> The deb package will install the JNA jar file to /usr/share/java/jna.jar, but Cassandra
> only loads it if it's on the classpath. The easy way to do this is to create a symlink in
> your Cassandra lib directory (note: replace /home/techlabs with your home dir location):
> ln -s /usr/share/java/jna.jar /home/techlabs/apache-cassandra-0.7.0/lib
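
To confirm the link took effect, something like the following sketch may help. Both helper names are hypothetical. The first checks that jna.jar in the lib directory resolves to a real file. The second looks for the "JNA not found" startup message that Cassandra logs when the jar is missing; where the log file lives depends on your log4j configuration.

```shell
# Hypothetical helper: succeed if the given Cassandra lib directory contains a
# jna.jar that resolves to a real file. `-e` follows symlinks, so a dangling
# link fails the check.
jna_jar_present() {
    [ -e "$1/jna.jar" ]
}

# Hypothetical helper: succeed if the given log file contains the startup
# message Cassandra prints when it cannot load JNA.
jna_missing_in_log() {
    grep -q 'JNA not found' "$1"
}

# Usage (paths are placeholders for your own install):
#   jna_jar_present /home/techlabs/apache-cassandra-0.7.0/lib && echo "jar linked"
#   jna_missing_in_log /path/to/system.log && echo "JNA was not loaded"
```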
> 
> Research:
> http://journal.paul.querna.org/articles/2010/11/11/enabling-jna-in-cassandra/
> 
> 
> - Sameer
> 
> 
> On Thu, May 12, 2011 at 4:15 PM, James Cipar <jcipar@cmu.edu> wrote:
> I'm using Cassandra 0.7.5, and uploading about 200 GB of data total (20 GB unique data),
> to a cluster of 10 servers.  I'm using batch_mutate, and breaking the data up into chunks
> of about 10k records.  Each record is about 5KB, so a total of about 50MB per batch.  When
> I upload a smaller 2 GB data set, everything works fine.  When I upload the 20 GB data set,
> servers will occasionally crash.  Currently I have my client code automatically detect this
> and restart the server, but that is less than ideal.
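
The batch arithmetic above can be checked with a few lines of shell. This is only a sketch: the variable names are made up, decimal units (1 MB = 1000 KB) are assumed, and the batch count assumes the 20 GB "unique data" figure is what gets split into batches.

```shell
records_per_batch=10000   # "about 10k records" per batch_mutate call
record_kb=5               # "about 5KB" per record

batch_mb=$(( records_per_batch * record_kb / 1000 ))   # 50 MB per batch
unique_mb=$(( 20 * 1000 ))                             # 20 GB data set, in MB
batches=$(( unique_mb / batch_mb ))                    # 400 batches

echo "batch size: ${batch_mb} MB, batches for 20 GB: ${batches}"
```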
> 
> I'm not sure what information to gather to determine what's going on here.  Here is a
> sample of a log file from when a crash occurred.  The crash was immediately after the log
> entry tagged "2011-05-12 19:02:19,377".  Any idea what's going on here?  Any other info I
> can gather to try to debug this?
> 
>  INFO [ScheduledTasks:1] 2011-05-12 19:02:07,855 GCInspector.java (line 128) GC for ParNew: 375 ms, 576641232 reclaimed leaving 5471432144 used; max is 7774142464
>  INFO [ScheduledTasks:1] 2011-05-12 19:02:08,857 GCInspector.java (line 128) GC for ParNew: 450 ms, -63738232 reclaimed leaving 5546942544 used; max is 7774142464
>  INFO [COMMIT-LOG-WRITER] 2011-05-12 19:02:10,652 CommitLogSegment.java (line 50) Creating new commitlog segment /mnt/scratch/jcipar/cassandra/commitlog/CommitLog-1305241330652.log
>  INFO [MutationStage:24] 2011-05-12 19:02:10,680 ColumnFamilyStore.java (line 1070) Enqueuing flush of Memtable-Standard1@1256245282(51921529 bytes, 1115783 operations)
>  INFO [FlushWriter:1] 2011-05-12 19:02:10,680 Memtable.java (line 158) Writing Memtable-Standard1@1256245282(51921529 bytes, 1115783 operations)
>  INFO [ScheduledTasks:1] 2011-05-12 19:02:12,932 GCInspector.java (line 128) GC for ParNew: 249 ms, 571827736 reclaimed leaving 3165899760 used; max is 7774142464
>  INFO [ScheduledTasks:1] 2011-05-12 19:02:15,253 GCInspector.java (line 128) GC for ParNew: 341 ms, 561823592 reclaimed leaving 1764208800 used; max is 7774142464
>  INFO [FlushWriter:1] 2011-05-12 19:02:16,743 Memtable.java (line 165) Completed flushing /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-74-Data.db (53646223 bytes)
>  INFO [COMMIT-LOG-WRITER] 2011-05-12 19:02:16,745 CommitLog.java (line 440) Discarding obsolete commit log:CommitLogSegment(/mnt/scratch/jcipar/cassandra/commitlog/CommitLog-1305241306438.log)
>  INFO [ScheduledTasks:1] 2011-05-12 19:02:18,256 GCInspector.java (line 128) GC for ParNew: 305 ms, 544491840 reclaimed leaving 865198712 used; max is 7774142464
>  INFO [MutationStage:19] 2011-05-12 19:02:19,000 ColumnFamilyStore.java (line 1070) Enqueuing flush of Memtable-Standard1@479849353(51941121 bytes, 1115783 operations)
>  INFO [FlushWriter:1] 2011-05-12 19:02:19,000 Memtable.java (line 158) Writing Memtable-Standard1@479849353(51941121 bytes, 1115783 operations)
>  INFO [NonPeriodicTasks:1] 2011-05-12 19:02:19,310 SSTable.java (line 147) Deleted /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-51
>  INFO [NonPeriodicTasks:1] 2011-05-12 19:02:19,324 SSTable.java (line 147) Deleted /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-55
>  INFO [NonPeriodicTasks:1] 2011-05-12 19:02:19,339 SSTable.java (line 147) Deleted /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-58
>  INFO [NonPeriodicTasks:1] 2011-05-12 19:02:19,357 SSTable.java (line 147) Deleted /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-67
>  INFO [NonPeriodicTasks:1] 2011-05-12 19:02:19,377 SSTable.java (line 147) Deleted /mnt/scratch/jcipar/cassandra/data/Keyspace1/Standard1-f-61
>  INFO [main] 2011-05-12 19:02:21,026 AbstractCassandraDaemon.java (line 78) Logging initialized
>  INFO [main] 2011-05-12 19:02:21,040 AbstractCassandraDaemon.java (line 96) Heap size: 7634681856/7635730432
>  INFO [main] 2011-05-12 19:02:21,042 CLibrary.java (line 61) JNA not found. Native methods will be disabled.
>  INFO [main] 2011-05-12 19:02:21,052 DatabaseDescriptor.java (line 121) Loading settings from file:/h/jcipar/Projects/HP/OtherDBs/Cassandra/apache-cassandra-0.7.5/conf/cassandra.yaml
>  INFO [main] 2011-05-12 19:02:21,178 DatabaseDescriptor.java (line 181) DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
>  INFO [main] 2011-05-12 19:02:21,310 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/Schema-f-1
>  INFO [main] 2011-05-12 19:02:21,327 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/Schema-f-2
>  INFO [main] 2011-05-12 19:02:21,336 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/Migrations-f-1
>  INFO [main] 2011-05-12 19:02:21,337 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/Migrations-f-2
>  INFO [main] 2011-05-12 19:02:21,342 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/LocationInfo-f-2
>  INFO [main] 2011-05-12 19:02:21,344 SSTableReader.java (line 154) Opening /mnt/scratch/jcipar/cassandra/data/system/LocationInfo-f-1
>  INFO [main] 2011-05-12 19:02:21,379 DatabaseDescriptor.java (line 461) Loading schema version 9467ffe0-7cea-11e0-8ddc-f74ef74e382f
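
One way to get an overview of heap pressure from a log like the one above is to pull out the GCInspector lines and print the pause time and heap occupancy for each. The sketch below assumes the raw log file, with one entry per line and the timestamp in the fourth whitespace-separated field, as in the entries shown here. The function name gc_summary is made up for illustration.

```shell
# Hypothetical helper: summarize ParNew entries from a Cassandra system log.
# Prints the time of day, the GC pause in ms, and heap used as a percentage of max.
gc_summary() {
    awk '/GCInspector/ && /GC for ParNew:/ {
        for (i = 2; i < NF; i++) {
            if ($i == "ms,")   pause = $(i - 1)   # "<n> ms,"
            if ($i == "used;") used  = $(i - 1)   # "leaving <n> used;"
            if ($i == "is")    max   = $(i + 1)   # "max is <n>"
        }
        printf "%s pause_ms=%d used_pct=%.0f\n", $4, pause, 100 * used / max
    }' "$1"
}

# Usage (path is a placeholder):
#   gc_summary /path/to/system.log
```

For the first entry in the sample (5471432144 used out of 7774142464) this reports about 70% of the heap in use.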
> 

