incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Timo Nentwig <timo.nent...@toptarif.de>
Subject Re: Memory leak with Sun Java 1.6 ?
Date Tue, 14 Dec 2010 14:31:20 GMT

On Dec 14, 2010, at 14:41, Jonathan Ellis wrote:

> This is "A row has grown too large" section from that troubleshooting guide.

Why? This is what a typical "row" (?) looks like:

[default@test] list tracking limit 1;
-------------------
RowKey:  123
=> (column=key, value=foo, timestamp=1292238005886000)
=> (column=value, value=bar, timestamp=1292238005886000)

I'm importing the data from a RDBMS so I can say for sure that none of the 2 columns will
contain more than 255 chars.

1 Row Returned.

That row warning seems to be a 0.6 option, don't find it in 0.7 docs. Didn't find any related
WARNs for some default value in the log also.

> On Tue, Dec 14, 2010 at 5:27 AM, Timo Nentwig <timo.nentwig@toptarif.de> wrote:
> 
> On Dec 12, 2010, at 17:21, Jonathan Ellis wrote:
> 
> > http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
> 
> I can rule out the first 3. I was running cassandra with default settings, i.e. 1GB heap
and 256M memtable. So, with 3 memtables+1GB the JVM should run with >1.75G (although http://wiki.apache.org/cassandra/MemtableThresholds
considers to increase heap size only gently). Did so. 4GB machine with 2GB 64bit-JVM seemed
to run stable for quite some time but then also crashed with OOM. Looking at the heap dump
it's always the same: all memory nearly always bound in CompactionExecutor (ColumnFamilyStore/ConcurrentSkipListMap,
respectively).
> 
> This looks like somebody else recently have had a similar problem (->Bottom line:
more heap - which is okay, but I'd like to understand why):
> http://www.mail-archive.com/user@cassandra.apache.org/msg07516.html
> 
> This is my only CF currently in use (via JMX):
> 
> - column_families:
>  - column_type: Standard
>    comment: tracking column family
>    compare_with: org.apache.cassandra.db.marshal.UTF8Type
>    default_validation_class: org.apache.cassandra.db.marshal.UTF8Type
>    gc_grace_seconds: 864000
>    key_cache_save_period_in_seconds: 3600
>    keys_cached: 200000.0
>    max_compaction_threshold: 32
>    memtable_flush_after_mins: 60
>    min_compaction_threshold: 4
>    name: tracking
>    read_repair_chance: 1.0
>    row_cache_save_period_in_seconds: 0
>    rows_cached: 0.0
>  name: test
>  replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
>  replication_factor: 3
> 
> 
> In addition...actually there is plenty of free memory on the heap (?):
> 
> 3605.478: [GC 3605.478: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     416112 bytes,     416112 total
> : 16887K->553K(38336K), 0.0209550 secs]3605.499: [CMS: 1145267K->447565K(2054592K),
1.9143630 secs] 1161938K->447565K(2092928K), [CMS Perm : 18186K->18158K(30472K)], 1.9355340
secs] [Times: user=1.95 sys=0.00, real=1.94 secs]
> 3607.414: [Full GC 3607.414: [CMS: 447565K->447453K(2054592K), 1.9694960 secs] 447565K->447453K(2092928K),
[CMS Perm : 18158K->18025K(30472K)], 1.9696450 secs] [Times: user=1.92 sys=0.00, real=1.97
secs]
> Total time for which application threads were stopped: 3.9070380 seconds
> Total time for which application threads were stopped: 7.3388640 seconds
> Total time for which application threads were stopped: 0.0560610 seconds
> 3616.931: [GC 3616.931: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:     474264 bytes,     474264 total
> : 34112K->747K(38336K), 0.0098680 secs] 481565K->448201K(2092928K), 0.0099690 secs]
[Times: user=0.00 sys=0.00, real=0.01 secs]
> Total time for which application threads were stopped: 0.0108670 seconds
> 3617.035: [GC 3617.035: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      63040 bytes,      63040 total
> : 34859K->440K(38336K), 0.0065950 secs] 482313K->448455K(2092928K), 0.0066880 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0075850 seconds
> 3617.133: [GC 3617.133: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      23016 bytes,      23016 total
> : 34552K->121K(38336K), 0.0042920 secs] 482567K->448193K(2092928K), 0.0043650 secs]
[Times: user=0.01 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049630 seconds
> 3617.228: [GC 3617.228: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      16992 bytes,      16992 total
> : 34233K->34K(38336K), 0.0043180 secs] 482305K->448122K(2092928K), 0.0043910 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0049150 seconds
> 3617.323: [GC 3617.323: [ParNew
> Desired survivor size 2162688 bytes, new threshold 1 (max 1)
> - age   1:      18456 bytes,      18456 total
> : 34146K->29K(38336K), 0.0038930 secs] 482234K->448127K(2092928K), 0.0039810 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
> Total time for which application threads were stopped: 0.0055390 seconds
> Heap
>  par new generation   total 38336K, used 17865K [0x000000077ae00000, 0x000000077d790000,
0x000000077d790000)
>  eden space 34112K,  52% used [0x000000077ae00000, 0x000000077bf6afb0, 0x000000077cf50000)
>  from space 4224K,   0% used [0x000000077cf50000, 0x000000077cf57720, 0x000000077d370000)
>  to   space 4224K,   0% used [0x000000077d370000, 0x000000077d370000, 0x000000077d790000)
>  concurrent mark-sweep generation total 2054592K, used 448097K [0x000000077d790000, 0x00000007fae00000,
0x00000007fae00000)
>  concurrent-mark-sweep perm gen total 30472K, used 18125K [0x00000007fae00000, 0x00000007fcbc2000,
0x0000000800000000)
> 
> >
> > On Sun, Dec 12, 2010 at 9:52 AM, Timo Nentwig <timo.nentwig@toptarif.de> wrote:
> >
> > On Dec 10, 2010, at 19:37, Peter Schuller wrote:
> >
> > > To cargo cult it: Are you running a modern JVM? (Not e.g. openjdk b17
> > > in lenny or some such.) If it is a JVM issue, ensuring you're using a
> > > reasonably recent JVM is probably much easier than to start tracking
> > > it down...
> >
> > I had OOM problems with OpenJDK, switched to Sun/Oracle's recent 1.6.0_23 and...still
have the same problem :-\ Stack trace always looks the same:
> >
> > java.lang.OutOfMemoryError: Java heap space
> >        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> >        at java.nio.ByteBuffer.allocate(ByteBuffer.java:329)
> >        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:261)
> >        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
> >        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
> >        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
> >        at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
> >        at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:383)
> >        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:393)
> >        at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:351)
> >        at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:52)
> >        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:63)
> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >        at java.lang.Thread.run(Thread.java:636)
> >
> > I'm writing from 1 client with 50 threads to a cluster of 4 machines (with hector).
With QUORUM and ONE 2 machines quite reliably will soon die with OOM. What may cause this?
Won't cassandra block/reject when memtable is full and being flushed to disk but grow and
if flushing to disk isn't fast enough will run out of memory?
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com


Mime
View raw message