cassandra-commits mailing list archives

From "Michael Kjellman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5736) CQL3PagingRecordReader can OOM and kill nodes
Date Tue, 09 Jul 2013 07:07:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703000#comment-13703000 ]

Michael Kjellman commented on CASSANDRA-5736:
---------------------------------------------

I think this is actually a manifestation of CASSANDRA-5677.
                
> CQL3PagingRecordReader can OOM and kill nodes
> ---------------------------------------------
>
>                 Key: CASSANDRA-5736
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5736
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.2.6
>            Reporter: Michael Kjellman
>
> It looks like the CQL3PagingRecordReader can end up OOMing many nodes in a cluster, with
> the OOM/GC storm originating in ReadStage.
> This is the stack trace from all of the ReadStage threads:
> {code}
> org.apache.cassandra.db.marshal.DateType.compare(DateType.java:62)
> org.apache.cassandra.db.marshal.DateType.compare(DateType.java:32)
> org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
> java.util.TimSort.mergeHi(TimSort.java:806)
> java.util.TimSort.mergeAt(TimSort.java:485)
> java.util.TimSort.mergeForceCollapse(TimSort.java:426)
> java.util.TimSort.sort(TimSort.java:223)
> java.util.TimSort.sort(TimSort.java:173)
> java.util.Arrays.sort(Arrays.java:659)
> java.util.Collections.sort(Collections.java:217)
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:181)
> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
> org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
> org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
> org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:106)
> org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:79)
> org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> org.apache.cassandra.db.ColumnFamilyStore$6.computeNext(ColumnFamilyStore.java:1432)
> org.apache.cassandra.db.ColumnFamilyStore$6.computeNext(ColumnFamilyStore.java:1428)
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1499)
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1476)
> org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
> org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:58)
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> java.lang.Thread.run(Thread.java:722)
> {code}
> As best I can tell, this is related to any row with more than 5 or so tombstones and has
> something to do with DeletionInfo trying to sort the results. The only way to fix this was a
> rolling restart of all of the nodes in the cluster, as the ReadStage threads appeared to be
> making no progress (most likely due to GC).
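
The stack trace shows DeletionInfo.add calling IntervalTree.build, which ends in Collections.sort. If each add rebuilds and re-sorts everything accumulated so far (an assumption read off the trace, not confirmed against the Cassandra source), the comparison count grows roughly quadratically with the number of tombstones in a row. The sketch below is a toy model of that cost only; RebuildCostSketch, its methods, and the use of a bare Long as a stand-in for a tombstone bound are all hypothetical and are not Cassandra code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class RebuildCostSketch {
    // Counts comparator invocations; stands in for the DateType.compare frames in the trace.
    static long comparisons;

    static final Comparator<Long> COUNTING = (a, b) -> {
        comparisons++;
        return Long.compare(a, b);
    };

    // Assumed behaviour: every add copies all bounds seen so far and re-sorts them.
    static long comparisonsWhenRebuilding(List<Long> starts) {
        comparisons = 0;
        List<Long> current = new ArrayList<>();
        for (Long start : starts) {
            current.add(start);
            List<Long> rebuilt = new ArrayList<>(current);
            Collections.sort(rebuilt, COUNTING);
        }
        return comparisons;
    }

    // Baseline for contrast: collect all bounds first, then sort a single time.
    static long comparisonsWhenSortingOnce(List<Long> starts) {
        comparisons = 0;
        List<Long> all = new ArrayList<>(starts);
        Collections.sort(all, COUNTING);
        return comparisons;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        List<Long> starts = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            starts.add(random.nextLong());
        }
        System.out.println("rebuild per add: " + comparisonsWhenRebuilding(starts));
        System.out.println("sort once:       " + comparisonsWhenSortingOnce(starts));
    }
}
```

Sorting k elements takes at least k - 1 comparisons, so the rebuild-per-add path performs at least n(n - 1)/2 comparisons for n tombstones, while the single sort stays near n log n. Whether this matches what DeletionInfo actually did in 1.2.6 would need to be checked against the source.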

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
