cassandra-commits mailing list archives

From "Michael Kjellman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-5736) CQL3PagingRecordReader can OOM and kill nodes
Date Tue, 09 Jul 2013 06:41:48 GMT
Michael Kjellman created CASSANDRA-5736:
-------------------------------------------

             Summary: CQL3PagingRecordReader can OOM and kill nodes
                 Key: CASSANDRA-5736
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5736
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
    Affects Versions: 1.2.6
            Reporter: Michael Kjellman


It looks like the CQL3PagingRecordReader can end up OOMing many nodes in a cluster. The
OOM/GC storm appears to come from ReadStage threads, which show the following stack:

{code}
org.apache.cassandra.db.marshal.DateType.compare(DateType.java:62)
org.apache.cassandra.db.marshal.DateType.compare(DateType.java:32)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.mergeHi(TimSort.java:806)
java.util.TimSort.mergeAt(TimSort.java:485)
java.util.TimSort.mergeForceCollapse(TimSort.java:426)
java.util.TimSort.sort(TimSort.java:223)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:181)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:106)
org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:79)
org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.ColumnFamilyStore$6.computeNext(ColumnFamilyStore.java:1432)
org.apache.cassandra.db.ColumnFamilyStore$6.computeNext(ColumnFamilyStore.java:1428)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1499)
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1476)
org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:58)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)
{code}

As best I can tell, this affects any row with more than roughly five tombstones and has
something to do with DeletionInfo trying to sort the results. The only way to recover was a
rolling restart of all of the nodes in the cluster, as the ReadStage threads appeared to be
making no progress (most likely due to GC).
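
For illustration only: the trace shows DeletionInfo.add reaching IntervalTree.build and
then Collections.sort. If that sort runs again for every tombstone added to a row (an
assumption drawn from the trace alone, not checked against the 1.2.6 source), the total
comparison work grows much faster than sorting once. The sketch below is not Cassandra
code; the class RebuildOnAddSketch, the Range type, and both methods are hypothetical
stand-ins that only count comparisons for the two strategies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RebuildOnAddSketch {
    // Global comparison counter, reset at the start of each measurement.
    static long comparisons = 0;

    // Hypothetical stand-in for a range tombstone; only its start bound is compared.
    static final class Range implements Comparable<Range> {
        final long start;

        Range(long start) {
            this.start = start;
        }

        @Override
        public int compareTo(Range other) {
            comparisons++;
            return Long.compare(start, other.start);
        }
    }

    // Assumed pattern: copy and re-sort the whole collection after every single add.
    static long rebuildOnEveryAdd(int n) {
        comparisons = 0;
        Random random = new Random(42); // fixed seed so runs are repeatable
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ranges.add(new Range(random.nextLong()));
            List<Range> rebuilt = new ArrayList<>(ranges);
            Collections.sort(rebuilt);
        }
        return comparisons;
    }

    // Baseline: collect all ranges first, then sort a single time.
    static long sortOnce(int n) {
        comparisons = 0;
        Random random = new Random(42); // same seed, so the same input data
        List<Range> ranges = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ranges.add(new Range(random.nextLong()));
        }
        Collections.sort(ranges);
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " ranges: rebuild-on-add=" + rebuildOnEveryAdd(n)
                    + " comparisons, sort-once=" + sortOnce(n) + " comparisons");
        }
    }
}
```

Running main prints the comparison counts for a few sizes. The gap between the two columns
widens as the number of ranges grows, which would be consistent with ReadStage threads
spending their time in TimSort, though it does not by itself explain the heap exhaustion.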

