cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Corentin Chary (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()
Date Tue, 11 Apr 2017 14:42:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964465#comment-15964465
] 

Corentin Chary commented on CASSANDRA-13432:
--------------------------------------------

I checked, 3.x has a different code to count tombstones so it's likely not affected

> MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13432
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13432
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Corentin Chary
>             Fix For: 2.1.x
>
>
> This might affect 3.x too, I'm not sure.
> {code}
> $ nodetool tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> MutationStage                     0         0       32135875         0              
  0
> ReadStage                       114         0       29492940         0              
  0
> RequestResponseStage              0         0       86090931         0              
  0
> ReadRepairStage                   0         0         166645         0              
  0
> CounterMutationStage              0         0              0         0              
  0
> MiscStage                         0         0              0         0              
  0
> HintedHandoff                     0         0             47         0              
  0
> GossipStage                       0         0         188769         0              
  0
> CacheCleanupExecutor              0         0              0         0              
  0
> InternalResponseStage             0         0              0         0              
  0
> CommitLogArchiver                 0         0              0         0              
  0
> CompactionExecutor                0         0          86835         0              
  0
> ValidationExecutor                0         0              0         0              
  0
> MigrationStage                    0         0              0         0              
  0                                    
> AntiEntropyStage                  0         0              0         0              
  0                                    
> PendingRangeCalculator            0         0             92         0              
  0                                    
> Sampler                           0         0              0         0              
  0                                    
> MemtableFlushWriter               0         0            563         0              
  0                                    
> MemtablePostFlush                 0         0           1500         0              
  0                                    
> MemtableReclaimMemory             1        29            534         0              
  0                                    
> Native-Transport-Requests        41         0       54819182         0              1896
                           
> {code}
> {code}
> "MemtableReclaimMemory:195" - Thread t@6268
>    java.lang.Thread.State: WAITING
> 	at sun.misc.Unsafe.park(Native Method)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> 	at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
> 	at org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417)
> 	at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> 	- locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "SharedPool-Worker-195" - Thread t@989
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690)
> 	at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650)
> 	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171)
> 	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143)
> 	at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240)
> 	at org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483)
> 	at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153)
> 	at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184)
> 	at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:156)
> 	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:263)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:108)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:82)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:316)
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2015)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1858)
> 	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353)
> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
> 	at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> 	at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> 	- None
> "SharedPool-Worker-206" - Thread t@1014
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690)
> 	at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650)
> 	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171)
> 	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143)
> 	at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240)
> 	at org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483)
> 	at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153)
> 	at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184)
> 	at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:156)
> 	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
> 	at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:89)
> 	at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:48)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:105)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:82)
> 	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
> 	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:316)
> 	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:2015)
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1858)
> 	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:353)
> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:85)
> 	at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> 	at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
> 	- None
> {code}
> As you can see MemtableReclaimMemory is waiting on the read barrier to be released, but
there are two queries currently being executed which are locking this.
> Since most of the time is spent pretty low in the stack, these read operations will never
timeout (they are reading rows with tons of tombstones).
> We also can easily detect or purge the offending line because there is no easy way to
find out which partition is currently being read.
> The TombstoneFailureThreshold should also protect us, but it is probably being checked
too high in the call stack.
> Looks like RangeTombstoneList or DeletionInfo should also check for DatabaseDescriptor.getTombstoneFailureThreshold()



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message