cassandra-user mailing list archives

From Jason Tang <ares.t...@gmail.com>
Subject Re: Cassandra take 100% CPU for 2~3 minutes every half an hour and mutation lost
Date Fri, 13 Jul 2012 03:28:11 GMT
Hi

After changing the concurrent compactors setting (concurrent_compactors: 1),
we can limit Cassandra to 100% of one core during those periods.

I also captured the stack of the "crazy" thread. It stays on the same stack
for the whole 2~3 minutes.

Any clue about this issue?

Thread 18114: (state = IN_JAVA)

 - java.util.AbstractList$Itr.hasNext() @bci=8, line=339 (Compiled frame;
information may be imprecise)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeletedStandard(org.apache.cassandra.db.ColumnFamily,
int) @bci=6, line=841 (Compiled frame)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeletedColumnsOnly(org.apache.cassandra.db.ColumnFamily,
int) @bci=17, line=835 (Compiled frame)

 -
org.apache.cassandra.db.ColumnFamilyStore.removeDeleted(org.apache.cassandra.db.ColumnFamily,
int) @bci=8, line=826 (Compiled frame)

 -
org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(org.apache.cassandra.db.DecoratedKey,
org.apache.cassandra.db.compaction.CompactionController,
org.apache.cassandra.db.ColumnFamily) @bci=38, line=77 (Compiled frame)

 -
org.apache.cassandra.db.compaction.PrecompactedRow.<init>(org.apache.cassandra.db.compaction.CompactionController,
java.util.List) @bci=33, line=102 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(java.util.List)
@bci=223, line=133 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced()
@bci=44, line=102 (Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced()
@bci=1, line=87 (Compiled frame)

 - org.apache.cassandra.utils.MergeIterator$ManyToOne.consume() @bci=88,
line=116 (Compiled frame)

 - org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext() @bci=5,
line=99 (Compiled frame)

 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
line=140 (Compiled frame)

 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
(Compiled frame)

 - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=614
(Compiled frame)

 - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
line=140 (Compiled frame)

 - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
(Compiled frame)

 -
org.apache.cassandra.db.compaction.CompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
@bci=542, line=141 (Compiled frame)

 - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=117,
line=134 (Interpreted frame)

 - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=1,
line=114 (Interpreted frame)

 - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303
(Interpreted frame)

 - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Interpreted
frame)

 -
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
@bci=59, line=886 (Compiled frame)

 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908
(Compiled frame)

 - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)



BRs

//Jason
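
To check whether a thread really stays on the same stack between two dumps, the comparison can be scripted. Below is a minimal Python sketch, assuming the dump format shown above ("Thread <id>: (state = <STATE>)" followed by " - <frame>" lines) with each frame on a single line, not wrapped as in this mail. The names top_frames and stuck_threads are hypothetical helpers, not part of Cassandra or the JDK.

```python
import re

# "Thread 18114: (state = IN_JAVA)"
THREAD_RE = re.compile(r"^Thread (\d+): \(state = (\w+)\)")
# " - java.util.AbstractList$Itr.hasNext() @bci=8, line=339 (...)"
FRAME_RE = re.compile(r"^\s*-\s+([\w.$<>]+)\(")


def top_frames(dump):
    """Map thread id -> (state, innermost frame) for one dump."""
    result = {}
    current = None  # (thread id, state) of the thread being read
    for line in dump.splitlines():
        thread = THREAD_RE.match(line)
        if thread:
            current = (int(thread.group(1)), thread.group(2))
            continue
        frame = FRAME_RE.match(line)
        # Only the first frame after a thread header is the innermost one.
        if frame and current is not None and current[0] not in result:
            result[current[0]] = (current[1], frame.group(1))
    return result


def stuck_threads(dump_a, dump_b):
    """Thread ids that are IN_JAVA with the same innermost frame in both dumps."""
    a, b = top_frames(dump_a), top_frames(dump_b)
    return sorted(
        tid for tid in a
        if tid in b and a[tid] == b[tid] and a[tid][0] == "IN_JAVA"
    )
```

Taking two dumps about a minute apart and passing both to stuck_threads would list the threads that did not leave their innermost frame. A matching innermost frame only suggests a hot loop; it does not prove the thread made no progress in between.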



2012/7/11 Jason Tang <ares.tang@gmail.com>

> Hi
>
>     I am seeing a high CPU problem on Cassandra 1.0.3. It happens with both
> size-tiered and leveled compaction, with a 6G heap on 64-bit Oracle Java.
> Under normal traffic, Cassandra uses 15% CPU.
>
>     But every half hour, Cassandra uses almost 100% of total CPU (SUSE,
> 12 cores).
>
>     Here is the top output at that moment.
>
> #top -H -p 12451
>
> top - 12:30:14 up 15 days, 12:49,  6 users,  load average: 10.52, 8.92,
> 8.14
> Tasks: 706 total,  21 running, 685 sleeping,   0 stopped,   0 zombie
> Cpu(s): 25.7%us, 14.0%sy, 48.9%ni,  6.5%id,  0.0%wa,  0.0%hi,  4.9%si,
>  0.0%st
> Mem:     24150M total,    12218M used,    11932M free,      142M buffers
> Swap:        0M total,        0M used,        0M free,     3714M cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 20291 casadm    24   4 8003m 5.4g 167m R   92 22.7   0:42.46 java
> 20276 casadm    24   4 8003m 5.4g 167m R   88 22.7   0:43.88 java
> 20181 casadm    24   4 8003m 5.4g 167m R   86 22.7   0:52.97 java
> 20213 casadm    24   4 8003m 5.4g 167m R   85 22.7   0:49.21 java
> 20188 casadm    24   4 8003m 5.4g 167m R   82 22.7   0:54.34 java
> 20268 casadm    24   4 8003m 5.4g 167m R   81 22.7   0:46.25 java
> 20269 casadm    24   4 8003m 5.4g 167m R   41 22.7   0:15.11 java
> 20316 casadm    24   4 8003m 5.4g 167m S   20 22.7   0:02.35 java
> 20191 casadm    24   4 8003m 5.4g 167m R   15 22.7   0:16.85 java
> 12500 casadm    20   0 8003m 5.4g 167m R    6 22.7   1:07.86 java
> 15245 casadm    20   0 8003m 5.4g 167m D    5 22.7   0:36.45 java
>
>     jstack cannot print the stack at that moment:
> Thread 20291: (state = IN_JAVA)
> Error occurred during stack walking:
> ...
> Thread 20276: (state = IN_JAVA)
> Error occurred during stack walking:
>
>     After it comes back, the stack shows:
> Thread 20291: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object,
> long) @bci=20, line=196 (Compiled frame)
>  -
> java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode,
> boolean, long) @bci=174, line=424 (Compiled frame)
>  -
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object,
> boolean, long) @bci=102, line=323 (Compiled frame)
>  - java.util.concurrent.SynchronousQueue.poll(long,
> java.util.concurrent.TimeUnit) @bci=11, line=874 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=62, line=945
> (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=18, line=907
> (Compiled frame)
>  - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)
>
>     After this happens, the data is not correct: some large columns
> which are supposed to be deleted come back again.
>     Here is the suspect thread while it is using 100%:
> Thread 20191: (state = IN_VM)
>  - sun.misc.Unsafe.unpark(java.lang.Object) @bci=0 (Compiled frame;
> information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.unpark(java.lang.Thread) @bci=8,
> line=122 (Compiled frame)
>  -
> java.util.concurrent.SynchronousQueue$TransferStack$SNode.tryMatch(java.util.concurrent.SynchronousQueue$TransferStack$SNode)
> @bci=34, line=242 (Compiled frame)
>  -
> java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object,
> boolean, long) @bci=268, line=344 (Compiled frame)
>  - java.util.concurrent.SynchronousQueue.offer(java.lang.Object) @bci=19,
> line=846 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor.execute(java.lang.Runnable)
> @bci=43, line=653 (Compiled frame)
>  -
> java.util.concurrent.AbstractExecutorService.submit(java.util.concurrent.Callable)
> @bci=20, line=92 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getCompactedRow(java.util.List)
> @bci=86, line=190 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getReduced()
> @bci=31, line=164 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Reducer.getReduced()
> @bci=1, line=144 (Compiled frame)
>  - org.apache.cassandra.utils.MergeIterator$ManyToOne.consume() @bci=88,
> line=116 (Compiled frame)
>  - org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext()
> @bci=5, line=99 (Compiled frame)
>  - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
> line=140 (Compiled frame)
>  - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
> (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext()
> @bci=4, line=103 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext()
> @bci=1, line=90 (Compiled frame)
>  - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
> line=140 (Compiled frame)
>  - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
> (Compiled frame)
>  - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=614
> (Compiled frame)
>  - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9,
> line=140 (Compiled frame)
>  - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=135
> (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.CompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
> @bci=772, line=172 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
> @bci=2, line=57 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=117,
> line=134 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=1,
> line=114 (Interpreted frame)
>  - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303
> (Compiled frame)
>  - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Compiled frame)
>  -
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
> @bci=59, line=886 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908
> (Compiled frame)
>  - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)
>
> Thread 20269: (state = BLOCKED)
>  - org.apache.cassandra.utils.obs.OpenBitSet.<init>(long, boolean)
> @bci=51, line=104 (Compiled frame)
>  - org.apache.cassandra.utils.obs.OpenBitSet.<init>(long) @bci=3, line=92
> (Compiled frame)
>  - org.apache.cassandra.utils.BloomFilter.bucketsFor(long, int) @bci=12,
> line=54 (Compiled frame)
>  - org.apache.cassandra.utils.BloomFilter.getFilter(long, int) @bci=110,
> line=73 (Compiled frame)
>  -
> org.apache.cassandra.db.ColumnIndexer.serialize(org.apache.cassandra.io.util.IIterableColumns)
> @bci=10, line=83 (Compiled frame)
>  -
> org.apache.cassandra.db.ColumnIndexer.serialize(org.apache.cassandra.io.util.IIterableColumns,
> java.io.DataOutput) @bci=5, line=51 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.PrecompactedRow.write(java.io.DataOutput)
> @bci=42, line=140 (Compiled frame)
>  -
> org.apache.cassandra.io.sstable.SSTableWriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow)
> @bci=43, line=160 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.CompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
> @bci=685, line=158 (Compiled frame)
>  -
> org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector)
> @bci=2, line=57 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=117,
> line=134 (Interpreted frame)
>  - org.apache.cassandra.db.compaction.CompactionManager$1.call() @bci=1,
> line=114 (Interpreted frame)
>  - java.util.concurrent.FutureTask$Sync.innerRun() @bci=30, line=303
> (Compiled frame)
>  - java.util.concurrent.FutureTask.run() @bci=4, line=138 (Compiled frame)
>  -
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
> @bci=59, line=886 (Compiled frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=28, line=908
> (Compiled frame)
>  - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)
>
>
> BRs
> //Tang Weiqiang
>
>
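
The step from the top -H listing to the per-thread stacks can be scripted. Below is a minimal Python sketch, assuming the 12-column row layout shown in the quoted top output (thread id first, %CPU in the ninth column). The name busiest_threads and the 50% threshold are hypothetical choices for illustration. The hex value is included because a plain jstack dump labels threads with nid=0x..., while the dumps in this thread print the decimal id directly.

```python
def busiest_threads(top_output, min_cpu=50.0):
    """Return (thread id, %CPU, hex id) for busy threads, busiest first."""
    rows = []
    for line in top_output.splitlines():
        fields = line.split()
        # Thread rows have 12 columns and start with a numeric id;
        # the summary and header lines of top do not.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        tid = int(fields[0])
        cpu = float(fields[8])  # ninth column is %CPU in this layout
        if cpu >= min_cpu:
            rows.append((tid, cpu, hex(tid)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Each returned id can then be looked up in the stack dump taken at the same moment, either by its decimal value or by the hex form, depending on how the dump was produced.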
