cassandra-commits mailing list archives

From "Yuki Morishita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-3306) Error in LeveledCompactionStrategy
Date Wed, 24 Oct 2012 22:02:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483641#comment-13483641 ]

Yuki Morishita commented on CASSANDRA-3306:
-------------------------------------------

bq. But we close the session on convict, so shouldn't it start a new one?

Yes, the StreamInSession gets closed and removed on convict _once_. But if a GC pause happens in
the middle of a streaming session, the node resumes streaming in the same session after the GC.
Since the resumed stream carries a session ID that was already closed on the receiver side, a
StreamInSession is created again with the same old session ID, this time with just one file to
receive. This repeats until the source node's StreamOutSession has sent all of its files.
You can see this in the receiver's log, like below:

{code}
INFO [Thread-50] 2012-10-20 13:13:26,574 StreamInSession.java (line 214) Finished streaming session 10 from /10.xx.xx.xx
INFO [Thread-51] 2012-10-20 13:13:29,691 StreamInSession.java (line 214) Finished streaming session 10 from /10.xx.xx.xx
INFO [Thread-52] 2012-10-20 13:13:32,957 StreamInSession.java (line 214) Finished streaming session 10 from /10.xx.xx.xx
{code}
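
To make that mechanism more concrete, here is a minimal sketch of the receiver-side behavior as I
understand it. The types and method names below are my own simplification, not the actual
StreamInSession code: sessions are looked up (and created on demand) by host and session ID, so
once the original session has been removed on convict, every resumed file that still carries the
old ID spawns a fresh session under that same ID.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification, not Cassandra's actual streaming code.
class IncomingSessionRegistry
{
    static class Session
    {
        final String key;
        int filesReceived = 0;
        Session(String key) { this.key = key; }
    }

    private final Map<String, Session> sessions = new HashMap<String, Session>();

    // Called for every incoming file: the session is created on demand, so a file
    // from a resumed stream whose session was already removed on convict gets a
    // brand-new Session under the same old ID.
    Session getOrCreate(String host, long sessionId)
    {
        String key = host + ":" + sessionId;
        Session session = sessions.get(key);
        if (session == null)
        {
            session = new Session(key);
            sessions.put(key, session);
        }
        return session;
    }

    // Called on convict and after sending SESSION_FINISHED.
    void remove(String host, long sessionId)
    {
        sessions.remove(host + ":" + sessionId);
    }
}
{code}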

Duplication happens during this partially broken streaming session. Because the StreamInSession
is removed after sending the SESSION_FINISHED reply while the StreamOutSession keeps sending files,
sometimes the same StreamInSession instance receives more than one file and calls closeIfFinished
every time a file is received.
(Sorry, this is hard to explain in words.
https://github.com/apache/cassandra/blob/cassandra-1.1.6/src/java/org/apache/cassandra/streaming/StreamInSession.java#L181
This part is executed multiple times, with _readers_ growing by one for each newly received file.)
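
As a rough illustration of why that produces duplicates (again just a sketch with placeholder
types, not the real DataTracker/View code), suppose each closeIfFinished run hands the whole
accumulated readers list to a List-backed collection:

{code}
import java.util.ArrayList;
import java.util.List;

// Sketch only: strings stand in for SSTableReader references.
public class DuplicateReferenceSketch
{
    public static void main(String[] args)
    {
        List<String> sstables = new ArrayList<String>(); // stands in for the List-backed view
        List<String> readers = new ArrayList<String>();  // grows by one per received file

        // closeIfFinished effectively runs once per received file,
        // each time passing along the whole accumulated readers list.
        for (int file = 1; file <= 3; file++)
        {
            readers.add("sstable-" + file);
            sstables.addAll(readers);
        }

        // Prints sstable-1 three times and sstable-2 twice; a Set-backed
        // collection would have collapsed them to three unique entries.
        System.out.println(sstables);
    }
}
{code}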

So, as Sylvain stated above, changing DataTracker.View's sstables to a Set is one way to eliminate
the duplicate references, and we should do it. In addition, I'm thinking of avoiding the creation
of duplicate StreamInSessions by checking StreamHeader.pendingFiles, because that field is only
filled when initiating streaming.
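
Roughly, the two directions look like this (again a sketch; the field and method names are
placeholders, not the exact Cassandra API):

{code}
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch only: placeholder types standing in for the real classes.
class ProposedFixSketch
{
    // (1) Back the view's sstables with a Set so the same reference
    //     added twice collapses to a single entry.
    final Set<String> sstables = new LinkedHashSet<String>();

    // (2) Only create a StreamInSession for headers that actually initiate
    //     streaming, i.e. whose pendingFiles is non-empty; files resumed
    //     after the session was removed then no longer spawn a new session.
    void maybeCreateSession(String host, long sessionId, List<String> pendingFiles)
    {
        if (pendingFiles == null || pendingFiles.isEmpty())
            return; // resumed/stray file: do not create a new session
        // ... create and register the incoming session here ...
    }
}
{code}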
                
> Error in LeveledCompactionStrategy
> ----------------------------------
>
>                 Key: CASSANDRA-3306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3306
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Radim Kolar
>            Assignee: Yuki Morishita
>         Attachments: 0001-CASSANDRA-3306-test.patch
>
>
> During stress testing, I always get this error, making LeveledCompactionStrategy unusable. Should be easy to reproduce - just write fast.
> ERROR [CompactionExecutor:6] 2011-10-04 15:48:52,179 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:6,5,main]
> java.lang.AssertionError
> 	at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:580)
> 	at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
> 	at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268)
> 	at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232)
> 	at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960)
> 	at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199)
> 	at org.apache.cassandra.db.compaction.LeveledCompactionTask.execute(LeveledCompactionTask.java:47)
> 	at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131)
> 	at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> and this is in json data for table:
> {
>   "generations" : [ {
>     "generation" : 0,
>     "members" : [ 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472,
473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484 ]
>   }, {
>     "generation" : 1,
>     "members" : [ ]
>   }, {
>     "generation" : 2,
>     "members" : [ ]
>   }, {
>     "generation" : 3,
>     "members" : [ ]
>   }, {
>     "generation" : 4,
>     "members" : [ ]
>   }, {
>     "generation" : 5,
>     "members" : [ ]
>   }, {
>     "generation" : 6,
>     "members" : [ ]
>   }, {
>     "generation" : 7,
>     "members" : [ ]
>   } ]
> }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
