cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming
Date Fri, 03 Jun 2016 22:31:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15315001#comment-15315001
] 

Paulo Motta commented on CASSANDRA-11345:
-----------------------------------------

While the root cause of the reported issue is CASSANDRA-11840, as a consequence of that the
stream session failed while an sstable was being transferred, making its reference be released
by {{StreamTransferTask.abort()}} -> {{OutgoingFileMessage.complete()}} before the transfer
was complete, potentially causing the subsequent {{Memory was freed}} error if the sstable
is garbage collected.

The ideal solution is CASSANDRA-11956 (to interrupt the stream transfer task right away),
but while that is not in place we should release the sstable reference only after the ongoing
transfer is finished (even if the session is aborted). So the idea of the patch is to set
a {{transferring}} flag on {{OutgoingFileMessage}} while the file is being transferred, and
if the session is failed before that the reference is not released, only after the {{transferring}}
flag is unset at the end of the file transfer. I added a unit test that verifies that an sstable
is not unreferenced if it's being transferred and the stream session fails, only after the
transfer is finished.

I also added a ninja fix to cache the {{FileMessageHeader}} size to avoid paying a high cost
of recalculation of the size of large compressed files after the file is transferred on [StreamSession.fileSent(FileMessageHeader)|https://github.com/apache/cassandra/blob/9c8ee4c73f4e4e8d5b7693d34a0d6d6397418e90/src/java/org/apache/cassandra/streaming/StreamSession.java#L574]

Patch and tests available below:
||2.1||2.2||3.0||3.7||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-11345]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-11345]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-11345]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.7...pauloricardomg:3.7-11345]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-11345]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-11345-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11345-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11345-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.7-11345-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11345-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-11345-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11345-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11345-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.7-11345-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11345-dtest/lastCompletedBuild/testReport/]|

Would you mind reviewing [~yukim]? Thanks in advance!

commit info: minor conflicts all the way up to 3.7 that merges cleanly to trunk

> Assertion Errors "Memory was freed" during streaming
> ----------------------------------------------------
>
>                 Key: CASSANDRA-11345
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11345
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>            Reporter: Jean-Francois Gosselin
>            Assignee: Paulo Motta
>
> We encountered the following AssertionError (twice on the same node) during a repair
:
> On node /172.16.63.41
> {noformat}
> INFO  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180
- [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete
                                                                                         
             
> WARN  [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207
- [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed                           
> ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream
#f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred                    
> java.lang.AssertionError: Memory was freed                                          
                                                                                         

>         at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13]
                                                  
>         at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13]
                                                             
>         at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247)
~[apache-cassandra-2.1.13.jar:2.1.13]                
>         at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112)
~[apache-cassandra-2.1.13.jar:2.1.13]                                
>         at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546)
~[apache-cassandra-2.1.13.jar:2.1.13]                                             
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
~[apache-cassandra-2.1.13.jar:2.1.13]                      
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
~[apache-cassandra-2.1.13.jar:2.1.13]                      
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
~[apache-cassandra-2.1.13.jar:2.1.13]                                    
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
~[apache-cassandra-2.1.13.jar:2.1.13]           
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
~[apache-cassandra-2.1.13.jar:2.1.13]                   
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]                      
                                                                                         

> {noformat}     
> On node /10.174.216.160
>  
> {noformat}       
> ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream
#f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred                          
> java.io.IOException: Connection reset by peer                                       
                                                                                      
>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]       
                                                                                      
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
                                                                                     
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]   
                                                                                      
>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]                   
                                                                                      
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]
                                                                                  
>         at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
~[apache-cassandra-2.1.13.jar:2.1.13]                     
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
~[apache-cassandra-2.1.13.jar:2.1.13]                                
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
[apache-cassandra-2.1.13.jar:2.1.13]        
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
[apache-cassandra-2.1.13.jar:2.1.13]                
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]                      
                                                                                      
> INFO  [STREAM-IN-/172.16.63.41] 2016-03-09 02:38:14,142 StreamResultFuture.java:180 -
[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /172.16.63.41 is complete
> WARN  [STREAM-IN-/172.16.63.41] 2016-03-09 02:38:14,142 StreamResultFuture.java:207 -
[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed                         
> ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,143 StreamSession.java:505 - [Stream
#f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred                  
> java.io.IOException: Broken pipe                                                    
                                                                                      
>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65]       
                                                                                      
>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65]
                                                                                     
>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65]   
                                                                                      
>         at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65]                   
                                                                                      
>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65]
                                                                                  
>         at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
~[apache-cassandra-2.1.13.jar:2.1.13]                     
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
~[apache-cassandra-2.1.13.jar:2.1.13]                                
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
[apache-cassandra-2.1.13.jar:2.1.13]        
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
[apache-cassandra-2.1.13.jar:2.1.13]                
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]     
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message