cassandra-commits mailing list archives

From "Richard Low (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6818) SSTable references not released if stream session fails before it starts
Date Tue, 11 Mar 2014 10:19:55 GMT


Richard Low commented on CASSANDRA-6818:

I looked at the 1.2 patch and it looks fine. I'll see if I can reproduce the original issue to

In StreamInSession.get, there is a minor memory leak - if another thread simultaneously creates
the same session, the one that is discarded remains registered with the gossiper. This was
present before, but we could easily fix it in this patch by delaying the registration until
after the putIfAbsent succeeds.
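
A minimal sketch of that ordering, not the actual StreamInSession code. The Session class,
the long key and the registered counter (standing in for registration with the gossiper) are
all hypothetical names used only for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class SessionRegistry {
    /** Hypothetical stand-in for a stream-in session. */
    static final class Session {
        final long id;
        Session(long id) { this.id = id; }
    }

    private final ConcurrentMap<Long, Session> sessions = new ConcurrentHashMap<>();

    // Counts registrations; a stand-in for the gossiper's list of subscribers.
    final AtomicInteger registered = new AtomicInteger();

    private void register(Session s) {
        registered.incrementAndGet();
    }

    Session get(long id) {
        Session existing = sessions.get(id);
        if (existing != null)
            return existing;

        // Constructing the candidate has no side effects yet.
        Session candidate = new Session(id);
        Session winner = sessions.putIfAbsent(id, candidate);
        if (winner != null)
            return winner; // lost the race: the candidate was never registered, so nothing leaks

        // Only the session that was actually stored gets registered.
        register(candidate);
        return candidate;
    }

    public static void main(String[] args) {
        SessionRegistry registry = new SessionRegistry();
        Session a = registry.get(1L);
        Session b = registry.get(1L);
        System.out.println((a == b) + " " + registry.registered.get());
    }
}
```

With this ordering a discarded candidate is ordinary garbage, because nothing outside get()
ever held a reference to it.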

> SSTable references not released if stream session fails before it starts
> ------------------------------------------------------------------------
>                 Key: CASSANDRA-6818
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Richard Low
>            Assignee: Yuki Morishita
>             Fix For: 1.2.16, 2.0.7, 2.1 beta2
>         Attachments: 6818-1.2.txt, 6818-2.0.txt
> I observed a large number of 'orphan' SSTables - SSTables that are in the data directory
> but not loaded by Cassandra - on a 1.1.12 node that had a large stream fail before it started.
> These orphan files are particularly dangerous because if the node is restarted and picks up
> these SSTables it could bring data back to life if tombstones have been GCed. To confirm the
> SSTables are orphan, I created a snapshot and it didn't contain these files. I can see in
> the logs that they have been compacted so should have been deleted.
> The log entries for the stream are:
> {{INFO [StreamStage:1] 2014-02-21 19:41:48,742 (line 115) Beginning transfer
> to /}}
> {{INFO [StreamStage:1] 2014-02-21 19:41:48,743 (line 96) Flushing memtables
> for [CFS(Keyspace='ks', ColumnFamily='cf1'), CFS(Keyspace='ks', ColumnFamily='cf2')]...}}
> {{ERROR [GossipTasks:1] 2014-02-21 19:41:49,239 (line 113)
> Stream failed because / died or was restarted/removed (streams may still be active
> in background, but further streams won't be started)}}
> {{INFO [StreamStage:1] 2014-02-21 19:41:51,783 (line 161) Stream context
> metadata [...] 2267 sstables.}}
> {{INFO [StreamStage:1] 2014-02-21 19:41:51,789 (line 182) Streaming
> to /}}
> {{INFO [Streaming to /] 2014-02-21 19:42:02,218 (line 99)
> Found no stream out session at end of file stream task - this is expected if the receiver
> went down}}
> After digging in the code, here's what I think the issue is:
> 1. StreamOutSession.transferRanges() creates a streaming session, which is registered
> with the failure detector in AbstractStreamSession's constructor.
> 2. Memtables are flushed, potentially taking a long time.
> 3. The remote node fails, convict() is called and the StreamOutSession is closed. However,
> at this time StreamOutSession.files is empty because it's still waiting for the memtables
> to flush.
> 4. Memtables finish flushing, references are obtained to SSTables to be streamed and the
> PendingFiles are added to StreamOutSession.files.
> 5. The first stream fails but the StreamOutSession isn't found, so it is never closed and
> the references are never released.
> This code is more or less the same on 1.2 so I would expect it to reproduce there. I
> looked at 2.0 and can't even see where SSTable references are released when the stream fails.
> Some possible fixes for 1.1/1.2:
> 1. Don't register with the failure detector until after the PendingFiles are set up.
> I think this is the behaviour in 2.0 but I don't know if it was done like this to avoid this
> 2. Detect the above case in (e.g.) StreamOutSession.begin() by noticing the session has
> been closed, taking care to avoid double frees.
> 3. Add some synchronization so closeInternal() doesn't race with setting up the session.
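
As a rough illustration of how options 2 and 3 could combine, here is a sketch under stated
assumptions; it is not the attached patch and not the real StreamOutSession. GuardedSession,
Table, addFiles() and the closed flag are hypothetical names. Closing and attaching files share
one lock, files offered after a close take no references, and close() is idempotent so nothing
is released twice:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedSession {
    /** Hypothetical stand-in for an SSTable with a reference count. */
    static final class Table {
        final AtomicInteger refs = new AtomicInteger();
        void acquire() { refs.incrementAndGet(); }
        void release() { refs.decrementAndGet(); }
    }

    private final List<Table> files = new ArrayList<>();
    private boolean closed;

    /** Takes a reference on each table and attaches it; returns false if already closed. */
    synchronized boolean addFiles(List<Table> tables) {
        if (closed)
            return false; // session was closed during the flush: take no references
        for (Table t : tables) {
            t.acquire();
            files.add(t);
        }
        return true;
    }

    /** Idempotent close: each held reference is released exactly once. */
    synchronized void close() {
        if (closed)
            return;
        closed = true;
        for (Table t : files)
            t.release();
        files.clear();
    }

    public static void main(String[] args) {
        Table table = new Table();
        GuardedSession session = new GuardedSession();
        session.close();                                  // remote node convicted mid-flush
        boolean attached = session.addFiles(Arrays.asList(table)); // flush finishes afterwards
        System.out.println(attached + " " + table.refs.get());
    }
}
```

The main method replays the failing order from the description: the close arrives first, the
files arrive second, and no reference is left held.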

This message was sent by Atlassian JIRA
