cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-9423) Improve Leak Detection to cover strong reference leaks
Date Fri, 26 Jun 2015 12:04:04 GMT


Benedict commented on CASSANDRA-9423:

So, it looks like we were already creating strong circular reference leaks in 2.1, severely
damaging the utility of the leak detection. I've pushed a patch [here|] that:

# fixes these circular references; and
# introduces two kinds of leak detection:
## detects circular references directly, by periodically walking the object graph of the ref
state objects; and
## detects potential strong leak candidates obliquely, by constructing the total set of expected
ref objects and comparing it to those that are actually extant; if an unexpected object
remains extant across two such runs (at fifteen-minute intervals), it is reported as a leak.
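
As a rough illustration of the first kind of detection, the sketch below walks instance fields reflectively and reports whether an object is strongly reachable from itself. It is not the code in the patch: {{StrongCycleFinder}}, {{reaches}} and {{CycleNode}} are hypothetical names, and the walk deliberately skips fields declared in {{java.*}} classes, so references held only through JDK collections or {{java.lang.ref}} references are not followed.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for a ref state object that may point back at itself.
final class CycleNode
{
    Object next;
}

final class StrongCycleFinder
{
    /** Returns true if target is reachable from root by following instance fields (or array slots). */
    static boolean reaches(Object root, Object target)
    {
        // Identity-based visited set, so equals()/hashCode() overrides cannot confuse the walk.
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty())
        {
            Object cur = pending.pop();
            if (!visited.add(cur))
                continue;
            if (cur instanceof Object[])
            {
                for (Object element : (Object[]) cur)
                {
                    if (element == target)
                        return true;
                    if (element != null)
                        pending.push(element);
                }
                continue;
            }
            for (Class<?> c = cur.getClass(); c != null; c = c.getSuperclass())
            {
                // Simplification: do not look inside JDK classes (this also skips weak referents).
                if (c.getName().startsWith("java."))
                    continue;
                for (Field f : c.getDeclaredFields())
                {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive())
                        continue;
                    Object next;
                    try
                    {
                        f.setAccessible(true);
                        next = f.get(cur);
                    }
                    catch (RuntimeException | IllegalAccessException e)
                    {
                        continue; // field not readable; ignore it in this sketch
                    }
                    if (next == target)
                        return true;
                    if (next != null)
                        pending.push(next);
                }
            }
        }
        return false;
    }
}
```

Calling {{StrongCycleFinder.reaches(state, state)}} asks whether {{state}} can reach itself, which is the circular case described above.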

This last check is *not* perfect, as we could construct objects that we haven't yet made visible
in the tracker, for instance, but generally they should not remain invisible for fifteen minutes.
We can follow up with some improvements to further guarantee this, but it should be good enough
for now.
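
The two-run comparison can be sketched roughly as follows. This is only an illustration under assumptions: {{UnexpectedRefTracker}} and {{scan}} are hypothetical names, the caller is assumed to supply the expected and extant sets, and a real detector would schedule {{scan}} periodically (every fifteen minutes, per the description above) and would probably avoid holding suspects strongly.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class UnexpectedRefTracker<T>
{
    // Refs that were extant but not expected during the previous scan.
    private Set<T> suspects = newIdentitySet();

    /**
     * Performs one scan. Returns the refs that were unexpected in this scan
     * and also in the previous one; a ref seen as unexpected only once is
     * merely remembered as a suspect.
     */
    synchronized Set<T> scan(Set<T> expected, Set<T> extant)
    {
        Set<T> unexpected = newIdentitySet();
        Set<T> leaks = newIdentitySet();
        for (T ref : extant)
        {
            if (expected.contains(ref))
                continue;
            unexpected.add(ref);
            if (suspects.contains(ref))
                leaks.add(ref);
        }
        // Only the current run's unexpected refs carry over to the next run.
        suspects = unexpected;
        return leaks;
    }

    private static <T> Set<T> newIdentitySet()
    {
        return Collections.newSetFromMap(new IdentityHashMap<>());
    }
}
```

Requiring two consecutive sightings is what tolerates objects that are constructed but not yet visible in the tracker: they are flagged only if they stay invisible for a whole interval.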

Both are only run when {{-Dcassandra.debugrefcount=true}}, so this will not in any way affect
production systems.
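
For illustration, gating a debug-only check on that system property might look like the sketch below. How the patch itself reads the flag is not shown in this comment, so {{DebugRefCountFlag}}, {{enabled}} and {{runIfEnabled}} are hypothetical; only the property name comes from the text above.

```java
final class DebugRefCountFlag
{
    static final String PROPERTY = "cassandra.debugrefcount";

    /** True only when the JVM was started with -Dcassandra.debugrefcount=true (or the property was set later). */
    static boolean enabled()
    {
        return Boolean.getBoolean(PROPERTY);
    }

    /** Runs the given check only when the flag is set; returns whether it ran. */
    static boolean runIfEnabled(Runnable check)
    {
        if (!enabled())
            return false;
        check.run();
        return true;
    }
}
```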

I've tagged this as 2.1, since the circular reference leaks affect it, and the remaining changes
are a no-op for production systems.

It's worth recording that anonymous classes are *never* static, even if they require no handle
to their enclosing class, and this was the source of the majority of the circular references.
There were also others, which I had not expected to remain after fixing this, that were likewise
detected by this debugging.
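
A minimal sketch of the anonymous-class pitfall and the usual fix, with hypothetical names ({{ResourceOwner}}, {{Tidy}}, {{holdsOwner}}): an anonymous class created in an instance method is an inner class, and compilers have typically given it a hidden field pointing at the enclosing instance even when it is unused (whether that field is emitted can depend on the compiler version). A static nested class holds only what it is explicitly given.

```java
import java.lang.reflect.Field;

final class ResourceOwner
{
    private final byte[] payload = new byte[1024];

    // Anonymous class in an instance context: it is an inner class, so it may
    // carry an implicit reference to ResourceOwner.this even though run() never uses it.
    Runnable anonymousTidy()
    {
        return new Runnable()
        {
            public void run()
            {
                System.out.println("tidied");
            }
        };
    }

    // Static nested class: no enclosing instance, only the state passed in.
    static final class Tidy implements Runnable
    {
        private final String name;

        Tidy(String name)
        {
            this.name = name;
        }

        public void run()
        {
            System.out.println("tidied " + name);
        }
    }

    Runnable staticTidy()
    {
        return new Tidy("owner");
    }

    /** Reports whether the object's class declares a field of type ResourceOwner. */
    static boolean holdsOwner(Object o)
    {
        for (Field f : o.getClass().getDeclaredFields())
            if (f.getType() == ResourceOwner.class)
                return true;
        return false;
    }
}
```

If the tidy object is what the ref state holds, the anonymous form can keep the owner strongly reachable from its own cleanup state, which is the kind of cycle described above; the static form cannot unless the owner is passed in explicitly.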

> Improve Leak Detection to cover strong reference leaks
> ------------------------------------------------------
>                 Key: CASSANDRA-9423
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Priority: Critical
>             Fix For: 2.1.8
> Currently we detect resources that we don't clean up that become unreachable. We could
> also detect references that appear to have leaked without becoming unreachable, by periodically
> scanning the set of extant refs, and checking if they are reachable via their normal means
> (if any); if their lifetime is unexpectedly long this likely indicates a problem, and we can
> log a warning/error.
> Assigning to myself to not forget it, since this may well help especially with [~tjake]'s
> concerns highlighted on 8099 for 3.0.

This message was sent by Atlassian JIRA
