cassandra-commits mailing list archives

From "Jon Meredith (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-15170) Reduce the time needed to release in-JVM dtest cluster resources after close
Date Sun, 04 Aug 2019 22:42:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-15170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899690#comment-16899690 ]

Jon Meredith commented on CASSANDRA-15170:
------------------------------------------

I've updated the branches and this should be ready to review.  Once you're happy with it we
can update the commit message and squash the fixup in; I just didn't have the heart to redo
all the merging up again.

2.2 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-2.2] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-2%2E2]
3.0 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-3.0] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-3%2E0]
3.11 | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-3.11] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-3%2E11]
trunk | [Branch|https://github.com/jonmeredith/cassandra/commits/in-jvm-dtest-fixes-v4-trunk] | [CircleCI|https://circleci.com/gh/jonmeredith/cassandra/tree/in-jvm-dtest-fixes-v4-trunk]

Unit tests / in-jvm-dtest are passing on 2.2-3.11 successfully.  There's a failure on trunk
for {{org.apache.cassandra.net.ConnectionTest.testCloseIfEndpointDown}} which I suspect is
due to the growth in the powerset of connection options and unrelated to the in-jvm changes.

To document the discussion we had off-ticket:

{quote}
Making {{ResourceLeakTest.doTest}} to be configurable, could also later automatically loop
through all in-jvm dtests and run them a dozen times or so to see if leaks are occurring.
Perhaps on each loop, we could dump the threads, heap utilisation and files, and check they
are not growing? That way the test can become one that actually fails if leaks are detected,
and not produce heap dumps etc. unless it is so detected (and perhaps preferably only produce
heap dumps if no thread leaks are detected)
{quote}

I agree, that would be nice.  I'd rather tackle that as a separate piece of work under a new
ticket (it may make sense to do at the same time as CASSANDRA-15171). It's painful trying to
keep all the variations of this in sync at the moment.
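
As a rough illustration of the looping idea quoted above, here is a minimal sketch that runs a test body several times and reports whether the live thread count keeps growing. The class name {{LeakLoopSketch}}, the method names and the slack parameter are all hypothetical and not part of the branches; a real check would presumably also track open files and only produce heap dumps on failure.

```java
import java.lang.management.ManagementFactory;

// Hypothetical sketch: not the ResourceLeakTest from the ticket.
class LeakLoopSketch {
    /** Number of live Java threads (daemon and non-daemon) in this JVM. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Approximate heap in use; noisy, so it is only logged, never asserted on. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Runs body the given number of times and returns true if the thread count
     * after the last loop exceeds the count after the first loop by more than
     * threadSlack. The baseline is taken after the first loop so that one-off
     * lazily started threads are not reported as leaks.
     */
    static boolean threadsGrew(Runnable body, int loops, int threadSlack) {
        if (loops < 2) {
            throw new IllegalArgumentException("need at least two loops to compare");
        }
        int baseline = -1;
        int last = -1;
        for (int i = 0; i < loops; i++) {
            body.run();
            last = liveThreads();
            if (i == 0) {
                baseline = last;
            }
            System.out.printf("loop %d: threads=%d usedHeap=%d bytes%n", i, last, usedHeapBytes());
        }
        return last - baseline > threadSlack;
    }
}
```

A caller would pass the body of one in-jvm dtest as the {{Runnable}} and fail the test when {{threadsGrew}} returns true.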

{quote}
IsolatedExecutor not using NamedThreadFactory
{quote}

I added a comment to explain, but using NamedThreadFactory was obscuring some exceptions while
debugging, as it always called FastThreadLocal.removeAll(), sometimes before it was initialized,
and crashed (although perhaps now that unloading the classloader has moved it would no longer
be an issue; I can't remember how to reproduce it).

{quote}
I'm anyway unclear why we are using `CompletableFuture` here, when we return a normal `Future`
{quote}

Good point, fixed up with your suggestion.
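
For readers following the two IsolatedExecutor points above, here is a minimal sketch of the general shape being discussed: a {{ThreadPoolExecutor}} with a core pool size of 0, built on a plain {{ThreadFactory}}, whose {{submit}} already returns an ordinary {{Future}}. The class name, maximum pool size, keep-alive time and thread name are assumptions chosen for illustration and are not taken from the branches.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the IsolatedExecutor from the ticket.
class ZeroCorePoolSketch {
    /**
     * Builds an executor that keeps no core threads, so an idle worker exits
     * after the keep-alive period instead of lingering. Pool sizes and the
     * keep-alive value are illustrative assumptions.
     */
    static ThreadPoolExecutor newExecutor(String threadName) {
        // A plain factory with no thread-local cleanup hooks.
        ThreadFactory plainFactory = runnable -> {
            Thread t = new Thread(runnable, threadName);
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(0, 1, 1L, TimeUnit.SECONDS,
                                      new LinkedBlockingQueue<>(), plainFactory);
    }

    /** Submits one task, waits for its result through a plain Future, then shuts down. */
    static <T> T callAndClose(Callable<T> task) {
        ThreadPoolExecutor executor = newExecutor("sketch-worker");
        try {
            Future<T> result = executor.submit(task); // no CompletableFuture needed
            return result.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task failed or timed out", e);
        } finally {
            executor.shutdown();
        }
    }
}
```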


> Reduce the time needed to release in-JVM dtest cluster resources after close
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15170
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15170
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Test/dtest
>            Reporter: Jon Meredith
>            Assignee: Jon Meredith
>            Priority: Normal
>
> There are a few issues that slow the in-JVM dtests from reclaiming metaspace once the
> cluster is closed.
> IsolatedExecutor issues the shutdown on a SingleExecutorThreadPool; sometimes this thread
> was still running 10s after the dtest cluster was closed.  Instead, switch to a ThreadPoolExecutor
> with a core pool size of 0 so that the thread executing the class loader close executes sooner.
> If an OutboundTcpConnection is waiting to connect() and the endpoint is not answering,
> it has to wait for a timeout before it exits. Instead it should check the isShutdown flag
> and terminate early if shutdown has been requested.
> In 3.0 and above, HintsCatalog.load uses java.nio.Files.list outside of a try-with-resources
> construct and leaks a file handle for the directory.  This doesn't matter for normal usage,
> but it leaks a file handle for each dtest Instance created.
> On trunk, Netty global event executor threads are still running and delay GC for the
> instance class loader.
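
To illustrate the {{java.nio.Files.list}} item in the description above: the returned stream holds an open directory handle until it is closed, so wrapping it in try-with-resources releases the handle promptly. The sketch below is a generic example of that pattern, not the actual HintsCatalog.load change; the class name, method names and the {{.hints}} file names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: not the HintsCatalog code from the ticket.
class DirectoryListingSketch {
    /** Lists a directory's entries, closing the underlying directory handle before returning. */
    static List<Path> listEntries(Path directory) {
        try (Stream<Path> entries = Files.list(directory)) {
            // Collect eagerly: the stream must not be used after it is closed.
            return entries.sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Small usage demo: creates two files in a temp directory, lists them, then cleans up. */
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("listing-sketch");
            Files.createFile(dir.resolve("a.hints"));
            Files.createFile(dir.resolve("b.hints"));
            List<String> names = new ArrayList<>();
            for (Path p : listEntries(dir)) {
                names.add(p.getFileName().toString());
                Files.delete(p);
            }
            Files.delete(dir);
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```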




