accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Eric Newton <>
Subject Re: [DISCUSS] API changes to provide resource cleanup
Date Thu, 02 Jan 2014 18:36:24 GMT
All of our current code treats the Instance like a simple record:

* immutable, and therefore
* thread-safe
* provides several fields that describe an instance

When I tried to add calls to close() in our own code, I found that our
disregard for the lifetime of an instance was implicit, and probably is in
all our user's code, too.

I think if we want to do something like #1, we'll have to do so through a
new API, and not by changing Instance, and then deprecate Instance.  The
mental model is just completely different.


On Thu, Jan 2, 2014 at 12:47 PM, Sean Busbey <>wrote:

> Hey Folks!
> We need to come to some conclusions on what we're going to do for resource
> clean up. I'll attempt to summarize the situation and various options. If I
> missed something from our myriad of tickets and mailing list threads,
> please bring it up.
> Brief Background:
> The existing client APIs presume that a large amount of global state will
> persist for the duration of a JVM instance. This is at odds with lifecycle
> management in application containers, where a JVM is very long lived and
> user provided applications are stood up and torn down. We have reports of
> this causing OOM on JBoss[1] and leaked threads on Tomcat[2].
> We have two possible solutions, both of which Jared Winick has kindly
> verified solve the problem, as seen on JBoss.
> ----
> = Proposed solution #1: Closeable Instance
> The first approach adds a .close method to Instance so that users can say
> when they are done with a given instance. Internally, reference counting
> determines when we tear down global resources.
> Advantages:
>   * States via code where a client should do lifecycle management.
>   * Allows shutting down just some of the resources used.
>   * Is already in the code base.
> Disadvantages:
>   * Since lifecycle is getting added post-hoc, we are more likely to have
> maintenance issues as we find other side effects we hadn't considered, like
> the multithreaded issue that already came up[3].
>   * Changes Instance from representing static configuration to shared state
>   * Doesn't work with the fluent style some of our APIs encourage.
>   * closed semantics probably aren't consistently enforced (e.g. users
> trying to use a BatchWriter that came from a now-closed instance should
> fail)
> To finish, we'd need to
>   * Verify multithreaded handling is done without too much of a performance
> impact[3]
>   * Finish making our internal use consistent with the lifecycle we're
> telling others to use[4]
>   * Possibly add tests to verify consistent enforcement of closing on
> objects derived from Instance
> = Proposed solution #2: Global cleanup utility, aka The Hammer
> As a band-aid to allow for "unload resources" without making changes to the
> API we instead provide a utility method that cleans up all global
> resources.
> Advantages:
>   * Doesn't change API or meaning for Instance
>   * Can be used on older Accumulo deployments w/o patch/rebuild cycle
> Disadvantages:
>   * Only allows all-or-nothing cleanup
>   * Doesn't address our underlying lack of lifecycle
>   * Requires reverts
> To finish, we'd need to
>   * revert commits from old solution (I haven't checked how many commits,
> but it's 6 tickets :/ )
>   * port code from PoC to main codebase (asf grants, etc) [6]
>   * add some kind of test (functional/IT?)
> -----
> We need to decide what we're going to provide as a placeholder for releases
> already frozen on API (i.e. 1.4, 1.5, 1.6*) as well as longer term.
> Personally, my position is that we should use the simplest change to handle
> the published versions (solution #2).
> Obviously there are outstanding issues with how we deal with global state
> and shared resources in the current client APIs. I'd like to see that
> addressed as a part of a more coherent client lifecycle rather than
> struggling to make it work while maintaining the current API. Long term, I
> think this means handling things in the updated client API Christopher has
> mentioned a few times.
> How close to consensus are we already?
> - Sean
> [1]:
> [2]:
> [3]:
> [4]:
> [6]:
> *: I think 1.6 should be in this list because we are at feature freeze, and
> any proper changes to lifecycle management are likely to constitute an
> addition that wouldn't pass that restriction. Mike Drob suggested in chat
> that given the current state of 1.6 a more intrusive fix might be
> acceptable.

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message