accumulo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Christopher <>
Subject Re: [DISCUSS] API changes to provide resource cleanup
Date Thu, 02 Jan 2014 18:51:13 GMT
I agree with Keith. Thanks for summarizing, Sean.

I also favor option #2 for all existing versions, up through 1.6.0.

For 1.7.0, I strongly favor a new client API that addresses lifecycle
management of connection resources directly in the API. Specifically,
I propose moving static connection state to a ConnectionResources
object that is Closeable and can be provided to an instance. For
backwards compatibility, the implementation of the current API can be
made to use a singleton instance of this object, rather than static
state. It follows then, that "The Hammer" solution (#2) would simply
be modified in its implementation to close this singleton instance,
for backwards compatibility with earlier iterations of "The Hammer".

Christopher L Tubbs II

On Thu, Jan 2, 2014 at 1:27 PM, Keith Turner <> wrote:
> I really like the summary of the discussion, very thorough and concise.  I
> am in favor of #2 for 1.4.5, 1.5.1, and 1.6.0.  Also I would be willing to
> do the revert work for close().
> If we go with option #2, what should we do for 1.7.0-SNAPSHOT?  If someone
> really wants to pursue adding close, we could leave things as is in
> 1.7.0-SNAPSHOT.  If no one is going to pursue it, then we should revert it
> in 1.7.0-SNAP rather than leave something thats partially done.
> On Thu, Jan 2, 2014 at 12:47 PM, Sean Busbey <>wrote:
>> Hey Folks!
>> We need to come to some conclusions on what we're going to do for resource
>> clean up. I'll attempt to summarize the situation and various options. If I
>> missed something from our myriad of tickets and mailing list threads,
>> please bring it up.
>> Brief Background:
>> The existing client APIs presume that a large amount of global state will
>> persist for the duration of a JVM instance. This is at odds with lifecycle
>> management in application containers, where a JVM is very long lived and
>> user provided applications are stood up and torn down. We have reports of
>> this causing OOM on JBoss[1] and leaked threads on Tomcat[2].
>> We have two possible solutions, both of which Jared Winick has kindly
>> verified solve the problem, as seen on JBoss.
>> ----
>> = Proposed solution #1: Closeable Instance
>> The first approach adds a .close method to Instance so that users can say
>> when they are done with a given instance. Internally, reference counting
>> determines when we tear down global resources.
>> Advantages:
>>   * States via code where a client should do lifecycle management.
>>   * Allows shutting down just some of the resources used.
>>   * Is already in the code base.
>> Disadvantages:
>>   * Since lifecycle is getting added post-hoc, we are more likely to have
>> maintenance issues as we find other side effects we hadn't considered, like
>> the multithreaded issue that already came up[3].
>>   * Changes Instance from representing static configuration to shared state
>>   * Doesn't work with the fluent style some of our APIs encourage.
>>   * closed semantics probably aren't consistently enforced (e.g. users
>> trying to use a BatchWriter that came from a now-closed instance should
>> fail)
>> To finish, we'd need to
>>   * Verify multithreaded handling is done without too much of a performance
>> impact[3]
>>   * Finish making our internal use consistent with the lifecycle we're
>> telling others to use[4]
>>   * Possibly add tests to verify consistent enforcement of closing on
>> objects derived from Instance
>> = Proposed solution #2: Global cleanup utility, aka The Hammer
>> As a band-aid to allow for "unload resources" without making changes to the
>> API we instead provide a utility method that cleans up all global
>> resources.
>> Advantages:
>>   * Doesn't change API or meaning for Instance
>>   * Can be used on older Accumulo deployments w/o patch/rebuild cycle
>> Disadvantages:
>>   * Only allows all-or-nothing cleanup
>>   * Doesn't address our underlying lack of lifecycle
>>   * Requires reverts
>> To finish, we'd need to
>>   * revert commits from old solution (I haven't checked how many commits,
>> but it's 6 tickets :/ )
>>   * port code from PoC to main codebase (asf grants, etc) [6]
>>   * add some kind of test (functional/IT?)
>> -----
>> We need to decide what we're going to provide as a placeholder for releases
>> already frozen on API (i.e. 1.4, 1.5, 1.6*) as well as longer term.
>> Personally, my position is that we should use the simplest change to handle
>> the published versions (solution #2).
>> Obviously there are outstanding issues with how we deal with global state
>> and shared resources in the current client APIs. I'd like to see that
>> addressed as a part of a more coherent client lifecycle rather than
>> struggling to make it work while maintaining the current API. Long term, I
>> think this means handling things in the updated client API Christopher has
>> mentioned a few times.
>> How close to consensus are we already?
>> - Sean
>> [1]:
>> [2]:
>> [3]:
>> [4]:
>> [6]:
>> *: I think 1.6 should be in this list because we are at feature freeze, and
>> any proper changes to lifecycle management are likely to constitute an
>> addition that wouldn't pass that restriction. Mike Drob suggested in chat
>> that given the current state of 1.6 a more intrusive fix might be
>> acceptable.

View raw message