Return-Path: X-Original-To: apmail-accumulo-dev-archive@www.apache.org Delivered-To: apmail-accumulo-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id F03E010B29 for ; Thu, 2 Jan 2014 23:20:31 +0000 (UTC) Received: (qmail 13225 invoked by uid 500); 2 Jan 2014 23:20:31 -0000 Delivered-To: apmail-accumulo-dev-archive@accumulo.apache.org Received: (qmail 13184 invoked by uid 500); 2 Jan 2014 23:20:31 -0000 Mailing-List: contact dev-help@accumulo.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@accumulo.apache.org Delivered-To: mailing list dev@accumulo.apache.org Received: (qmail 13176 invoked by uid 99); 2 Jan 2014 23:20:31 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 02 Jan 2014 23:20:31 +0000 X-ASF-Spam-Status: No, hits=1.5 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of busbey@cloudera.com designates 209.85.128.52 as permitted sender) Received: from [209.85.128.52] (HELO mail-qe0-f52.google.com) (209.85.128.52) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 02 Jan 2014 23:20:25 +0000 Received: by mail-qe0-f52.google.com with SMTP id ne12so15017948qeb.11 for ; Thu, 02 Jan 2014 15:20:04 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:content-type; bh=BkH0+Tt/bWquZmo5hfyE5VNLqQReCZ1TB55hogaV51c=; b=Kh9qFTFhlr1uGQjJp3aBtrbZO/ZLi2pUzA28fyVgT/NFuF8RsJlsPSAPJj4N0Nk0Sw s7gdX8mSlCHv+ll6jTewCXVVaA8S47CLI2/a5SYvCCJwNzE05CS7yCYw3YLlfQXNkvaS /tATU5AgIFAulcP7DLdlTjfOiQrKYtcWQLjdhNToWGFNqzOAVGmE7GDMU5xOKun6j+eg ulRMgDx4A45ggvd+EXw22/4GP4MHUtF0rYFEzSY5lKA7tTw0qIJ261b7SyvFyq+V5Mk4 boLHzncjOHJgp5uybXGLCoO3kcv0kg+zNt8ZexKpV4vFcvL8BiXDXGpnLebBzq9DYVSa fUgg== X-Gm-Message-State: ALoCoQmWGnHx+Lh7HTfByr8DdzhbSa2AqYRQTxraVj3oWCIF7lEMfs1Dx7A9Z6V1lYC27kAMwMco X-Received: by 10.49.127.205 with SMTP id ni13mr148828015qeb.40.1388704804862; Thu, 02 Jan 2014 15:20:04 -0800 (PST) MIME-Version: 1.0 Received: by 10.229.232.67 with HTTP; Thu, 2 Jan 2014 15:19:44 -0800 (PST) In-Reply-To: References: <52C5C21E.4000504@gmail.com> From: Sean Busbey Date: Thu, 2 Jan 2014 17:19:44 -0600 Message-ID: Subject: Re: [DISCUSS] API changes to provide resource cleanup To: "dev@accumulo apache. org" , Keith Turner Content-Type: multipart/alternative; boundary=047d7b6d80aa978c3004ef0508fd X-Virus-Checked: Checked by ClamAV on apache.org --047d7b6d80aa978c3004ef0508fd Content-Type: text/plain; charset=UTF-8 FYI, I filed ACCUMULO-2128 to track this https://issues.apache.org/jira/browse/ACCUMULO-2128 On Thu, Jan 2, 2014 at 5:04 PM, Sean Busbey wrote: > Excellent! it sounds like we have consensus. > > Keith, I think you have volunteered for both the revert work and > reimplementing the PoC Hammer. > > Do you want help in either of those cases? Help with a functional test, > docs? > > > On Thu, Jan 2, 2014 at 5:00 PM, William Slacum < > wilhelm.von.cloud@accumulo.net> wrote: > >> Voting for the hammer/hacksawjimdugging. I like the concept of being to >> track resources and clean them up, but the back end code isn't designed to >> deal with an instance in the way we're trying to model it. >> >> >> On Thu, Jan 2, 2014 at 2:46 PM, Josh Elser wrote: >> >> > Bill Slacum and I had talked about unexpected breakages in API for >> clients >> > and internal by modifying ZooKeeperInstance (I think I might have >> mentioned >> > it already on one of the tickets). >> > >> > Considering some of the other work that Mike has started on in regards >> to >> > making an easier-to-use client API, Bill and I mused over an >> > InstanceFactory notion which could wrap different Instance >> implementations >> > for the various deployment requirements. We could leave the current ZKI >> > (close to?) how it works now, lift the non thread-safe pieces to a >> common >> > parent, and create some sort of ThreadsafeZKI. >> > >> > Obviously this is very hand-wavy, but I'm definitely leery to changing >> the >> > default impl for something so prevalent as ZKI. Thinking about the >> problem >> > with a clean slate seems best to me. >> > >> > >> > On 1/2/14, 1:36 PM, Eric Newton wrote: >> > >> >> All of our current code treats the Instance like a simple record: >> >> >> >> * immutable, and therefore >> >> * thread-safe >> >> * provides several fields that describe an instance >> >> >> >> When I tried to add calls to close() in our own code, I found that our >> >> disregard for the lifetime of an instance was implicit, and probably >> is in >> >> all our user's code, too. >> >> >> >> I think if we want to do something like #1, we'll have to do so >> through a >> >> new API, and not by changing Instance, and then deprecate Instance. >> The >> >> mental model is just completely different. >> >> >> >> -Eric >> >> >> >> >> >> On Thu, Jan 2, 2014 at 12:47 PM, Sean Busbey < >> busbey+lists@cloudera.com> >> >> wrote: >> >> >> >> Hey Folks! >> >>> >> >>> We need to come to some conclusions on what we're going to do for >> >>> resource >> >>> clean up. I'll attempt to summarize the situation and various options. >> >>> If I >> >>> missed something from our myriad of tickets and mailing list threads, >> >>> please bring it up. >> >>> >> >>> Brief Background: >> >>> >> >>> The existing client APIs presume that a large amount of global state >> will >> >>> persist for the duration of a JVM instance. This is at odds with >> >>> lifecycle >> >>> management in application containers, where a JVM is very long lived >> and >> >>> user provided applications are stood up and torn down. We have >> reports of >> >>> this causing OOM on JBoss[1] and leaked threads on Tomcat[2]. >> >>> >> >>> We have two possible solutions, both of which Jared Winick has kindly >> >>> verified solve the problem, as seen on JBoss. >> >>> >> >>> ---- >> >>> = Proposed solution #1: Closeable Instance >> >>> >> >>> The first approach adds a .close method to Instance so that users can >> say >> >>> when they are done with a given instance. Internally, reference >> counting >> >>> determines when we tear down global resources. >> >>> >> >>> Advantages: >> >>> * States via code where a client should do lifecycle management. >> >>> * Allows shutting down just some of the resources used. >> >>> * Is already in the code base. >> >>> >> >>> Disadvantages: >> >>> * Since lifecycle is getting added post-hoc, we are more likely to >> >>> have >> >>> maintenance issues as we find other side effects we hadn't considered, >> >>> like >> >>> the multithreaded issue that already came up[3]. >> >>> * Changes Instance from representing static configuration to shared >> >>> state >> >>> * Doesn't work with the fluent style some of our APIs encourage. >> >>> * closed semantics probably aren't consistently enforced (e.g. >> users >> >>> trying to use a BatchWriter that came from a now-closed instance >> should >> >>> fail) >> >>> >> >>> To finish, we'd need to >> >>> * Verify multithreaded handling is done without too much of a >> >>> performance >> >>> impact[3] >> >>> * Finish making our internal use consistent with the lifecycle >> we're >> >>> telling others to use[4] >> >>> * Possibly add tests to verify consistent enforcement of closing on >> >>> objects derived from Instance >> >>> >> >>> = Proposed solution #2: Global cleanup utility, aka The Hammer >> >>> >> >>> As a band-aid to allow for "unload resources" without making changes >> to >> >>> the >> >>> API we instead provide a utility method that cleans up all global >> >>> resources. >> >>> >> >>> Advantages: >> >>> * Doesn't change API or meaning for Instance >> >>> * Can be used on older Accumulo deployments w/o patch/rebuild cycle >> >>> >> >>> Disadvantages: >> >>> * Only allows all-or-nothing cleanup >> >>> * Doesn't address our underlying lack of lifecycle >> >>> * Requires reverts >> >>> >> >>> To finish, we'd need to >> >>> * revert commits from old solution (I haven't checked how many >> >>> commits, >> >>> but it's 6 tickets :/ ) >> >>> * port code from PoC to main codebase (asf grants, etc) [6] >> >>> * add some kind of test (functional/IT?) >> >>> >> >>> ----- >> >>> >> >>> We need to decide what we're going to provide as a placeholder for >> >>> releases >> >>> already frozen on API (i.e. 1.4, 1.5, 1.6*) as well as longer term. >> >>> >> >>> Personally, my position is that we should use the simplest change to >> >>> handle >> >>> the published versions (solution #2). >> >>> >> >>> Obviously there are outstanding issues with how we deal with global >> state >> >>> and shared resources in the current client APIs. I'd like to see that >> >>> addressed as a part of a more coherent client lifecycle rather than >> >>> struggling to make it work while maintaining the current API. Long >> term, >> >>> I >> >>> think this means handling things in the updated client API Christopher >> >>> has >> >>> mentioned a few times. >> >>> >> >>> How close to consensus are we already? >> >>> >> >>> - Sean >> >>> >> >>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-1379 >> >>> [2]: https://issues.apache.org/jira/browse/ACCUMULO-1697 >> >>> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2027 >> >>> [4]: https://issues.apache.org/jira/browse/ACCUMULO-1923 >> >>> [6]: https://issues.apache.org/jira/browse/ACCUMULO-2113 >> >>> >> >>> *: I think 1.6 should be in this list because we are at feature >> freeze, >> >>> and >> >>> any proper changes to lifecycle management are likely to constitute an >> >>> addition that wouldn't pass that restriction. Mike Drob suggested in >> chat >> >>> that given the current state of 1.6 a more intrusive fix might be >> >>> acceptable. >> >>> >> >>> >> >> >> > > --047d7b6d80aa978c3004ef0508fd--