From: Christopher
To: Accumulo Dev List
Reply-To: dev@accumulo.apache.org
Date: Fri, 28 Mar 2014 13:02:15 -0400
Subject: Re: [DISCUSS] MiniAccumuloCluster goals and approach

I don't think any of this should be done for 1.6.0, but I like the idea
of creating a separate cluster interface for testing. I think it should
also be integrated into the accumulo-maven-plugin. The idea should be
hammered out and tested as a separate thing, to experiment with the
options, and provided as a complete feature for the next major release.
If it would change packaging dependencies, it shouldn't even be done for
1.6.x bugfix releases.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser wrote:
> Oh, I like that idea, Bill & Sean.
>
> Package: org.apache.accumulo.cluster
> Public API: org.apache.accumulo.cluster.AccumuloCluster
> MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements
> AccumuloCluster, allows for backwards compat)
> Yarn: org.apache.accumulo.cluster.yarn
> Docker: ...
> Mesos: ...
>
> etc etc etc.
>
> One question in my mind: do we keep the maven module 'accumulo-minicluster'?
> I would imagine that if we struck the 'mini' portion from 1.6 that would
> create some confusion. Would it be worth the indirection to rename
> accumulo-minicluster to accumulo-cluster and then create a new
> accumulo-minicluster module that depends on accumulo-cluster (but
> contains no code itself), so that the 1.4 and 1.5 poms generally work
> with just a version bump? I'm not sure if Maven would be happy with that
> or do what I think it "should".
>
>
> On 3/28/14, 6:26 AM, Bill Havanki wrote:
>>
>> I've been watching the conversation on the side, but I wanted to mention
>> that it seems the focus isn't so much on "mini" clusters anymore. You're
>> thinking of programmatic cluster management, whether one node or many. The
>> idea of a basic cluster management interface, with MAC as an
>> implementation, is promising. A package name of just "cluster" could work.
>>
>> Carry on :)
>>
>> Bill H
>>
>>
>> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey wrote:
>>
>>> If you decide to go the mapred/mapreduce way, you could go with the
>>> package name "mini".
>>>
>>> alternatively, we can do a multi-stage change out
>>>
>>> 1) 1.6.x: introduce TestAccumuloCluster interface, @deprecate
>>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>>>
>>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
>>> TestAccumuloCluster, @deprecate TestAccumuloCluster
>>>
>>> 3) 1.6 + 2 major: remove TestAccumuloCluster
>>>
>>> Or just go with TestAccumuloCluster as the interface, have
>>> MiniAccumuloCluster as the local pseudo distributed implementation, and
>>> then call your new one something like YarnAccumuloCluster.
>>>
>>> In that case we could use the deprecation cycle to move the MAC class out
>>> of the public api.
>>>
>>>
>>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser wrote:
>>>
>>>> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
>>>> future cruft?
>>>>
>>>> Suggestions on the new package and/or class name would be greatly
>>>> appreciated over "NewMiniAccumuloC*".
>>>>
>>>>
>>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
>>>>
>>>>> Those who are interested: check out
>>>>> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>>>>
>>>>>
>>>>> tl;dr I could create some real interfaces for the cluster and config,
>>>>> which are "hidden" under the covers by the 1.4 and 1.5
>>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>>>>> default implementation, gives us the ability to hide "implementation
>>>>> details" if wanted, and moves us towards some factory methods instead of
>>>>> calling a class directly.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>>>>
>>>>>> Yes, very much experimental at this point.
>>>>>>
>>>>>> What I'm most concerned about is having reasonable hooks up front, not
>>>>>> trying to make an implementation for inclusion in 1.6.0.
>>>>>>
>>>>>> Regarding additions, the implementation already contains most things I
>>>>>> would want to expose. I haven't come up with anything that would be
>>>>>> generally returned through the "API" rather than through this proposed
>>>>>> implementation (e.g. YARN connection information).
>>>>>>
>>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>>>>
>>>>>>> What you are trying to do sounds interesting. It also sounds
>>>>>>> experimental and in the early stages. Is there anything specific you
>>>>>>> think should be done for 1.6.0 w/ regards to MAC API?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser wrote:
>>>>>>>
>>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser wrote:
>>>>>>>>>
>>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>>>>
>>>>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>>>>> understand your viewpoint either
>>>>>>>>>>>
>>>>>>>>>> Sure.
>>>>>>>>>> One limitation of MAC, in general as a testing harness, is that it
>>>>>>>>>> doesn't adequately exercise multi-node implementations. You can run
>>>>>>>>>> multiple tservers, but they are all on the same host, which limits the
>>>>>>>>>> validity of a "robust" test. This is my immediate goal.
>>>>>>>>>>
>>>>>>>>>> Multi-node deployments are possible using something like Mesos or
>>>>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>>>>> Accumulo on Yarn, this was my goal.
>>>>>>>>>>
>>>>>>>>>> My goal is to be able to run all of our AbstractMacIT
>>>>>>>>>> implementations against "real" hardware without changing a single
>>>>>>>>>> line of test code (ok - maybe a line or two to do injection of the MAC
>>>>>>>>>> implementation). The point is, I believe there could be a huge testing
>>>>>>>>>> gain from being able to write tests which leverage yarn, have the same
>>>>>>>>>> programmatic configuration API from MAC, and provide near "real"
>>>>>>>>>> Accumulo semantics.
>>>>>>>>>>
>>>>>>>>> Ok so you want MAC to be an interface so that you can provide a
>>>>>>>>> completely different implementation?
>>>>>>>>>
>>>>>>>> Correct. Some things would serve well in a common abstract base (e.g.
>>>>>>>> numTservers, siteXml configuration), but all the nonsense about
>>>>>>>> creating directory structures and managing Processes is implementation
>>>>>>>> specific.
>>>>>>>>
>>>>>>>> Perhaps I could create a new interface that the current implementation
>>>>>>>> implements which still provides the same semantics from 1.4 and 1.5.
>>>>>>>> Let me see if I can mock up what I'm thinking -- that will probably be
>>>>>>>> easier than me trying to write it out.
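
A minimal sketch of the split discussed in this thread: one interface that
tests program against, a common abstract base for shared settings such as
numTservers, and implementation-specific start/stop logic behind a factory
method. Everything here is an assumption made for illustration. The type
names are borrowed from the proposals above, but the methods (start, stop,
isRunning, getNumTservers), the AbstractAccumuloCluster base, the
AccumuloClusters factory and the "mini"/"yarn" keys are hypothetical and do
not reproduce any real Accumulo API.

```java
// Sketch only: these types are illustrative stand-ins named after the
// thread's proposal, not the real Accumulo classes or their APIs.

/** Hypothetical interface that test code would program against. */
interface AccumuloCluster {
    void start();
    void stop();
    boolean isRunning();
    int getNumTservers();
}

/** Common base: shared settings live here, process management does not. */
abstract class AbstractAccumuloCluster implements AccumuloCluster {
    private final int numTservers;
    private boolean running;

    protected AbstractAccumuloCluster(int numTservers) {
        if (numTservers < 1) {
            throw new IllegalArgumentException("numTservers must be >= 1");
        }
        this.numTservers = numTservers;
    }

    @Override public final void start() {
        if (!running) { doStart(); running = true; }
    }

    @Override public final void stop() {
        if (running) { doStop(); running = false; }
    }

    @Override public final boolean isRunning() { return running; }
    @Override public final int getNumTservers() { return numTservers; }

    /** Implementation-specific startup (directories, processes, containers). */
    protected abstract void doStart();

    /** Implementation-specific shutdown. */
    protected abstract void doStop();
}

/** Stand-in for the local, single-host implementation. */
final class MiniAccumuloCluster extends AbstractAccumuloCluster {
    MiniAccumuloCluster(int numTservers) { super(numTservers); }
    @Override protected void doStart() {
        System.out.println("mini: would create directories and fork local processes");
    }
    @Override protected void doStop() {
        System.out.println("mini: would stop local processes");
    }
}

/** Stand-in for a multi-node implementation; no YARN calls are made. */
final class YarnAccumuloCluster extends AbstractAccumuloCluster {
    YarnAccumuloCluster(int numTservers) { super(numTservers); }
    @Override protected void doStart() {
        System.out.println("yarn: would request containers for tservers");
    }
    @Override protected void doStop() {
        System.out.println("yarn: would release containers");
    }
}

/** Hypothetical factory so that tests never name a concrete class. */
final class AccumuloClusters {
    private AccumuloClusters() {}

    static AccumuloCluster create(String kind, int numTservers) {
        switch (kind) {
            case "mini": return new MiniAccumuloCluster(numTservers);
            case "yarn": return new YarnAccumuloCluster(numTservers);
            default: throw new IllegalArgumentException("unknown cluster kind: " + kind);
        }
    }
}

class ClusterSketch {
    public static void main(String[] args) {
        // The only line a test would change is the choice of implementation.
        String kind = args.length > 0 ? args[0] : "mini";
        AccumuloCluster cluster = AccumuloClusters.create(kind, 2);
        cluster.start();
        System.out.println("running with " + cluster.getNumTservers() + " tservers");
        cluster.stop();
    }
}
```

With a layout like this, switching a test from the local implementation to
a multi-node one would come down to the single argument passed to the
factory, which is the "line or two to do injection" mentioned above.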