accumulo-dev mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: [DISCUSS] MiniAccumuloCluster goals and approach
Date Fri, 28 Mar 2014 17:02:15 GMT
I don't think any of this should be done for 1.6.0, but I like the
idea of creating a separate cluster interface for testing. I think it
should be integrated into the accumulo-maven-plugin, also. I think the
idea should be hammered out, and tested as a separate thing, to
experiment with the options, and provided as a complete feature for
the next major release. If it would change packaging dependencies, it
shouldn't even be done for 1.6.x bugfix releases.
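
A minimal sketch of the "separate cluster interface" idea discussed in this thread, assuming hypothetical names throughout (SketchCluster, AbstractSketchCluster, LocalSketchCluster, RemoteSketchCluster, and ClusterSketchDemo are illustrative and are not part of any Accumulo release). It only shows the shape: shared settings such as numTservers sit in a common abstract base, and test-style code depends on the interface alone, so a single-host or a multi-node implementation could be swapped in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cluster contract; method names are illustrative only.
interface SketchCluster {
    void start();
    void stop();
    boolean isRunning();
    int getNumTservers();
    String describe();
}

// Settings shared by every implementation (the thread mentions numTservers).
abstract class AbstractSketchCluster implements SketchCluster {
    private final int numTservers;
    private boolean running;

    AbstractSketchCluster(int numTservers) {
        if (numTservers < 1) {
            throw new IllegalArgumentException("numTservers must be >= 1");
        }
        this.numTservers = numTservers;
    }

    @Override public void start() { running = true; }
    @Override public void stop() { running = false; }
    @Override public boolean isRunning() { return running; }
    @Override public int getNumTservers() { return numTservers; }
}

// Stand-in for a single-host implementation in the spirit of MAC.
final class LocalSketchCluster extends AbstractSketchCluster {
    LocalSketchCluster(int numTservers) { super(numTservers); }
    @Override public String describe() { return "local:" + getNumTservers(); }
}

// Stand-in for a multi-node implementation (for example, one backed by YARN).
final class RemoteSketchCluster extends AbstractSketchCluster {
    RemoteSketchCluster(int numTservers) { super(numTservers); }
    @Override public String describe() { return "remote:" + getNumTservers(); }
}

final class ClusterSketchDemo {
    // Test-style code that depends only on the interface, never on a concrete class.
    static String exercise(SketchCluster cluster) {
        cluster.start();
        try {
            return cluster.describe() + " running=" + cluster.isRunning();
        } finally {
            cluster.stop(); // always release the cluster, even if the check throws
        }
    }

    public static void main(String[] args) {
        List<SketchCluster> clusters = new ArrayList<>();
        clusters.add(new LocalSketchCluster(2));
        clusters.add(new RemoteSketchCluster(4));
        for (SketchCluster cluster : clusters) {
            System.out.println(exercise(cluster));
        }
    }
}
```

The real lifecycle work (creating directories, managing processes, talking to a resource manager) would live in the concrete classes; the sketch leaves it out on purpose.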

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
> Oh, I like that idea, Bill & Sean.
>
> Package: org.apache.accumulo.cluster
> Public API: org.apache.accumulo.cluster.AccumuloCluster
> MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements
> AccumuloCluster, allows for backwards compat)
> Yarn: org.apache.accumulo.cluster.yarn
> Docker: ...
> Mesos: ...
>
> etc etc etc.
>
> One question in my mind, do we keep the maven module 'accumulo-minicluster'?
> I would imagine that if we struck the 'mini' portion from 1.6 that would
> create some confusion. Would it be worth the indirection to rename
> accumulo-minicluster to accumulo-cluster and then create a new
> accumulo-minicluster module that depends on accumulo-cluster (but
> contains no code itself) to preserve the 1.4 and 1.5 poms to generally work
> with a version bump? I'm not sure if Maven would be happy with that or do
> what I think it "should".
>
>
> On 3/28/14, 6:26 AM, Bill Havanki wrote:
>>
>> I've been watching the conversation on the side, but I wanted to mention
>> that it seems the focus isn't so much on "mini" clusters anymore. You're
>> thinking of programmatic cluster management, whether one node or many. The
>> idea of a basic cluster management interface, with MAC as an
>> implementation, is promising. A package name of just "cluster" could work.
>>
>> Carry on :)
>>
>> Bill H
>>
>>
>> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey
>> <busbey+lists@cloudera.com> wrote:
>>
>>> If you decide to go the mapred/mapreduce way, you could go with the
>>> package name "mini".
>>>
>>> Alternatively, we can do a multi-stage change-out:
>>>
>>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
>>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>>>
>>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
>>> TestAccumuloCluster, @deprecate TestAccumuloCluster
>>>
>>> 3) 1.6 + 2 major: remove TestAccumuloCluster
>>>
>>> Or just go with TestAccumuloCluster as the interface, have
>>> MiniAccumuloCluster as the local pseudo distributed implementation, and
>>> then call your new one something like YarnAccumuloCluster.
>>>
>>> In that case we could use the deprecation cycle to move the MAC class out
>>> of the public api.
>>>
>>>
>>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>
>>>> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
>>>> future cruft?
>>>>
>>>> Suggestions on the new package and/or class name would be greatly
>>>> appreciated over "NewMiniAccumuloC*".
>>>>
>>>>
>>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
>>>>
>>>>> Those who are interested: check out
>>>>> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>>>>
>>>>>
>>>>> tl;dr I could create some real interfaces for the cluster and config,
>>>>> which are "hidden" under the covers by the 1.4 and 1.5
>>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>>>>> default implementation, gives us the ability to hide "implementation
>>>>> details" if wanted, and moves us towards some factory methods instead
>>>>> of calling a class directly.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>>>>
>>>>>> Yes, very much experimental at this point.
>>>>>>
>>>>>> What I'm most concerned about is having reasonable hooks up front,
>>>>>> not trying to make an implementation for inclusion in 1.6.0.
>>>>>>
>>>>>> Regarding additions, the implementation already contains most things
>>>>>> I would want to expose. I haven't come up with anything that would
>>>>>> be generally returned through the "API" rather than through this
>>>>>> proposed implementation (e.g. YARN connection information).
>>>>>>
>>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>>>>
>>>>>>> What you are trying to do sounds interesting. It also sounds
>>>>>>> experimental and in the early stages. Is there anything specific
>>>>>>> you think should be done for 1.6.0 w/ regards to MAC API?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <josh.elser@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <josh.elser@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>>>>
>>>>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>>>>> understand your viewpoint either.
>>>>>>>>>>
>>>>>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>>>>>> that it doesn't adequately exercise multi-node implementations.
>>>>>>>>>> You can run multiple tservers, but they are all on the same host,
>>>>>>>>>> which limits the validity of a "robust" test. This is my
>>>>>>>>>> immediate goal.
>>>>>>>>>>
>>>>>>>>>> Multi-node deployments are possible using something like Mesos or
>>>>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>>>>> Accumulo on Yarn, this was my goal.
>>>>>>>>>>
>>>>>>>>>> My goal is to be able to run all of our AbstractMacIT
>>>>>>>>>> implementations against "real" hardware without changing a single
>>>>>>>>>> line of test code (ok - maybe a line or two to do injection of the
>>>>>>>>>> MAC implementation). The point is, I believe there could be a huge
>>>>>>>>>> testing gain from being able to write tests which leverage yarn,
>>>>>>>>>> have the same programmatic configuration API from MAC, and provide
>>>>>>>>>> near "real" Accumulo semantics.
>>>>>>>>>
>>>>>>>>> Ok so you want MAC to be an interface so that you can provide a
>>>>>>>>> completely different implementation?
>>>>>>>>
>>>>>>>> Correct. Some things would serve well in a common abstract base
>>>>>>>> (e.g. numTservers, siteXml configuration), but all the nonsense
>>>>>>>> about creating directory structures and managing Processes is
>>>>>>>> implementation specific.
>>>>>>>>
>>>>>>>> Perhaps I could create a new interface that the current
>>>>>>>> implementation implements which still provides the same semantics
>>>>>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
>>>>>>>> that will probably be easier than me trying to write it out.
>>>>>>>>
>>>>>>>
>>>
>>
>>
>>
>
