accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: [DISCUSS] MiniAccumuloCluster goals and approach
Date Fri, 28 Mar 2014 16:24:26 GMT
Oh, I like that idea, Bill & Sean.

Package: org.apache.accumulo.cluster
Public API: org.apache.accumulo.cluster.AccumuloCluster
MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements 
AccumuloCluster, allows for backwards compat)
Yarn: org.apache.accumulo.cluster.yarn
Docker: ...
Mesos: ...

etc etc etc.
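
A minimal sketch of how that layout might look in code. Everything here is hypothetical: the method names, the AbstractAccumuloCluster base class, the YarnAccumuloCluster stand-in, and the ClusterTestHarness helper are assumptions for illustration, not the actual Accumulo API, and the implementations only record what a real one would do rather than starting any processes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical public API: method names are illustrative only.
interface AccumuloCluster {
    void start();
    void stop();
    boolean isRunning();
    int getNumTservers();
}

// Shared state (e.g. numTservers) sits in a common abstract base;
// how processes are launched is left to each implementation.
abstract class AbstractAccumuloCluster implements AccumuloCluster {
    private final int numTservers;
    private boolean running;
    protected final List<String> actions = new ArrayList<>();

    protected AbstractAccumuloCluster(int numTservers) {
        if (numTservers < 1) {
            throw new IllegalArgumentException("numTservers must be >= 1");
        }
        this.numTservers = numTservers;
    }

    @Override public int getNumTservers() { return numTservers; }
    @Override public boolean isRunning() { return running; }

    @Override public final void start() { launch(); running = true; }
    @Override public final void stop() { shutdown(); running = false; }

    protected abstract void launch();
    protected abstract void shutdown();
}

// Stand-in for the local pseudo-distributed implementation.
class MiniAccumuloCluster extends AbstractAccumuloCluster {
    MiniAccumuloCluster(int numTservers) { super(numTservers); }

    @Override protected void launch() {
        actions.add("create local directory structure");
        actions.add("fork " + getNumTservers() + " local tserver processes");
    }

    @Override protected void shutdown() {
        actions.add("stop local processes");
    }
}

// Stand-in for a multi-node implementation (name assumed from the thread).
class YarnAccumuloCluster extends AbstractAccumuloCluster {
    YarnAccumuloCluster(int numTservers) { super(numTservers); }

    @Override protected void launch() {
        actions.add("request " + getNumTservers() + " containers");
    }

    @Override protected void shutdown() {
        actions.add("release containers");
    }
}

// Test code depends only on the interface, so the implementation
// can be swapped without touching the test body.
class ClusterTestHarness {
    static boolean runAgainst(AccumuloCluster cluster) {
        cluster.start();
        try {
            return cluster.isRunning() && cluster.getNumTservers() > 0;
        } finally {
            cluster.stop();
        }
    }
}
```

With something shaped like this, the same test body could be handed either a MiniAccumuloCluster or a YarnAccumuloCluster, which is the "injection" being discussed further down the thread.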

One question in my mind: do we keep the maven module
'accumulo-minicluster'? I would imagine that if we struck the 'mini'
portion in 1.6, that would create some confusion. Would it be worth the
indirection to rename accumulo-minicluster to accumulo-cluster and then
create a new accumulo-minicluster module that depends on
accumulo-cluster (but contains no code itself), so that 1.4 and 1.5 poms
generally keep working with just a version bump? I'm not sure if
Maven would be happy with that or do what I think it "should".

On 3/28/14, 6:26 AM, Bill Havanki wrote:
> I've been watching the conversation on the side, but I wanted to mention
> that it seems the focus isn't so much on "mini" clusters anymore. You're
> thinking of programmatic cluster management, whether one node or many. The
> idea of a basic cluster management interface, with MAC as an
> implementation, is promising. A package name of just "cluster" could work.
>
> Carry on :)
>
> Bill H
>
>
> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <busbey+lists@cloudera.com> wrote:
>
>> If you decide to go the mapred/mapreduce way, you could go with the package
>> name "mini".
>>
>> alternatively, we can do a multi-stage change out
>>
>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>>
>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
>> TestAccumuloCluster, @deprecate TestAccumuloCluster
>>
>> 3) 1.6 + 2 major: remove TestAccumuloCluster
>>
>> Or just go with TestAccumuloCluster as the interface, have
>> MiniAccumuloCluster as the local pseudo distributed implementation, and
>> then call your new one something like YarnAccumuloCluster.
>>
>> In that case we could use the deprecation cycle to move the MAC class out
>> of the public api.
>>
>>
>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>>> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
>>> future cruft?
>>>
>>> Suggestions on the new package and/or class name would be greatly
>>> appreciated over "NewMiniAccumuloC*".
>>>
>>>
>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
>>>
>>>> Those who are interested: check out
>>>> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>>>
>>>>
>>>> tl;dr I could create some real interfaces for the cluster and config,
>>>> which are "hidden" under the covers by the 1.4 and 1.5
>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>>>> default implementation, gives us the ability to hide "implementation
>>>> details" if wanted, and moves us towards some factory methods instead of
>>>> calling a class directly.
>>>>
>>>> Thoughts?
>>>>
>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>>>
>>>>> Yes, very much experimental at this point.
>>>>>
>>>>> What I'm most concerned about is having reasonable hooks up front, not
>>>>> trying to make an implementation for inclusion in 1.6.0.
>>>>>
>>>>> Regarding additions, the implementation already contains most things I
>>>>> would want to expose. I haven't come up with anything that would be
>>>>> generally returned through the "API" rather than through this proposed
>>>>> implementation (e.g. YARN connection information).
>>>>>
>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>>>
>>>>>> What you are trying to do sounds interesting.  It also sounds experimental
>>>>>> and in the early stages.   Is there anything specific you think should be
>>>>>> done for 1.6.0 w/ regards to MAC API?
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>>>
>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>>>
>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>>>
>>>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>>>> understand your viewpoint either.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>>>>> that it doesn't adequately exercise multi-node implementations. You
>>>>>>>>> can run multiple tservers, but they are all on the same host, which
>>>>>>>>> limits the validity of a "robust" test. This is my immediate goal.
>>>>>>>>>
>>>>>>>>> Multi-node deployments are possible using something like Mesos or
>>>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>>>> Accumulo on Yarn, that was my target.
>>>>>>>>>
>>>>>>>>> My goal is to be able to run all of our AbstractMacIT
>>>>>>>>> implementations against "real" hardware without changing a single
>>>>>>>>> line of test code (ok - maybe a line or two to do injection of the
>>>>>>>>> MAC implementation). The point is, I believe there could be a huge
>>>>>>>>> testing gain from being able to write tests which leverage yarn,
>>>>>>>>> have the same programmatic configuration API from MAC, and provide
>>>>>>>>> near "real" Accumulo semantics.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Ok, so you want MAC to be an interface so that you can provide a
>>>>>>>> completely different implementation?
>>>>>>>>
>>>>>>>>
>>>>>>> Correct. Some things would serve well in a common abstract base (e.g.
>>>>>>> numTservers, siteXml configuration), but all the nonsense about
>>>>>>> creating directory structures and managing Processes is
>>>>>>> implementation specific.
>>>>>>>
>>>>>>> Perhaps I could create a new interface that the current implementation
>>>>>>> implements which still provides the same semantics from 1.4 and 1.5.
>>>>>>> Let me see if I can mock up what I'm thinking -- that will probably be
>>>>>>> easier than me trying to write it out.
>>>>>>>
>>>>>>>
>>>>>>
>>
>
>
>
