accumulo-dev mailing list archives

From Eric Whyne <>
Subject Re: accumulo pull request: raccumulo Packaged: 2013-05-09 22:18:20 UTC; pgrim
Date Sun, 27 Oct 2013 22:14:54 GMT
1. I will take care of the ASF ICLAs from contributors this week. I don't
think that's a problem based on conversations I've had.

2. CCLA: I don't see any major hurdles to doing this. I'll set up some
discussions with other company leadership to move this forward. I'll be
advocating for this because I think it's a great idea.

3a. I don't see the contributed code being able to sustain a new
community. Although there is demand for the R interface this creates,
there aren't many other directions or advancements that could be made that
would be useful to undertake independent of the core Accumulo roadmap.

3b. After some discussion on our end, its inclusion in contrib, where the
pull request places it, is optimal for its current state. It will make
an R interface to Accumulo part of the core distribution. This could
potentially increase adoption of Accumulo across more of the data science
community. I know it would intrigue us from a corporate standpoint...
that's why it was written.

3c. As a future roadmap for the capability, I suspect that there's a better
way to do what this code accomplishes (sans proxy?) and that the
functionality will get rolled somewhere into the rest of the project in a
better way. That effort would probably A: use large chunks of this code (so
you'd want to include it to ensure licensing and avoid potential conflicts)
and B: make any contrib project obsolete, leading to its demise. If it's
an independent contrib project or incubating as an Apache project, a lot of
time would have been wasted. If it's part of the core distribution, it
equates to merely moving a directory from the contrib section of the code
base; the PMC can feel free to remove the parts that are obsolete, and much
confusion would be avoided.

4. We're willing to move forward with it in whatever way makes sense
to all involved. The core developer (Phil Grim) has an interest in
maintaining the capability since we have customers depending on it. I think
the best way would be to keep our GitHub fork updated and then just send
maintenance pull requests as modifications are made on our end. Of course,
we'd be looking for guidance on how to best license it and
establishing/signing whatever we have to to ensure compliance with ASF


On Sun, Oct 27, 2013 at 2:37 PM, Chris Mattmann <> wrote:

> Hi Josh, and Eric,
> Thanks. As for the IP clearances and stuff, yes, makes total sense. Eric
> and anyone that contributed to this patch need and should have ASF ICLAs
> on file.
> It's a simple process, download the ICLA here:
> And submit it for each individual contributor to
> Beyond that it may be nice in this case since it's got corporate stuff
> attached to it to see if Data Tactics is willing to sign an Apache
> Corporate Contributor License Agreement (CCLA):
> Not a requirement by any means, but a nice thing. So Eric and others at
> Data Tactics, that's something to consider.
> Once that's taken care of, the Accumulo PMC can take on the stewardship of
> the code if someone on that PMC like Josh is willing to work through any
> issues that that person (or others on the PMC) have with bringing it on
> board.
> Also since Apache is a meritocracy, Eric and others, you will be credited
> for the work that you contribute and if you keep doing so over time, and
> creating work for the Accumulo PMC members, your work will likely be
> recognized.
> The Incubator PMC is there as a clearinghouse for new projects and new
> communities to begin their journey towards Apache individual project
> status (TLP) and the Apache way. Eric: do you see this project as an
> entirely new community that is complementary to Accumulo? If so, the
> Incubator would be a good route to go. If you see this as part of core
> Accumulo (or even contrib) and you can convince someone on the PMC like
> Josh to help shepherd this code into their PMC, that's also another route.
> Cheers,
> Chris
> -----Original Message-----
> From: Josh Elser <>
> Reply-To: "" <>
> Date: Sunday, October 27, 2013 11:22 AM
> To: "" <>
> Subject: Re: accumulo pull request: raccumulo Packaged: 2013-05-09
> 22:18:20 UTC; pgrim
> >Thanks again, Eric (and Phil).
> >
> >It's awesome to see this amount of work put in to integrate with R. But,
> >personally, I don't think direct inclusion in Accumulo is the proper
> >place for it.
> >
> >It definitely cannot be directly merged as such: we would need to make
> >sure we have ICLAs from all individuals and a CCLA from Data-Tactics (if
> >memory serves). Essentially, we need to make sure the proper paperwork
> >exists that the ownership is assigned to the ASF (instead of individuals
> >or Data-Tactics as the notices alternate between currently). Also, the
> >ASF has a general process for handling imports of code. [1]
> >
> >It looks like it's missing any documentation on how to use it too, e.g.
> >the user needs to start an instance of the thrift proxy themselves, but
> >that's a little nit-picky on my end :)
> >
> >Given the chatter on ACCUMULO-1804, it seems like it's desired for this
> >to be its own contrib repo as a part of the ASF. The next step here
> >would be for us to contact the ASF incubator to figure out the IP rules
> >and shake out any licensing concerns.
> >
> >Let me know for sure and I can kick off a message to the incubator if
> >this is how you (and Data-Tactics) want to proceed. [2]
> >
> >- Josh
> >
> >[1]
> >[2]
> >
> >
> >On 10/25/13, 12:13 PM, ericwhyne wrote:
> >> GitHub user ericwhyne opened a pull request:
> >>
> >>
> >>
> >>      raccumulo Packaged: 2013-05-09 22:18:20 UTC; pgrim
> >>
> >>      This pull request is in response to this issue:
> >>
> >>
> >>      What this code is:
> >>      Need to be able to support users who utilize RStudio to conduct
> >>analysis of data residing in the Accumulo data space instead of moving
> >>data from one repository to a stand alone system to have the analytic
> >>run in memory. RStudio should be able to make calls directly to the data
> >>space and provide the output within the RStudio interface.
> >>
> >>
> >> You can merge this pull request into a Git repository by running:
> >>
> >>      $ git pull master
> >>
> >> Alternatively you can review and apply these changes as the patch at:
> >>
> >>
> >>
> >> ----
> >> commit 116c045d05074b0e0ccf907e42235f94aa7c1703
> >> Author: Eric Whyne <>
> >> Date:   2013-10-25T16:08:38Z
> >>
> >>      raccumulo Packaged: 2013-05-09 22:18:20 UTC; pgrim
> >>
> >> ----
> >>
