accumulo-dev mailing list archives

From Josh Elser <>
Subject Re: Growing project involvement
Date Wed, 14 Jan 2015 00:47:52 GMT
Let's please try to stay on topic guys... this is really important.

David Medinets wrote:
> explains D4M in detail and
> provides a Java-based ingest example.
> On Tue, Jan 13, 2015 at 6:40 PM, Kepner, Jeremy - 0553 - MITLL
> <>  wrote:
>> Hi Joe,
>>   Thanks for the feedback.
>> This is really great!
>> Did your folks look at
>> It might address a couple of the issues.
>> Regards.  -Jeremy
>> On Jan 13, 2015, at 5:49 PM, Joe Stein<>  wrote:
>>> I have had a lot of feedback in the marketplace on Accumulo. This feedback
>>> was 100% from folks that didn't have Accumulo as a requirement to run, and I
>>> feel that it is very relevant to broader adoption. All of the below
>>> comments are a combination of my own opinions and what I have heard from
>>> others in the market in discussion about Accumulo.
>>> 1) Iterators are awesome from a software architecture perspective. From a
>>> development perspective if you have worked with them you have an experience
>>> or two to share on how to improve them. Anything that can be done to
>>> improve this experience for developers will be welcomed for new and
>>> existing users.
>>> 2) Lots of little cosmetic surface things in lots of places and attention
>>> to detail, e.g. the branch is not the
>>> latest and even the latest branch (master?) README isn't really welcoming
>>> or appealing from a "my first time visiting the project" perspective. For
>>> new users you only get one chance at a first impression; this is
>>> important under the "technical marketing umbrella".  Some Vagrant and/or
>>> Docker support would make getting up and running quick and painless for
>>> folks that have to (or want to) interact with Accumulo.
>>> 3) The project should/could have more out of the box integrations and
>>> support from the core project release cycles. e.g. Accumulo Framework for
>>> Apache Mesos. I don't think the drive for this (Mesos support) is lacking,
>>> but having spoken to other Accumulo users, there is no clear path for how
>>> folks can help to make this happen. The ecosystem just isn't big enough for
>>> these types of projects to exist successfully outside the core project on
>>> some github url.
>>> 4) Some ecosystem page or place where "all things accumulo" can be sought
>>> after... planet accumulo, something like that (no reason to reinvent this
>>> wheel).  This is probably a combined issue of lack of aggregatable things
>>> (which we should try to improve) and the ability to have them seen in one
>>> place.  One of the coolest things I have seen Accumulo release since
>>> following the project has been
>>> but haven't seen anything else since this posting. Is it that the
>>> information isn't bubbling up or that people aren't posting more about cool
>>> things in place? Are people even using it?
>>> 5) Not; just; Java; please; =>  how about more Scala (maybe Iterator
>>> examples) and/or Go with some ProtoBuf interface? From an implementation
>>> perspective Java; just; kills; things; in; their; tracks; ! and Thrift has
>>> a way to do that too...
>>> 6) Operations is almost an opaque box. Getting something up and running for
>>> development is important but so is pushing it into production and
>>> sustaining it at scale. The more information about how this is done and
>>> where things work and do not work will be a  *HUGE* driver for the
>>> community (IMHO). Again, maybe all this stuff is out there and #4 is really
>>> how to solve this for folks to not spend their nights and weekends googling.
>>> 7) Apache Spark support. While arguably this goes under #3 I think it has
>>> to be called out as another (better?) alternative to MapReduce. It is really
>>> easy to get Spark to use AccumuloInputFormat which is wonderful and a
>>> fantastic opportunity for making Accumulo shine with Spark. A few samples
>>> people can run with Spark and Accumulo together that do something more than
>>> word count will go a long way to attracting an audience too.
>>> 8) More ways to highlight the workloads that Accumulo was built for, what
>>> it does now, and how it is not about websites, social, or ads. This is
>>> important to organizations in verticals that care differently about their
>>> data.
>>> 9) Better call out features and highlight them with more examples
>>> explicitly. I might be repeating myself at this point but wanted to bring
>>> up "Tracing" as another good example of a REALLY cool feature that folks,
>>> when they see it, don't entirely understand what to do with or how. Whether
>>> you Google for "accumulo trace" or go through the documentation, it is
>>> impossible to figure out how to use it and make it work without late nights
>>> and tender loving care.
>>> None of these things are easy, and all are very demanding for open source
>>> projects and communities. I think this is a great discussion and hope to
>>> continue to contribute moving forward.
>>> /*******************************************
>>> Joe Stein
>>> Founder, Principal Consultant
>>> Big Data Open Source Security LLC
>>> Twitter: @allthingshadoop<>
>>> ********************************************/
>>> On Tue, Jan 13, 2015 at 4:37 PM, Keith Turner<>  wrote:
>>>> I think a minimal getting started guide is needed on the web site.
>>>> Something that will take a user from download to running on a cluster in
>>>> as few steps as possible.  This info is buried in the README, but there is too
>>>> much other stuff in the readme.
>>>> On Tue, Jan 13, 2015 at 4:09 PM, Josh Elser<>  wrote:
>>>>> I meant to send this out closer to the new year (to ride on the new year
>>>>> resolution stereotype), but I slacked. Forgive me.
>>>>> As those paying attention should be aware, we have had very little
>>>>> growth within the project over the past 6-9 months. We've had our normal
>>>>> smattering of contributions, a few from some repeat people, but I don't
>>>>> think we've grown as much as we could.
>>>>> I wanted to see if anyone has any suggestions on what we could try to
>>>>> do better in the coming year to help more people get involved with the
>>>>> project. I don't want this to turn into a "we do X wrong" discussion;
>>>>> please try to stay positive and include suggestion(s) for every problem
>>>>> presented when possible.
>>>>> Also, everyone should feel welcome to participate in the discussion here.
>>>>> If you fall into the "bucket" described, I'd love to hear from you. If
>>>>> anyone doesn't want to publicly respond, please feel free to email me
>>>>> privately and I'll anonymously post to the list on your behalf.
>>>>> Some ideas to start off discussion:
>>>>> * Help reduce barrier to entry for new developers
>>>>> - Ensure simple/easy-to-process instructions for getting and building
>>>>> code in common environments
>>>>> - Instructions on running tests and reporting issues
>>>>> * More high-level examples
>>>>> - Maybe we start too deep in distributed-systems land and we scare away
>>>>> devs who think they "don't know enough to help"
>>>>> - Recording "newbie" tickets and providing adequate information for
>>>>> anyone to come along and try to take it on
>>>>> - Encourage/help/promote "concrete" ideas/code in the project. Something
>>>>> that is more tangible for devs to wrap their head around (also can help
>>>>> with adoption from new users)
>>>>> * Better documentation and "marketing"
>>>>> - We do "ok" with the occasional blog post, and the user manual is
>>>>> usually thorough, but we can obviously do better.
>>>>> - Can we create more "literature" to encourage more users and devs to
>>>>> get involved, trying to lower the barrier to entry?
>>>>> Thanks all.
