accumulo-dev mailing list archives

From Joe Stein <joe.st...@stealth.ly>
Subject Re: Growing project involvement
Date Tue, 13 Jan 2015 22:49:23 GMT
I have heard a lot of feedback in the marketplace on Accumulo. This feedback
came entirely from folks who didn't have Accumulo as a requirement to run,
and I feel that it is very relevant to broader adoption. All of the comments
below are a combination of my own opinions and what I have heard from
others in the market in discussions about Accumulo.

1) Iterators are awesome from a software architecture perspective. From a
development perspective, if you have worked with them you have an experience
or two to share on how to improve them. Anything that can be done to
improve this experience for developers will be welcomed by new and
existing users.

2) Lots of little cosmetic surface things in lots of places need attention
to detail. e.g. at https://github.com/apache/accumulo the branch shown is
not the latest, and even the latest branch (master?) README isn't really
welcoming or appealing from a "my first time visiting the project"
perspective. New users only get one first impression, and this is
important under the "technical marketing umbrella".  Some Vagrant and/or
Docker support will make getting up and running quickly fantastic for folks
that have to (or want to) interact with Accumulo.

3) The project should/could have more out-of-the-box integrations, supported
as part of the core project release cycles. e.g. an Accumulo Framework for
Apache Mesos. I don't think the drive for this (Mesos support) is lacking,
but having spoken to other Accumulo users, there is no clear path for how
folks can help to make this happen. The ecosystem just isn't big enough for
these types of projects to exist successfully outside the core project on
some github url.

4) Some ecosystem page or place where "all things accumulo" can be sought
out... planet accumulo, something like that (no reason to reinvent this
wheel).  This is probably a combined issue of a lack of things to aggregate
(which we should try to improve) and the ability to have them seen in one
place.  One of the coolest things I have seen Accumulo release since I
started following the project has been
https://blogs.apache.org/accumulo/entry/scaling_accumulo_with_multi_volume
but I haven't seen anything else since that posting. Is it that the
information isn't bubbling up or that people aren't posting more about the
cool things already in place? Are people even using it?

5) Not; just; Java; please; => how about more Scala (maybe Iterator
examples) and/or Go with some ProtoBuf interface? From an implementation
perspective Java; just; kills; things; in; their; tracks; ! and Thrift has
a way to do that too...

6) Operations is almost an opaque box. Getting something up and running for
development is important, but so is pushing it into production and
sustaining it at scale. More information about how this is done and
where things work and do not work will be a *HUGE* driver for the
community (IMHO). Again, maybe all this stuff is out there and #4 is really
how to solve this, so folks do not spend their nights and weekends googling.

7) Apache Spark support. While arguably this goes under #3, I think it has
to be called out as another (better?) alternative to MapReduce. It is really
easy to get Spark to use AccumuloInputFormat, which is wonderful and a
fantastic opportunity for making Accumulo shine with Spark. A few samples
people can run with Spark and Accumulo together that do something more than
word count will go a long way to attracting an audience too.

8) More ways to highlight the workloads that Accumulo was built for, what
it does now, and how it is not about websites, social, or ads. This is
important to organizations in verticals that care differently about their
data.

9) Call out features better and highlight them explicitly with more
examples. I might be repeating myself at this point but wanted to bring
up "Tracing" as another good example of a REALLY cool feature that folks,
when they see it, don't entirely understand what to do with or how. Whether
you Google for "accumulo trace" or go through the documentation, it is
impossible to figure out how to use it and make it work without late nights
and tender loving care.

None of these things are easy, and they are very demanding for open source
projects and communities. I think this is a great discussion and hope to
continue to contribute moving forward.

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Tue, Jan 13, 2015 at 4:37 PM, Keith Turner <keith@deenlo.com> wrote:

> I think a minimal getting started guide is needed on the web site.
> Something that will take a user from download to running on a cluster in as
> few steps as possible.  This info is buried in the README, but there is too
> much other stuff in the README.
>
> On Tue, Jan 13, 2015 at 4:09 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
> > I meant to send this out closer to the new year (to ride on the new year
> > resolution stereotype), but I slacked. Forgive me.
> >
> > As those paying attention should be aware, we have had very little
> > growth within the project over the past 6-9 months. We've had our normal
> > spattering of contributions, a few from some repeat people, but I don't
> > think we've grown as much as we could.
> >
> > I wanted to see if anyone has any suggestions on what we could try to do
> > better in the coming year to help more people get involved with the
> > project. I don't want this to turn into a "we do X wrong" discussion, so
> > please try to stay positive and include suggestion(s) for every problem
> > presented when possible.
> >
> > Also, everyone should feel welcome to participate in the discussion here.
> > If you fall into the "bucket" described, I'd love to hear from you. If
> > anyone doesn't want to publicly respond, please feel free to email me
> > privately and I'll anonymously post to the list on your behalf.
> >
> > Some ideas to start off discussion:
> >
> > * Help reduce barrier to entry for new developers
> >   - Ensure simple/easy-to-process instructions for getting and building
> > code in common environments
> >   - Instructions on running tests and reporting issues
> >
> > * More high-level examples
> >   - Maybe we start too deep in distributed-systems land and we scare away
> > devs who think they "don't know enough to help"
> >   - Recording "newbie" tickets and providing adequate information for
> > anyone to come along and try to take it on
> >   - Encourage/help/promote "concrete" ideas/code in the project. Something
> > that is more tangible for devs to wrap their head around (also can help
> > with adoption from new users)
> >
> > * Better documentation and "marketing"
> >   - We do "ok" with the occasional blog post, and the user manual is
> > usually thorough, but we can obviously do better.
> >   - Can we create more "literature" to encourage more users and devs to
> > get involved, trying to lower the barrier to entry?
> >
> > Thanks all.
> >
>
