accumulo-dev mailing list archives

From: Josh Elser
Subject: Re: Growing project involvement
Date: Thu, 15 Jan 2015 20:13:57 GMT
Another anonymous response:

I had never looked at the Accumulo front page until this morning.  I 
think it does ok with "who are you?", but should do better at "*why* are 
you?". It indirectly mentions the security model and iterators, but I 
think it should make those front and center.  and ingest performance is 

I don't know how aggressive you want to get, but I think you really 
ought to directly compare to HBase and Cassandra, on various dimensions.

What market segments would you love Accumulo to get into? (health care? 
...). If I were a developer looking to spend my hobby time, the front 
page might lead me to check out the other projects, and maybe not come 
back (and a Google search for "hbase vs" lists a number of comparisons 
that did not even include Accumulo).

In general, I think getting more users would get more developers:
   - I think that points to the marketing side of things
   - NiFi is doing a stunningly good job with blog posts about low-pain 
setup and examples, right out of the gate

Iterators are terrifying to implement/deploy:

   - they are clearly a novel paradigm when reading the paper/docs, but 
implementing and deploying a complex new iterator, or even an update to 
an iterator that's been working for a long time, on a large cloud, 
always makes me hold my breath until I'm about to pass out
   - Even after I've added every possible unit test I can think of, I 
still assume that I will see a storm of crashing tservers when I push 
out to a large cloud.
   - Some sort of systematic safety harness for vetting a new iterator 
or combination of iterators would be great
   - I think it's mostly scary because we don't really have a small live 
playground into which we can copy data and make mistakes. Maybe the 
solution is to create that playground (with real, non-cherry-picked 
data), where mistakes don't cost days to undo. That would take a good 
deal of work, but tools could be written to support it.

Some personal thoughts:

Good points about being more assertive WRT marketing. I think it's fair 
to say that we get "walked" often because we're not aggressive enough in 
stating that Accumulo is a player.

We should make an iterator fuzzing framework. We know what the system 
does that is unexpected and can likely codify that in a test 
environment. It would take a little bit of effort to implement well, but 
I do think it's feasible. Clone()'ing a table is one option if you have 
real data in a real environment -- that will at least prevent you from 
destroying existing data, but it doesn't protect you against tanking 
your Accumulo instance with some thread/memory leak :)
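
One way to picture such a harness, as a rough sketch only: the Python model below is not Accumulo code and uses none of its API. It assumes that one of the "unexpected" behaviors worth codifying is the iterator stack being torn down mid-scan and rebuilt, then re-seeked to just past the last key returned. SortedSource, PassThrough, EvenValueFilter, EveryOtherEntry, scan, and fuzz are hypothetical names invented for illustration. The harness compares an uninterrupted scan against scans with forced teardowns and reports any difference.

```python
import random
from typing import Callable, List, Optional, Tuple

Entry = Tuple[str, int]


class SortedSource:
    """In-memory stand-in for the sorted key/value stream under an iterator."""

    def __init__(self, entries: List[Entry]):
        self._entries = sorted(entries)
        self._pos = 0

    def seek(self, after: Optional[str]) -> None:
        # Position at the first key strictly greater than `after` (None = start).
        self._pos = 0
        while after is not None and self.has_top() and self.top()[0] <= after:
            self._pos += 1

    def has_top(self) -> bool:
        return self._pos < len(self._entries)

    def top(self) -> Entry:
        return self._entries[self._pos]

    def next(self) -> None:
        self._pos += 1


class PassThrough:
    """Hypothetical base iterator: forwards every call to its source."""

    def __init__(self, source: SortedSource):
        self._src = source

    def seek(self, after: Optional[str]) -> None:
        self._src.seek(after)

    def has_top(self) -> bool:
        return self._src.has_top()

    def top(self) -> Entry:
        return self._src.top()

    def next(self) -> None:
        self._src.next()


class EvenValueFilter(PassThrough):
    """Stateless filter keeping even values; should survive any re-seek."""

    def _skip_odd(self) -> None:
        while self.has_top() and self.top()[1] % 2 != 0:
            self._src.next()

    def seek(self, after: Optional[str]) -> None:
        super().seek(after)
        self._skip_odd()

    def next(self) -> None:
        super().next()
        self._skip_odd()


class EveryOtherEntry(PassThrough):
    """Deliberately fragile: keeps entries by position, which a re-seek shifts."""

    def next(self) -> None:
        super().next()
        if self.has_top():
            super().next()


def scan(make_iter: Callable[[SortedSource], PassThrough],
         entries: List[Entry],
         teardown: Callable[[], bool]) -> List[Entry]:
    it = make_iter(SortedSource(entries))
    it.seek(None)
    out: List[Entry] = []
    while it.has_top():
        out.append(it.top())
        if teardown():
            # Assumed behavior: discard the stack, rebuild it, resume past the last key.
            it = make_iter(SortedSource(entries))
            it.seek(out[-1][0])
        else:
            it.next()
    return out


def fuzz(make_iter, entries: List[Entry], trials: int = 50, seed: int = 0) -> List[str]:
    expected = scan(make_iter, entries, lambda: False)
    rng = random.Random(seed)
    # Trial 0 tears down after every entry (worst case); the rest do so at random.
    policies = [lambda: True] + [lambda: rng.random() < 0.3 for _ in range(trials)]
    problems = []
    for n, policy in enumerate(policies):
        got = scan(make_iter, entries, policy)
        keys = [k for k, _ in got]
        if keys != sorted(set(keys)):
            problems.append(f"trial {n}: keys out of order or duplicated")
        if got != expected:
            problems.append(f"trial {n}: output differs from uninterrupted scan")
    return problems


if __name__ == "__main__":
    data = [(f"row{i}", i) for i in range(10)]
    for cls in (EvenValueFilter, EveryOtherEntry):
        print(cls.__name__, "problems:", len(fuzz(cls, data)))
```

Under this model the stateless filter should come back clean, while the position-counting iterator gets flagged on the first trial because a re-seek shifts which entries it keeps. A real framework would have to drive actual iterator classes and, as noted above, would still not catch thread or memory leaks.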

Josh Elser wrote:
> I meant to send this out closer to the new year (to ride on the new year
> resolution stereotype), but I slacked. Forgive me.
> As those paying attention should be aware, we have had very little
> growth within the project over the past 6-9 months. We've had our normal
> spattering of contributions, a few from some repeat people, but I don't
> think we've grown as much as we could.
> I wanted to see if anyone has any suggestions on what we could try to do
> better in the coming year to help more people get involved with the
> project. I don't want this to turn into a "we do X wrong" discussion, so
> please try to stay positive and include suggestion(s) for every problem
> presented when possible.
> Also, everyone should feel welcome to participate in the discussion
> here. If you fall into the "bucket" described, I'd love to hear from
> you. If anyone doesn't want to publicly respond, please feel free to
> email me privately and I'll anonymously post to the list on your behalf.
> Some ideas to start off discussion:
> * Help reduce barrier to entry for new developers
> - Ensure simple/easy-to-process instructions for getting and building
> code in common environments
> - Instructions on running tests and reporting issues
> * More high-level examples
> - Maybe we start too deep in distributed-systems land and we scare away
> devs who think they "don't know enough to help"
> - Recording "newbie" tickets and providing adequate information for
> anyone to come along and try to take it on
> - Encourage/help/promote "concrete" ideas/code in the project. Something
> that is more tangible for devs to wrap their head around (also can help
> with adoption from new users)
> * Better documentation and "marketing"
> - We do "ok" with the occasional blog post, and the user manual is
> usually thorough, but we can obviously do better.
> - Can we create more "literature" to encourage more users and devs to
> get involved, trying to lower the barrier to entry?
> Thanks all.
