accumulo-dev mailing list archives

From David Medinets <>
Subject Re: Growing project involvement
Date Thu, 15 Jan 2015 20:21:43 GMT
I'd love to see in-depth examples of the cell-level security. How
about an example of using Accumulo for HIPAA? Is anyone using Accumulo
for Genetics?

On Thu, Jan 15, 2015 at 3:13 PM, Josh Elser <> wrote:
> Another anonymous response:
> <quote>
> I had never looked at the accumulo front page until this morning.  I think
> it does ok with "who are you?", but should do better at "*why* are you?". It
> indirectly mentions the security model and iterators, but I think it should
> make those front and center. And ingest performance is huge.
> I don't know how aggressive you want to get, but I think you really ought to
> directly compare to hbase and cassandra, on various dimensions.
> What market segments would you love accumulo to get into? (health care?
> ...). If I were a developer looking to spend my hobby time, the front page
> might lead me to check out the other projects, and maybe not come back (and
> a google of "hbase vs" lists a number of comparisons that did not even
> include accumulo).
> In general, I think getting more users would get more developers:
>   - I think that points to the marketing side of things
>   - NiFi is doing a stunningly good job with blog posts about low-pain setup
> and examples, right out of the gate
> Iterators are terrifying to implement/deploy:
>   - they are clearly a novel paradigm when reading the paper/docs, but
> implementing and deploying a complex new iterator, or even an update to an
> iterator that's been working for a long time, on a large cloud, always makes
> me hold my breath until i'm about to pass out
>   - Even after i've added every possible unit test I can think of, I still
> assume that I will see a storm of crashing tservers when I push out to a
> large cloud.
>   - Some sort of systematic safety harness for vetting a new iterator or
> combination of iterators would be great
>   - I think it's mostly scary because we don't really have a small live
> playground into which we can copy data and make mistakes. Maybe the
> solution is to create the playground (with real, non-cherry-picked data),
> and be able to make mistakes that don't cost days to undo, but that takes a
> good deal of work, and tools could be written to support that.
> </quote>
> Some personal thoughts:
> Good points about being more assertive WRT marketing. I think it's fair to
> say that we get "walked" often because we're not aggressive enough in
> stating that Accumulo is a player.
> We should make an iterator fuzzing framework. We know what the system does
> that is unexpected and can likely codify that in a test environment. It
> would take a little bit of effort to implement well, but I do think it's
> feasible. Clone()'ing a table is one option if you have real data in a real
> environment -- that will at least prevent you from destroying existing data,
> but it doesn't protect you against tanking your Accumulo instance with some
> thread/memory leak :)
> Josh Elser wrote:
>> I meant to send this out closer to the new year (to ride on the new year
>> resolution stereotype), but I slacked. Forgive me.
>> As those paying attention should be aware, we have had very little
>> growth within the project over the past 6-9 months. We've had our normal
>> smattering of contributions, a few from some repeat people, but I don't
>> think we've grown as much as we could.
>> I wanted to see if anyone has any suggestions on what we could try to do
>> better in the coming year to help more people get involved with the
>> project. I don't want this to turn into a "we do X wrong" discussion, so
>> please try to stay positive and include suggestion(s) for every problem
>> presented when possible.
>> Also, everyone should feel welcome to participate in the discussion
>> here. If you fall into the "bucket" described, I'd love to hear from
>> you. If anyone doesn't want to publicly respond, please feel free to
>> email me privately and I'll anonymously post to the list on your behalf.
>> Some ideas to start off discussion:
>> * Help reduce barrier to entry for new developers
>> - Ensure simple/easy-to-process instructions for getting and building
>> code in common environments
>> - Instructions on running tests and reporting issues
>> * More high-level examples
>> - Maybe we start too deep in distributed-systems land and we scare away
>> devs who think they "don't know enough to help"
>> - Recording "newbie" tickets and providing adequate information for
>> anyone to come along and try to take it on
>> - Encourage/help/promote "concrete" ideas/code in the project. Something
>> that is more tangible for devs to wrap their head around (also can help
>> with adoption from new users)
>> * Better documentation and "marketing"
>> - We do "ok" with the occasional blog post, and the user manual is
>> usually thorough, but we can obviously do better.
>> - Can we create more "literature" to encourage more users and devs to
>> get involved, trying to lower the barrier to entry?
>> Thanks all.
