Date: Thu, 15 Jan 2015 15:21:43 -0500
Subject: Re: Growing project involvement
From: David Medinets
To: accumulo-dev

I'd love to see in-depth examples of the cell-level security. How about
an example of using Accumulo for HIPAA? Is anyone using Accumulo for
genetics?

On Thu, Jan 15, 2015 at 3:13 PM, Josh Elser wrote:
> Another anonymous response:
>
>
> I had never looked at the accumulo front page until this morning. I think
> it does ok with "who are you?", but should do better at "*why* are you?". It
> indirectly mentions the security model and iterators, but I think it should
> make those front and center. And ingest performance is huge.
>
> I don't know how aggressive you want to get, but I think you really ought to
> directly compare to hbase and cassandra, on various dimensions.
>
> What market segments would you love accumulo to get into? (health care?
> ...). If I were a developer looking to spend my hobby time, the front page
> might lead me to check out the other projects, and maybe not come back (and
> a google of "hbase vs" lists a number of comparisons that did not even
> include accumulo).
>
> In general, I think getting more users would get more developers:
> - I think that points to the marketing side of things
> - NiFi is doing a stunningly good job with blog posts about low-pain setup
>   and examples, right out of the gate
>
> Iterators are terrifying to implement/deploy:
>
> - They are clearly a novel paradigm when reading the paper/docs, but
>   implementing and deploying a complex new iterator, or even an update to an
>   iterator that's been working for a long time, on a large cloud, always makes
>   me hold my breath until I'm about to pass out
> - Even after I've added every possible unit test I can think of, I still
>   assume that I will see a storm of crashing tservers when I push out to a
>   large cloud.
> - Some sort of systematic safety harness for vetting a new iterator or
>   combination of iterators would be great
> - I think it's mostly scary because we don't really have a small live
>   playground into which we can copy data and make mistakes. Maybe the
>   solution is to create the playground (with real, non-cherry-picked data),
>   and be able to make mistakes that don't cost days to undo. But that takes a
>   good deal of work, and tools could be written to support that.
>
>
> Some personal thoughts:
>
> Good points about being more assertive WRT marketing. I think it's fair to
> say that we get "walked" often because we're not aggressive enough in
> stating that Accumulo is a player.
>
> We should make an iterator fuzzing framework. We know what the system does
> that is unexpected and can likely codify that in a test environment. It
> would take a little bit of effort to implement well, but I do think it's
> feasible.
> Clone()'ing a table is one option if you have real data in a real
> environment -- that will at least prevent you from destroying existing data,
> but it doesn't protect you against tanking your Accumulo instance with some
> thread/memory leak :)
>
> Josh Elser wrote:
>>
>> I meant to send this out closer to the new year (to ride on the new year
>> resolution stereotype), but I slacked. Forgive me.
>>
>> As those paying attention should be aware, we have had very little
>> growth within the project over the past 6-9 months. We've had our normal
>> spattering of contributions, a few from some repeat people, but I don't
>> think we've grown as much as we could.
>>
>> I wanted to see if anyone has any suggestions on what we could try to do
>> better in the coming year to help more people get involved with the
>> project. I don't want this to turn into a "we do X wrong" discussion, so
>> please try to stay positive and include suggestion(s) for every problem
>> presented when possible.
>>
>> Also, everyone should feel welcome to participate in the discussion
>> here. If you fall into the "bucket" described, I'd love to hear from
>> you. If anyone doesn't want to publicly respond, please feel free to
>> email me privately and I'll anonymously post to the list on your behalf.
>>
>> Some ideas to start off discussion:
>>
>> * Help reduce barrier to entry for new developers
>>   - Ensure simple/easy-to-process instructions for getting and building
>>     code in common environments
>>   - Instructions on running tests and reporting issues
>>
>> * More high-level examples
>>   - Maybe we start too deep in distributed-systems land and we scare away
>>     devs who think they "don't know enough to help"
>>   - Recording "newbie" tickets and providing adequate information for
>>     anyone to come along and try to take it on
>>   - Encourage/help/promote "concrete" ideas/code in the project.
>>     Something that is more tangible for devs to wrap their head around
>>     (also can help with adoption from new users)
>>
>> * Better documentation and "marketing"
>>   - We do "ok" with the occasional blog post, and the user manual is
>>     usually thorough, but we can obviously do better.
>>   - Can we create more "literature" to encourage more users and devs to
>>     get involved, trying to lower the barrier to entry?
>>
>> Thanks all.