Subject: Re: Post 1.5.3 and 1.6.3
From: Christopher
To: Accumulo Dev List
Date: Wed, 8 Jul 2015 18:44:12 -0400
In-Reply-To: <559D7D65.3050500@gmail.com>
References: <559ADF69.9010901@gmail.com> <559D7D65.3050500@gmail.com>

I completely agree with abstracting out more separable functionality where it makes sense. I think RFile makes sense. Maybe separating it out would help alleviate some problems I've seen with it being too tightly coupled with Hadoop config and Accumulo config.
And, maybe it'd help with its API... right now, working with RFiles is a nightmare. Even just basic read/write is confusing.

As for 2.0... I'm a bit scared of it myself. It keeps lagging in priority for me, and isn't going very far very quickly. I have tried to keep it rebased onto the latest master, but it's sometimes difficult to keep that up. One thing I have kept on top of, though, is dropping deprecated stuff in 2.0, so it's not burdened with old APIs and stuff we intend to remove. Everything else in that branch is either unfinished, or still a rough cut.

I would like to start maintaining a current 2.0 branch, in our shared git in ASF-land, which is kept up-to-date, drops deprecated stuff, and where I can start merging in each feature we complete in the API, as we go. My main concern with that is merge strategy... I don't want master tainted with merges from 1.7 -> 2.0 -> master. That'd be bad... and would cause huge problems for us. (1.7 -> master -> 2.0 is generally fine, though.) That's one reason I previously suggested ceasing use of master as the "current development branch", and explicitly making a "1.8" branch for 1.8 devel (because then, 1.7 -> 1.8 -> 2.0 would still make sense).

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Jul 8, 2015 at 3:43 PM, Josh Elser wrote:
> Some thoughts myself...
>
> John Vines brought up to me privately the topic of separating out the RFile
> code from core. This started making me think about making this clear for
> other components like FATE and RandomWalk. These all have some level of
> separation, but they often get other things dropped into the same bucket
> (e.g. ZK-retry code in FATE, Accumulo implementation classes in RandomWalk).
> Maybe there are more things we could do.
>
> It would be nice to start trying to pull out these frameworks/sub-projects
> into discrete packages. I think it would help us with testing and proper
> separation of logic.
> Long term, maybe other projects would see the value and
> consider using/adopting them and grow into their own separately-versioned
> artifacts. It would be nice to start these efforts now to eventually reap
> the benefits.
>
>
> 2.0 and the new client API is a little scary now that we get another tick
> closer to it. I know it's been Christopher's brain-child so far (which is
> fine -- not meant to be taken in a negative context), but, if we really do
> want to adopt it, we should make a concerted effort to start integrating and
> reviewing it. Given how far away this seems, 1.8 and 1.9 could happen (or
> client API just gets targeted for a 3.0 -- numbers are just numbers).
>
>
> We have some decent script improvements for 1.8 already in (PID files
> _finally_). Would be nice to clean up the rest of the scripts too (notably
> the stop scripts need some love).
>
>
> Some other back-burner thoughts: better client API metrics, more server-side
> tracing instrumentation, Adam's iterator-stack collapsing perf ticket, keep
> tabs on HDFS tracing impl, keep tabs on HTrace's GUI work, finish the
> Accumulo monitor rewrite (aka REST server + servlet3).
>
> - Josh
>
>
> Josh Elser wrote:
>>
>> Thanks to the efforts spearheaded by Christopher and verified by
>> everyone else, we now have 1.5.3 and 1.6.3 releases!
>>
>> To keep the ball rolling, what's next? High level questions that come to
>> mind...
>>
>> * When do we do 1.7.1 and/or 1.8.0?
>> * What bug-fixes do we have outstanding for 1.7.1?
>> * What other minor improvements do people want for 1.8.0?
>> * Where does 2.0.0 stand? Should we make a bigger effort to getting the
>> new client API stuff Christopher had started into Apache?
>>
>> Feel free to brainstorm here and/or on JIRA (tagging relevant issues to
>> the desired fixVersion)
>>
>> - Josh