accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: 1.7 release timeline
Date Wed, 08 Oct 2014 05:13:44 GMT
Forgot one:

*Drop Hadoop 1 support*
   - We would no longer care about maintaining Hadoop 1 APIs (get rid of 
crappy reflection)
   - 2.2.0 (Hadoop 2 "stable") came out just under 1 year ago
   - Can be done for 1.7 or reconsidered for 2.0

Josh Elser wrote:
> Some more information on the subject. A few of us got together to
> co-work today and had an informal discussion on our individual interests
> for 1.7. Summary incoming:
>
> *Monitor re-write*
> - I was pushing this one; I think the monitor still has merit despite
> the desire of others to just integrate with external systems
> - I have some code in place, but it still needs more work.
> - Is a unified/stable "metrics" API necessary for integration w/
> external tools? (or is JMX enough?)
> - An API would probably be a more usable interface than JMX
> - Such an API should be stateless (no log aggregation nor statistics
> over time)
> - Monitor still has uses for standalone/small deployments
> - If still being used, MVC approach would ease testing and addition of
> new data and views
> - Not necessary to hold up 1.7.0 from happening
>
> *Revisit performance*
> - Eric mentioned that he wants to spend some time running some Accumulo
> benchmarks, specifically YCSB.
> - Lots of related topics were mentioned that might be relevant
> * Other HDFS block cache implementations (HBase has lots of nice
> benchmarks, could learn from them)
> * A WIP patch for metadata updates has some promise (ACCUMULO-2889)
> * Collapse iterator stack (ACCUMULO-3079)
> * Possible improvements to Scanner for single-batch cases (reduce a few
> RPCs to one RPC)
> - The actual changes to make will likely be found via investigation
> - Changing default conf values where relevant was also mentioned
>
> *Distributed Tracing*
> - Billie has been spending some time working w/ some people on replacing
> Cloudtrace with HTrace
> - Mentioned that HTrace shares a remarkable amount of similarity with
> our existing tracing library
> - Upstream efforts in Hadoop-3 to integrate HTrace into DN/NN calls
> - Some consideration given to replacing traceserver with Zipkin; however,
> this is not required for the first implementation
>
> *Decouple MiniAccumuloCluster from ITs*
> - Another one I've started working on
> - ITs are really great; we have a lot of them for really good cases
> - Running them against a real instance is infeasible right now
> - Would be good to express as many as possible in terms of only using
> Instance+Connector
> - Christopher mentioned possible benefit outside of tests to using the
> accumulo-maven-plugin as the "shim" between a real instance and a
> MiniAccumuloCluster
> - Some tests are written explicitly for MAC and must be ignored or run
> against a MAC when a real instance is available.
>
> *Upgrade test script*
> - Keith mentioned there's some code from John McNamee that might help
> testing upgrade paths
>
> *Hadoop Metrics2*
> - Metrics2 is the current library in use by Hadoop
> - Integration gives us a lot more flexibility, notably good integration
> with Ganglia provided (ACCUMULO-1817)
> - No one expressed interest in working on this directly (potential to slip)
>
> *Deprecate MockAccumulo?*
> - Talked about this for 1.6, decided against
> - It's now 1.7. Is it time?
> - Remember, deprecate != removal
>
> There are some outstanding things we need to investigate more:
> - Is improved JMX or metrics2 impl sufficient for integration with
> external monitoring tools? (considerations: nagios, ganglia, statsd,
> collectd, carbon, riemann... others?)
> - BatchWriter has some weird cases around error handling. It is intended
> to survive failures, but that's very much not the case. It should
> probably be fixed around a major release, but we need to figure out how
> exactly to fix it (needs someone to get behind it)
>
> If people want to continue discussion on these, let's break off
> individual topics into their own thread for clarity (and my sanity).
>
> Also, anyone have a desire to be "release manager"?
>
> - Josh
>
> Josh Elser wrote:
>> Thanks, John.
>>
>> I was thinking about trying to gun for January time-frame for a release.
>> I'd love to say before 2014 is over, but that probably just won't happen
>> for a major release with the holidays.
>>
>> For 1.7 right now, I see the following "bigger" items (correct me where
>> I'm wrong):
>>
>> * Replication (done)
>> * Upgrade rules/guarantees (proposed)
>> * Replace cloudtrace (in-progress)
>> * Rewrite monitor, include REST service (in-progress)
>> * Drop Hadoop 1 support (proposed)
>> * Decouple MiniAccumulo from ITs (in-progress)
>> * Other minicluster types: in-process, shim to real instance
>> (in-progress)
>> * Support Hadoop metrics2 (proposed)
>> * A few WAL/metadata related performance improvements (in-progress)
>>
>> Also, would be good to check the In-Progress state issues on JIRA. What
>> do people think?
>>
>> John Vines wrote:
>>> Moving this to its own thread...
>>>
>>> On Mon, Oct 6, 2014 at 5:54 PM, Mike Drob <madrob@cloudera.com> wrote:
>>>
>>>> Related: Do we have a release timeline for 1.7?
>>>>
