hadoop-mapreduce-dev mailing list archives

From Akira AJISAKA <ajisa...@oss.nttdata.co.jp>
Subject Re: Moving to JDK7, JDK8 and new major releases
Date Wed, 25 Jun 2014 22:44:03 GMT
+1 (non-binding) for 2.5 being the last release that supports JDK6.

 >>> My higher-level goal though is to avoid going through this same pain
 >>> again when JDK7 goes EOL. I'd like to do a JDK8-based release
 >>> before then for this reason. This is why I suggested skipping an
 >>> intermediate 2.x+JDK7 release and leapfrogging to 3.0+JDK8.

I think skipping an intermediate release and leapfrogging to 3.0 would
make branch-2 difficult to maintain. It has only been about half a year
since 2.2 GA, so we should keep maintaining branch-2 and publish bug-fix
releases over the long term, even after 3.0+JDK8 is released.


(2014/06/24 17:56), Steve Loughran wrote:
> +1, though I think 2.5 may be premature if we want to send a warning note
> "last ever". That's an issue for followon "when in branch 2".
> Guava and protobuf.jar are two things we have to leave alone, with the
> first being unfortunate, but their attitude to updates is pretty dramatic.
> The latter? We all know how traumatic that can be.
> -Steve
> On 24 June 2014 16:44, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>> After reading this thread and thinking a bit about it, I think it should be
>> OK to move up to JDK7 in Hadoop 2 for the following reasons:
>> * Existing Hadoop 2 releases and related projects are running
>>    on JDK7 in production.
>> * Commercial vendors of Hadoop have already done a lot of
>>    work to ensure Hadoop on JDK7 works while keeping Hadoop
>>    on JDK6 working.
>> * Unlike many of the 3rd party libraries used by Hadoop,
>>    the JDK is much stricter on backwards compatibility.
>> IMPORTANT: I take this as an exception and not as a carte blanche for 3rd
>> party dependencies and for moving from JDK7 to JDK8 (though it could be OK
>> for the latter if we end up in the same state of affairs)
>> Even for Hadoop 2.5, I think we could do the move:
>> * Create the Hadoop 2.5 release branch.
>> * Have one nightly Jenkins job that builds the Hadoop 2.5 branch
>>    with JDK6 to ensure no JDK7 language/API feature creeps
>>    into Hadoop 2.5. Keep this for all Hadoop 2.5.x releases.
>> * Sanity tests for the Hadoop 2.5.x releases should be done
>>    with JDK7.
>> * Apply Steve’s patch to require JDK7 on trunk and branch-2.
>> * Move all Apache Jenkins jobs to build/test using JDK7.
>> * Starting from Hadoop 2.6 we support JDK7 language/API
>>    features.
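
For illustration only: the patch that requires JDK7 is not shown in this thread, and the real requirement would most likely be enforced at build time. The sketch below is a hypothetical runtime check, not that patch. It shows one way to turn a Java version string into a major version and compare it against a minimum. The class name `JavaVersionCheck` and its methods are made up for this example; `java.specification.version` is a standard system property.

```java
// Hypothetical helper, not part of Hadoop: checks a Java version string
// against a required minimum major version.
final class JavaVersionCheck {

    // Maps legacy strings such as "1.6.0_45" to 6 and "1.7" to 7,
    // and newer strings such as "11.0.2" to 11.
    static int majorVersion(String version) {
        String[] parts = version.split("[._-]");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the major version was the second component ("1.x").
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // True when the given version string is at least the required major version.
    static boolean meetsMinimum(String version, int minimumMajor) {
        return majorVersion(version) >= minimumMajor;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        System.out.println("Running on Java " + running
                + ", JDK7 or later: " + meetsMinimum(running, 7));
    }
}
```

A check like this only reports the JVM in use. It does not replace the nightly JDK6 build described above, which is what actually catches JDK7 language or API usage.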
>> Effectively, we are ensuring that Hadoop 2.5.x builds and tests with
>> JDK6 & JDK7 and that all testing towards the release
>> is done with JDK7.
>> Users can proactively upgrade to JDK7 before upgrading to Hadoop 2.5.x, or
>> if they upgrade to Hadoop 2.5.x and run into any issue because of JDK6
>> (which would be quite unlikely) they can reactively upgrade to JDK7.
>> Thoughts?
>> On Tue, Jun 24, 2014 at 4:22 PM, Andrew Wang <andrew.wang@cloudera.com>
>> wrote:
>>> Hi all,
>>> On dependencies, we've bumped library versions when we think it's safe and
>>> the APIs in the new version are compatible. Or, it's not leaked to the app
>>> classpath (e.g. the JUnit version bump). I think the JIRAs Arun mentioned
>>> fall into one of those categories. Steve can do a better job explaining
>>> this than me, but we haven't bumped things like Jetty or Guava because they
>>> are on the classpath and are not compatible. There is this line in the
>>> compat guidelines:
>>>     - Existing MapReduce, YARN & HDFS applications and frameworks should
>>>     work unmodified within a major release i.e. Apache Hadoop ABI is
>>>     supported.
>>> Since Hadoop apps can and do depend on the Hadoop classpath, the classpath
>>> is effectively part of our API. I'm sure there are user apps out there that
>>> will break if we make incompatible changes to the classpath. I haven't read
>>> up on the MR JIRA Arun mentioned, but MR isn't the only YARN app out
>>> there.
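
As a small illustration of why the classpath behaves like part of the API: when an application and Hadoop both bring a library, the version that wins depends on classpath order. The sketch below is a generic diagnostic and not something from this thread. It uses the standard `ProtectionDomain` and `CodeSource` APIs to report where a class was loaded from. The class name `WhichJar` is hypothetical.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, not part of Hadoop: reports which jar or
// directory a class was actually loaded from.
final class WhichJar {

    // Returns the code source location of the class as a string, or a
    // placeholder when the JVM does not expose one (for example for
    // classes loaded by the bootstrap class loader).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Replace these with a class from the library under suspicion.
        System.out.println("WhichJar loaded from: " + locationOf(WhichJar.class));
        System.out.println("String loaded from: " + locationOf(String.class));
    }
}
```

Running this inside an application with a class from a shared dependency shows whether the application's copy or the copy on the Hadoop classpath was picked up.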
>>> Sticking to the theme of "work unmodified", let's think about the user
>>> effort required to upgrade their JDK. This can be a very expensive task. It
>>> might need approval up and down the org, meaning lots of certification,
>>> testing, and signoff. Considering the amount of user effort involved here,
>>> it really seems like dropping a JDK is something that should only happen in
>>> a major release. Else, there's the potential for nasty surprises in a
>>> supposedly "minor" release.
>>> That said, we are in an unhappy place right now regarding JDK6, and it's
>>> true that almost everyone's moved off of JDK6 at this point. So, I'd be
>>> okay with an intermediate 2.x release that drops JDK6 support (but no
>>> incompatible changes to the classpath like Guava). This is basically free,
>>> and we could start using JDK7 idioms like multi-catch and new NIO stuff in
>>> Hadoop code (a minor draw I guess).
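
For readers unfamiliar with the JDK7 idioms mentioned above, here is a minimal sketch of multi-catch, try-with-resources, and the `java.nio.file` API (NIO.2). It is not taken from Hadoop code; the class name `Jdk7Idioms` and both helper methods are made up for illustration. None of it compiles under JDK6.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class, not part of Hadoop.
final class Jdk7Idioms {

    // Multi-catch (JDK7): one handler covers two unrelated exception types.
    // A null input triggers NullPointerException, a malformed one
    // NumberFormatException; both fall back to the supplied default.
    static int parsePort(String text, int fallback) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException | NullPointerException e) {
            return fallback;
        }
    }

    // NIO.2 (JDK7) with try-with-resources: the reader is closed
    // automatically, even if readLine() throws.
    static String firstLine(Path file) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("jdk7-idioms", ".txt");
        try {
            Files.write(tmp, "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(firstLine(tmp));
            System.out.println(parsePort("8020", -1));
            System.out.println(parsePort("not-a-port", -1));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```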
>>> My higher-level goal though is to avoid going through this same pain again
>>> when JDK7 goes EOL. I'd like to do a JDK8-based release before then for
>>> this reason. This is why I suggested skipping an intermediate 2.x+JDK7
>>> release and leapfrogging to 3.0+JDK8. 10 months is really not that far in
>>> the future, and it seems like a better place to focus our efforts. I was
>>> also hoping it'd be realistic to fix our classpath leakage by then, since
>>> then we'd have a nice, tight, future-proofed new major release.
>>> Thanks,
>>> Andrew
>>> On Tue, Jun 24, 2014 at 11:43 AM, Arun C Murthy <acm@hortonworks.com>
>>> wrote:
>>>> Andrew,
>>>>   Thanks for starting this thread. I'll edit the wiki to provide more
>>>> context around rolling-upgrades etc. which, as I pointed out in the
>>>> original thread, are key IMHO.
>>>> On Jun 24, 2014, at 11:17 AM, Andrew Wang <andrew.wang@cloudera.com>
>>>> wrote:
>>>>> https://wiki.apache.org/hadoop/MovingToJdk7and8
>>>>> I think based on our current compatibility guidelines, Proposal A is the
>>>>> most attractive. We're pretty hamstrung by the requirement to keep the
>>>>> classpath the same, which would be solved by either OSGI or shading our
>>>>> deps (but that's a different discussion).
>>>> I don't see that anywhere in our current compatibility guidelines.
>>>> As you can see from
>>>> http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html
>>>> we do not have such a policy (pasted here for convenience):
>>>> Java Classpath
>>>> User applications built against Hadoop might add all Hadoop jars
>>>> (including Hadoop's library dependencies) to the application's classpath.
>>>> Adding new dependencies or updating the version of existing dependencies
>>>> may interfere with those in applications' classpaths.
>>>> Policy
>>>> Currently, there is NO policy on when Hadoop's dependencies can change.
>>>> Furthermore, we have *already* changed our classpath in hadoop-2.x. Again,
>>>> as I pointed out in the previous thread, here is the precedent:
>>>> On Jun 21, 2014, at 5:59 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>>>>> Also, this is something we already have done i.e. we updated some of our
>>>>> software deps in hadoop-2.4 v/s hadoop-2.2 - clearly not something as
>>>>> dramatic as JDK. Here are some examples:
>>>>> https://issues.apache.org/jira/browse/HADOOP-9991
>>>>> https://issues.apache.org/jira/browse/HADOOP-10102
>>>>> https://issues.apache.org/jira/browse/HADOOP-10103
>>>>> https://issues.apache.org/jira/browse/HADOOP-10104
>>>>> https://issues.apache.org/jira/browse/HADOOP-10503
>>>> thanks,
>>>> Arun
>> --
>> Alejandro
