hadoop-yarn-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Plans of moving towards JDK7 in trunk
Date Sat, 21 Jun 2014 17:13:51 GMT
On 21 June 2014 08:01, Andrew Wang <andrew.wang@cloudera.com> wrote:

> Hi Steve, let me confirm that I understand your proposal correctly:
> - Release an intermediate Hadoop 3 a few months out, based on JDK7 and with
> bumped library versions
> - Release a Hadoop 4 mid next year, based on JDK8
> I question the utility of an intermediate Hadoop 3 like this. Assuming that
> it gets out in September (i.e. roughly when a 2.6 would land), we're
> looking at a valid lifespan of about 7 months before JDK7 is EOL in April.
> If this release also breaks compatibility by changing library versions,
> then it looks less and less appealing from a user perspective. I suspect it
> would end up seeing low adoption as everyone waits (at most) 7 months for
> the JDK8-based release to emerge.

I'm saying that we'd replace Hadoop 2.6 with a 3.x release that, along
with the 2.6 changes, ups the Java version and the JARs and dependencies
which we are frozen with in Hadoop 2.x.

This issue of dependencies may not be so visible in Hadoop's own codebase,
but when you write any downstream project, the majority of the XML
<clauses> in your POM file are about excluding stuff Hadoop pulls in. I've
been quietly trying to address this at HADOOP-9991, but we've reached the
limit of what can get in.

I'd be happy enough with the original "Stata Plan": a release of Hadoop 2.x
that says "Java 7 + new libs". But given we've committed to not doing that,
releasing a Hadoop 3 that says the same thing lets us get a Hadoop with a
modern set of underpinnings out in 2014.

> I'd be more okay with an intermediate release with no incompatible changes
> whatsoever besides bumping the JDK requirement to JDK7. However, it'd still
> be a weak release considering that branch-2 already runs fine on JDK7, and
> it looks somewhat bad publicly as we burn another major release number less
> than a year since 2.x going GA.

It'll be > 1 year from 2.x to 3.

And to be realistic, the move to Java 8+ across the entire Hadoop stack
will probably take a year too.

> This is why I'd like to keep my original proposal on the table: keep going
> with branch-2 in the near term, while working towards a JDK8-based Hadoop 3
> by April next year. It doesn't need to be a big bang release either. I'd be
> delighted if we could rolling upgrade from one to the other. I just didn't
> want to rule out the inclusion of some very compelling feature outright.
> Trust me though, I'd be the first person to ask about compatibility if such
> a feature does come up.
> I'll also posit that people will shy away from using JDK8 features while
> branch-2 remains in active use. There's definitely some new shiny there,
> but nothing compelling enough to me personally when weighed against the
> pain of harder branch-2 backports.

Branch-2 would be frozen and we'd tell everyone "move to Java 7+";
everything downstream gets updated binaries and a chance to move forwards.

There's another issue, which is one Alejandro highlighted:

---------- Forwarded message ----------
From: Alejandro Abdelnur <tucu@cloudera.com>
Date: 10 April 2014 10:30
Subject: Re: Plans of moving towards JDK7 in trunk
To: "common-dev@hadoop.apache.org" <common-dev@hadoop.apache.org>

A bit of a different angle.

As the bottom of the stack Hadoop has to be conservative in adopting
things, but it should not preclude consumers of Hadoop (downstream projects
and Hadoop application developers) to have additional requirements such as
a higher JDK API than JDK6.

Hadoop 2.x should stick to using JDK6 API
Hadoop 2.x should be tested with multiple runtimes: JDK6, JDK7 and
eventually JDK8
Downstream projects and Hadoop application developers are free to require
any JDK6+ version for development and runtime.

Hadoop 3.x should allow using JDK7 API, bumping the minimum runtime
requirement to JDK7 and be tested with JDK7 and JDK8 runtimes.

---------- End of forwarded message ----------

The minimum version of Java that Hadoop mandates is going to be the minimum
version of Java that the entire stack has to adopt, and the minimum version
of Java that has to be run in the datacentre.

I wonder how easy it will be for us all to go to the big Hadoop
sites and say "Java 8+ only", as well as to all those Hadoop projects that
want to run on Java 7 and say "upgrade time". I think we'll hit a lot of
inertia and, to be fair, it's due to Hadoop core's long-standing support
for Java 6. If Hadoop 2.x had always been Java 7+ it would be simpler, but
we all know the trauma of getting Hadoop 2.2 out the door and our lack of
enthusiasm for any major dependency updates apart from the protobuf one.
