flink-user mailing list archives

From: Stephan Ewen <se...@apache.org>
Subject: Re: Usage of Hadoop 2.2.0
Date: Fri, 04 Sep 2015 07:33:50 GMT
I am good with that as well. Mind that we are not only dropping a binary
distribution for Hadoop 2.2.0, but also the source compatibility with 2.2.0.



Let's also reconfigure Travis to test:

 - Hadoop 1
 - Hadoop 2.3
 - Hadoop 2.4
 - Hadoop 2.6
 - Hadoop 2.7


On Fri, Sep 4, 2015 at 6:19 AM, Chiwan Park <chiwanpark@apache.org> wrote:

> +1 for dropping Hadoop 2.2.0
>
> Regards,
> Chiwan Park
>
> > On Sep 4, 2015, at 5:58 AM, Ufuk Celebi <uce@apache.org> wrote:
> >
> > +1 to what Robert said.
> >
> > On Thursday, September 3, 2015, Robert Metzger <rmetzger@apache.org> wrote:
> > I think most cloud providers moved beyond Hadoop 2.2.0.
> > Google's Click-To-Deploy is on 2.4.1
> > AWS EMR is on 2.6.0
> >
> > The situation for the distributions seems to be the following:
> > MapR 4 uses Hadoop 2.4.0 (current is MapR 5)
> > CDH 5.0 uses 2.3.0 (the current CDH release is 5.4)
> >
> > HDP 2.0 (October 2013) uses 2.2.0
> > HDP 2.1 (April 2014) uses 2.4.0 already
> >
> > So both vendors and cloud providers are multiple releases away from
> > Hadoop 2.2.0.
> >
> > Spark does not offer a binary distribution lower than 2.3.0.
> >
> > In addition to that, I don't think that the HDFS client in 2.2.0 is
> > really usable in production environments. Users were reporting
> > ArrayIndexOutOfBounds exceptions for some jobs, and I also ran into
> > these exceptions sometimes.
> >
> > The easiest approach to resolve this issue would be (a) dropping the
> > support for Hadoop 2.2.0.
> > An alternative approach (b) would be:
> >  - ship a binary version for Hadoop 2.3.0
> >  - make the source of Flink still compatible with 2.2.0, so that users
> >    can compile a Hadoop 2.2.0 version if needed.
> >
> > I would vote for approach (a).
> >
> >
> > On Tue, Sep 1, 2015 at 5:01 PM, Till Rohrmann <trohrmann@apache.org> wrote:
> > While working on high availability (HA) for Flink's YARN execution I
> > stumbled across some limitations with Hadoop 2.2.0. From version 2.2.0 to
> > 2.3.0, Hadoop introduced new functionality which is required for an
> > efficient HA implementation. Therefore, I was wondering whether there is
> > actually a need to support Hadoop 2.2.0. Is Hadoop 2.2.0 still actively
> > used by someone?
> >
> > Cheers,
> > Till