flink-user mailing list archives

From "Matthias J. Sax" <mj...@apache.org>
Subject Re: Usage of Hadoop 2.2.0
Date Fri, 04 Sep 2015 11:01:48 GMT
+1 for dropping

On 09/04/2015 11:04 AM, Maximilian Michels wrote:
> +1 for dropping Hadoop 2.2.0 binary and source-compatibility. The
> release is hardly used and complicates the important high-availability
> changes in Flink.
> 
> On Fri, Sep 4, 2015 at 9:33 AM, Stephan Ewen <sewen@apache.org> wrote:
>> I am good with that as well. Mind that we are not only dropping a binary
>> distribution for Hadoop 2.2.0, but also the source compatibility with 2.2.0.
>>
>>
>>
>> Let's also reconfigure Travis to test
>>
>>  - Hadoop1
>>  - Hadoop 2.3
>>  - Hadoop 2.4
>>  - Hadoop 2.6
>>  - Hadoop 2.7
>>
>>
>> On Fri, Sep 4, 2015 at 6:19 AM, Chiwan Park <chiwanpark@apache.org> wrote:
>>>
>>> +1 for dropping Hadoop 2.2.0
>>>
>>> Regards,
>>> Chiwan Park
>>>
>>>> On Sep 4, 2015, at 5:58 AM, Ufuk Celebi <uce@apache.org> wrote:
>>>>
>>>> +1 to what Robert said.
>>>>
>>>> On Thursday, September 3, 2015, Robert Metzger <rmetzger@apache.org>
>>>> wrote:
>>>> I think most cloud providers moved beyond Hadoop 2.2.0.
>>>> Google's Click-To-Deploy is on 2.4.1
>>>> AWS EMR is on 2.6.0
>>>>
>>>> The situation for the distributions seems to be the following:
>>>> MapR 4 uses Hadoop 2.4.0 (current is MapR 5)
>>>> CDH 5.0 uses 2.3.0 (the current CDH release is 5.4)
>>>>
>>>> HDP 2.0 (October 2013) is using 2.2.0
>>>> HDP 2.1 (April 2014) uses 2.4.0 already
>>>>
>>>> So both vendors and cloud providers are multiple releases away from
>>>> Hadoop 2.2.0.
>>>>
>>>> Spark does not offer a binary distribution lower than 2.3.0.
>>>>
>>>> In addition to that, I don't think that the HDFS client in 2.2.0 is
>>>> really usable in production environments. Users were reporting
>>>> ArrayIndexOutOfBounds exceptions for some jobs, and I also hit these
>>>> exceptions sometimes.
>>>>
>>>> The easiest approach to resolve this issue would be (a) dropping the
>>>> support for Hadoop 2.2.0.
>>>> An alternative approach (b) would be:
>>>>  - ship a binary version for Hadoop 2.3.0
>>>>  - make the source of Flink still compatible with 2.2.0, so that users
>>>> can compile a Hadoop 2.2.0 version if needed.
>>>>
>>>> I would vote for approach (a).
>>>>
>>>>
>>>> On Tue, Sep 1, 2015 at 5:01 PM, Till Rohrmann <trohrmann@apache.org>
>>>> wrote:
>>>> While working on high availability (HA) for Flink's YARN execution I
>>>> stumbled across some limitations with Hadoop 2.2.0. From version 2.2.0 to
>>>> 2.3.0, Hadoop introduced new functionality which is required for an
>>>> efficient HA implementation. Therefore, I was wondering whether there is
>>>> actually a need to support Hadoop 2.2.0. Is Hadoop 2.2.0 still actively used
>>>> by someone?
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>
>>>
>>>
>>>
>>>
>>

