hadoop-common-dev mailing list archives

From Vinod Kumar Vavilapalli <vino...@apache.org>
Subject Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
Date Wed, 13 Dec 2017 20:37:46 GMT
I was waiting for Daniel to post the minutes from the YARN meetup before talking about this. Anyway,
in that discussion, we identified a bunch of key upgrade-related scenarios that no one seems
to have validated, at least going by who was represented at the YARN meetup. I'm going to create
a wiki page listing all these scenarios.

But back to the bug that Junping raised. At this point, we don't have a clear path towards
running 2.x applications on 3.0.0 clusters. So, our claim of rolling-upgrades already working
is not accurate.

One of the two options that Junping proposed should be pursued before we close the release.
I'm in favor of calling out that rolling-upgrade support is withdrawn or caveated, and pushing for
progress instead of blocking the release.

Thanks
+Vinod

> On Dec 12, 2017, at 5:44 PM, Junping Du <jdu@hortonworks.com> wrote:
> 
> Thanks Andrew for pushing the new RC for 3.0.0. I was out last week and only just got a chance
> to validate the new RC.
> 
> Basically, I found two critical issues in the same rolling-upgrade scenario where
> HADOOP-15059 was found previously:
> HDFS-12920: we changed the value format of some HDFS configurations, and the old-version MR
> client doesn't understand the new format when fetching these configurations. A quick workaround
> is to add the old values (without time units) to hdfs-site.xml to override the new defaults, but
> this generates many annoying warnings. I have already posted my suggested fix on the JIRA for
> further discussion.
> The other one is YARN-7646. After working around HDFS-12920, we hit the issue that the
> old-version MR AppMaster cannot communicate with the new version of the YARN RM. This could be
> related to resource profile changes on the YARN side, but the root cause is still under investigation.
> 
> The first issue may not be a blocker, given that we can work around it without a code
> change. I am not sure yet whether we can work around the second issue. If not, we may have to
> fix it, or compromise by withdrawing rolling-upgrade support or by not calling this a stable release.
> 
> 
> Thanks,
> 
> Junping
> 
> ________________________________________
> From: Robert Kanter <rkanter@cloudera.com>
> Sent: Tuesday, December 12, 2017 3:10 PM
> To: Arun Suresh
> Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron T. Myers; common-dev@hadoop.apache.org;
> hdfs-dev@hadoop.apache.org; yarn-dev@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
> Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
> 
> +1 (binding)
> 
> + Downloaded the binary release
> + Deployed on a 3 node cluster on CentOS 7.3
> + Ran some MR jobs, clicked around the UI, etc
> + Ran some CLI commands (yarn logs, etc)
> 
> Good job everyone on Hadoop 3!
> 
> 
> - Robert
> 
> On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh <asuresh@apache.org> wrote:
> 
>> +1 (binding)
>> 
>> - Verified signatures of the source tarball.
>> - built from source - using the docker build environment.
>> - set up a pseudo-distributed test cluster.
>> - ran basic HDFS commands
>> - ran some basic MR jobs
>> 
>> Cheers
>> -Arun
>> 
>> On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang <andrew.wang@cloudera.com>
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> As a reminder, this vote closes tomorrow at 12:31pm, so please give it a
>>> whack if you have time. There are already enough binding +1s to pass this
>>> vote, but it'd be great to get additional validation.
>>> 
>>> Thanks to everyone who's voted thus far!
>>> 
>>> Best,
>>> Andrew
>>> 
>>> 
>>> 
>>> On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu <lei@cloudera.com> wrote:
>>> 
>>>> +1 (binding)
>>>> 
>>>> * Verified src tarball and bin tarball, verified md5 of each.
>>>> * Build source with -Pdist,native
>>>> * Started a pseudo cluster
>>>> * Run ec -listPolicies / -getPolicy / -setPolicy on /  , and run hdfs
>>>> dfs put/get/cat on "/" with XOR-2-1 policy.
>>>> 
>>>> Thanks Andrew for this great effort!
>>>> 
>>>> Best,
>>>> 
>>>> 
>>>> On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang <andrew.wang@cloudera.com>
>>>> wrote:
>>>>> Hi Wei-Chiu,
>>>>> 
>>>>> The patchprocess directory is left over from the create-release process,
>>>>> and it looks empty to me. We should still file a create-release JIRA to fix
>>>>> this, but I think this is not a blocker. Would you agree?
>>>>> 
>>>>> Best,
>>>>> Andrew
>>>>> 
>>>>> On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <weichiu@cloudera.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi Andrew, thanks for the tremendous effort.
>>>>>> I found an empty "patchprocess" directory in the source tarball that is
>>>>>> not there if you clone from GitHub. Any chance you had some leftover
>>>>>> trash when you made the tarball?
>>>>>> Not wanting to nitpick, but you might want to double-check so we don't
>>>>>> ship anything private to you in public :)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Dec 12, 2017 at 7:48 AM, Ajay Kumar <ajay.kumar@hortonworks.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> +1 (non-binding)
>>>>>>> Thanks for driving this, Andrew Wang!!
>>>>>>> 
>>>>>>> - downloaded the src tarball and verified md5 checksum
>>>>>>> - built from source with jdk 1.8.0_111-b14
>>>>>>> - brought up a pseudo distributed cluster
>>>>>>> - did basic file system operations (mkdir, list, put, cat) and
>>>>>>> confirmed that everything was working
>>>>>>> - Run word count, pi and DFSIOTest
>>>>>>> - run hdfs and yarn, confirmed that the NN, RM web UI worked
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Ajay
>>>>>>> 
>>>>>>> On 12/11/17, 9:35 PM, "Xiao Chen" <xiao@cloudera.com> wrote:
>>>>>>> 
>>>>>>>    +1 (binding)
>>>>>>> 
>>>>>>>    - downloaded src tarball, verified md5
>>>>>>>    - built from source with jdk1.8.0_112
>>>>>>>    - started a pseudo cluster with hdfs and kms
>>>>>>>    - sanity checked encryption related operations working
>>>>>>>    - sanity checked webui and logs.
>>>>>>> 
>>>>>>>    -Xiao
>>>>>>> 
>>>>>>>    On Mon, Dec 11, 2017 at 6:10 PM, Aaron T. Myers <atm@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> +1 (binding)
>>>>>>>> 
>>>>>>>> - downloaded the src tarball and built the source (-Pdist -Pnative)
>>>>>>>> - verified the checksum
>>>>>>>> - brought up a secure pseudo distributed cluster
>>>>>>>> - did some basic file system operations (mkdir, list, put, cat) and
>>>>>>>> confirmed that everything was working
>>>>>>>> - confirmed that the web UI worked
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Aaron
>>>>>>>> 
>>>>>>>> On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <andrew.wang@cloudera.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Let me start, as always, by thanking the efforts of all the contributors
>>>>>>>>> who contributed to this release, especially those who jumped on the issues
>>>>>>>>> found in RC0.
>>>>>>>>> 
>>>>>>>>> I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
>>>>>>>>> fixed JIRAs since the previous 3.0.0-beta1 release.
>>>>>>>>> 
>>>>>>>>> You can find the artifacts here:
>>>>>>>>> 
>>>>>>>>> http://home.apache.org/~wang/3.0.0-RC1/
>>>>>>>>> 
>>>>>>>>> I've done the traditional testing of building from the source tarball and
>>>>>>>>> running a Pi job on a single node cluster. I also verified that the shaded
>>>>>>>>> jars are not empty.
>>>>>>>>> 
>>>>>>>>> Found one issue that create-release (probably due to the mvn deploy change)
>>>>>>>>> didn't sign the artifacts, but I fixed that by calling mvn one more time.
>>>>>>>>> Available here:
>>>>>>>>> 
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehadoop-1075/
>>>>>>>>> 
>>>>>>>>> This release will run the standard 5 days, closing on Dec 13th at 12:31pm
>>>>>>>>> Pacific. My +1 to start.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
>>>>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lei (Eddy) Xu
>>>> Software Engineer, Cloudera
>>>> 
>>> 
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
> 



