hadoop-yarn-dev mailing list archives

From "Aaron T. Myers" <...@cloudera.com>
Subject Re: Re-swizzle 2.3
Date Tue, 11 Feb 2014 00:50:21 GMT
Just committed a fix for HDFS-5921 to branch-2.3.

Fire away.

--
Aaron T. Myers
Software Engineer, Cloudera


On Mon, Feb 10, 2014 at 1:34 PM, Aaron T. Myers <atm@cloudera.com> wrote:

> OK. I think I should be able to get it in by 6pm PT, thanks to a quick +1
> from Andrew, but certainly don't let it hold up the train if for some
> reason it takes longer than that.
>
> --
> Aaron T. Myers
> Software Engineer, Cloudera
>
>
> On Mon, Feb 10, 2014 at 12:04 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>
>> Looks like we are down to 0 blockers; I'll create rc0 tonight.
>>
>> ATM - Your call, you have until 6pm tonight to get this in.
>>
>> thanks,
>> Arun
>>
>> On Feb 10, 2014, at 11:44 AM, "Aaron T. Myers" <atm@cloudera.com> wrote:
>>
>> > I just filed an issue for the fact that browsing the FS from the NN is
>> > broken if you have a directory with the sticky bit set:
>> >
>> > https://issues.apache.org/jira/browse/HDFS-5921
>> >
>> > I didn't set this to be targeted for 2.3 because it doesn't seem like a
>> > _blocker_ to me, but if we're not going to get 2.3 out today anyway, I'd
>> > like to put this in. It's a small fix, and since many people have the
>> > sticky bit set on /tmp, they won't be able to browse any of the FS
>> > hierarchy from the NN without this fix.
>> >
>> > --
>> > Aaron T. Myers
>> > Software Engineer, Cloudera
>> >
>> >
>> > On Fri, Feb 7, 2014 at 12:45 PM, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
>> >
>> >> Here's what I've done:
>> >> - Reverted YARN-1493, YARN-1490, YARN-1041, YARN-1166, YARN-1566,
>> >>   YARN-1689, and YARN-1661 from branch-2.3.
>> >> - Updated YARN's CHANGES.txt in trunk, branch-2 and branch-2.3.
>> >> - Updated these JIRAs to have 2.4 as the fix-version.
>> >> - Compiled branch-2.3.
>> >>
>> >> Let me know if you run into any issues caused by this revert.
>> >>
>> >> Thanks,
>> >> +Vinod
>> >>
>> >>
>> >> On Fri, Feb 7, 2014 at 11:41 AM, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
>> >>
>> >>> Haven't heard back from Jian. Reverting the set from branch-2.3 (only).
>> >>> Tx for the offline list.
>> >>>
>> >>> +Vinod
>> >>>
>> >>>
>> >>> On Fri, Feb 7, 2014 at 9:08 AM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>> >>>
>> >>>> Vinod, I have the patches to revert most of the JIRAs (the first batch);
>> >>>> I'll send them to you offline.
>> >>>>
>> >>>> Thanks.
>> >>>>
>> >>>>
>> >>>> On Thu, Feb 6, 2014 at 8:56 PM, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
>> >>>>
>> >>>>>
>> >>>>> Thanks. Please post your findings. Jian wrote this part of the code,
>> >>>>> and between him and me, we can take care of those issues.
>> >>>>>
>> >>>>> +1 for going ahead with the revert on branch-2.3. I'll go do that
>> >>>>> tomorrow morning unless I hear otherwise from Jian.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> +Vinod
>> >>>>>
>> >>>>>
>> >>>>> On Feb 6, 2014, at 8:28 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>> >>>>>
>> >>>>>> Hi Vinod,
>> >>>>>>
>> >>>>>> Nothing confidential,
>> >>>>>>
>> >>>>>> * With unmanaged AMs I'm seeing the trace I posted a couple of days
>> >>>>>> ago in YARN-1577 (
>> >>>>>> https://issues.apache.org/jira/browse/YARN-1577?focusedCommentId=13891853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13891853
>> >>>>>> ).
>> >>>>>>
>> >>>>>> * Also, Robert has been digging into Oozie testcases failing/getting
>> >>>>>> stuck with several token renewer threads. These failures happened
>> >>>>>> consistently at different places around the same testcases (like some
>> >>>>>> file descriptors leaking out); reverting YARN-1490 fixes the problem.
>> >>>>>> The potential issue with this is that a long-running client (Oozie) may
>> >>>>>> run into this situation and become unstable.
>> >>>>>>
>> >>>>>> *Robert,* mind posting to YARN-1490 the JVM thread dump at the time of
>> >>>>>> the test hanging?
>> >>>>>>
>> >>>>>> After YARN-1493 & YARN-1490 we have a couple of JIRAs trying to fix
>> >>>>>> issues introduced by them, and we still didn't get them right.
>> >>>>>>
>> >>>>>> Because of this, the improvements driven by YARN-1493 & YARN-1490 seem
>> >>>>>> to require more work before being stable.
>> >>>>>>
>> >>>>>> IMO, being conservative, we should do 2.3 without them and roll them
>> >>>>>> into 2.4. If we want to do regular releases we will have to make these
>> >>>>>> kinds of calls, else we will start dragging the releases.
>> >>>>>>
>> >>>>>> Sounds like a plan?
>> >>>>>>
>> >>>>>> Thanks.
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Thu, Feb 6, 2014 at 6:27 PM, Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote:
>> >>>>>>
>> >>>>>>> Hey
>> >>>>>>>
>> >>>>>>> I am not against removing them from 2.3 if that is helpful for
>> >>>>>>> progress. But I want to understand what the issues are before we make
>> >>>>>>> that decision.
>> >>>>>>>
>> >>>>>>> There is the issue with unmanaged AMs that is clearly known, and I was
>> >>>>>>> thinking of coming to it the past two days, but couldn't. What is this
>> >>>>>>> new issue that we (confidently?) pinned down to YARN-1490?
>> >>>>>>>
>> >>>>>>> Thanks
>> >>>>>>> +Vinod
>> >>>>>>>
>> >>>>>>> On Feb 6, 2014, at 5:07 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>> >>>>>>>
>> >>>>>>>> Thanks Robert,
>> >>>>>>>>
>> >>>>>>>> All,
>> >>>>>>>>
>> >>>>>>>> So it seems that YARN-1493 and YARN-1490 are introducing serious
>> >>>>>>>> regressions.
>> >>>>>>>>
>> >>>>>>>> I would propose to revert them and the follow-up JIRAs from the 2.3
>> >>>>>>>> branch and keep working on them on trunk/branch-2 until they are stable
>> >>>>>>>> (I would even prefer reverting them from branch-2 so as not to block a
>> >>>>>>>> 2.4 if they are not ready in time).
>> >>>>>>>>
>> >>>>>>>> As I've mentioned before, the list of JIRAs to revert was:
>> >>>>>>>>
>> >>>>>>>> YARN-1493
>> >>>>>>>> YARN-1490
>> >>>>>>>> YARN-1166
>> >>>>>>>> YARN-1041
>> >>>>>>>> YARN-1566
>> >>>>>>>>
>> >>>>>>>> Plus 2 additional JIRAs committed since my email on this issue 2 days
>> >>>>>>>> ago:
>> >>>>>>>>
>> >>>>>>>> *YARN-1661
>> >>>>>>>> *YARN-1689 (not sure if this JIRA is related in functionality to the
>> >>>>>>>> previous ones but it is creating conflicts).
>> >>>>>>>>
>> >>>>>>>> I think we should hold off on continuing work on top of something that
>> >>>>>>>> is broken until the broken stuff is fixed.
>> >>>>>>>>
>> >>>>>>>> Quoting Arun, "Committers - Henceforth, please use extreme caution while
>> >>>>>>>> committing to branch-2.3. Please commit *only* blockers to 2.3."
>> >>>>>>>>
>> >>>>>>>> YARN-1661 & YARN-1689 are not blockers.
>> >>>>>>>>
>> >>>>>>>> Unless there are objections, I'll revert all these JIRAs from branch-2.3
>> >>>>>>>> tomorrow around noon and I'll update fixedVersion in the JIRAs.
>> >>>>>>>>
>> >>>>>>>> I'm inclined to revert them from branch-2 as well.
>> >>>>>>>>
>> >>>>>>>> Thoughts?
>> >>>>>>>>
>> >>>>>>>> Thanks.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, Feb 6, 2014 at 3:54 PM, Robert Kanter <rkanter@cloudera.com> wrote:
>> >>>>>>>>
>> >>>>>>>>> I think we should revert YARN-1490 from the Hadoop 2.3 branch.  I think
>> >>>>>>>>> it was causing some strange behavior in the Oozie unit tests:
>> >>>>>>>>>
>> >>>>>>>>> Basically, we use a single MiniMRCluster and MiniDFSCluster across all
>> >>>>>>>>> unit tests in a module.  With YARN-1490 we saw that, regardless of test
>> >>>>>>>>> order, the last few tests would time out waiting for an MR job to
>> >>>>>>>>> finish; on slower machines, the entire test suite would time out.
>> >>>>>>>>> Through some digging, I found that we were getting a ton of "Connection
>> >>>>>>>>> refused" Exceptions on LeaseRenewer talking to the NN and a few on the
>> >>>>>>>>> AM talking to the RM.
>> >>>>>>>>>
>> >>>>>>>>> After a bunch of investigation, I found that the problem went away once
>> >>>>>>>>> YARN-1490 was removed, though I couldn't figure out the exact problem.
>> >>>>>>>>> Even though this occurred in unit tests, it does make me concerned that
>> >>>>>>>>> it could indicate some bigger issue in a long-running real cluster
>> >>>>>>>>> (where everything isn't running on the same machine) that we haven't
>> >>>>>>>>> seen yet.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Feb 6, 2014 at 3:06 PM, Karthik Kambatla <kasha@cloudera.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> I have marked MAPREDUCE-5744 a blocker for 2.3. Committing it shortly.
>> >>>>>>>>>> Will pull it out of branch-2.3 if anyone objects.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On Thu, Feb 6, 2014 at 2:04 PM, Arpit Agarwal <aagarwal@hortonworks.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Merged HADOOP-10273 to branch-2.3 as r1565456.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Wed, Feb 5, 2014 at 4:49 PM, Arpit Agarwal <aagarwal@hortonworks.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> IMO HADOOP-10273 (Fix 'mvn site') should be included in 2.3.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I will merge it to branch-2.3 tomorrow PST if no one disagrees.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On Tue, Feb 4, 2014 at 5:03 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> IMO YARN-1577 is a blocker; it is breaking unmanaged AMs in very odd
>> >>>>>>>>>>>>> ways (to the point it seems non-deterministic).
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I'd say either YARN-1577 is fixed or we revert
>> >>>>>>>>>>>>> YARN-1493/YARN-1490/YARN-1166/YARN-1041/YARN-1566 (almost clean
>> >>>>>>>>>>>>> reverts) from the Hadoop 2.3 branch before doing the release.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> I've verified that after reverting those JIRAs things work fine with
>> >>>>>>>>>>>>> unmanaged AMs.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Thanks.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, Feb 4, 2014 at 11:45 AM, Arun C Murthy <acm@hortonworks.com> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I punted YARN-1444 to 2.4 since it's a long-standing issue.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Jian is away and I don't see YARN-1577 & YARN-1206 making much progress
>> >>>>>>>>>>>>>> till he is back; so I'm inclined to push both to 2.4 too. Any
>> >>>>>>>>>>>>>> objections?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Looks like Daryn has both HADOOP-10301 & HDFS-4564 covered.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Overall, I'll try to get this out in the next couple of days if we can
>> >>>>>>>>>>>>>> clear the list.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> thanks,
>> >>>>>>>>>>>>>> Arun
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Feb 3, 2014, at 12:14 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> An update. Per https://s.apache.org/hadoop-2.3.0-blockers we are now
>> >>>>>>>>>>>>>>> down to 5 blockers: 1 Common, 1 HDFS, 3 YARN.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Daryn (thanks!) has both the non-YARN ones covered. Vinod is helping out
>> >>>>>>>>>>>>>>> with the YARN ones.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> thanks,
>> >>>>>>>>>>>>>>> Arun
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>> Arun C. Murthy
>> >>>>>>>>>>>>>> Hortonworks Inc.
>> >>>>>>>>>>>>>> http://hortonworks.com/
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>> CONFIDENTIALITY NOTICE
>> >>>>>>>>>>>>>> NOTICE: This message is intended for the use of the individual or
>> >>>>>>>>>>>>>> entity to which it is addressed and may contain information that is
>> >>>>>>>>>>>>>> confidential, privileged and exempt from disclosure under applicable
>> >>>>>>>>>>>>>> law. If the reader of this message is not the intended recipient, you
>> >>>>>>>>>>>>>> are hereby notified that any printing, copying, dissemination,
>> >>>>>>>>>>>>>> distribution, disclosure or forwarding of this communication is
>> >>>>>>>>>>>>>> strictly prohibited. If you have received this communication in error,
>> >>>>>>>>>>>>>> please contact the sender immediately and delete it from your system.
>> >>>>>>>>>>>>>> Thank You.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> --
>> >>>>>>>>>>>>> Alejandro
>> >>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Alejandro
>> >>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Alejandro
>> >>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Alejandro
>> >>>>
>> >>>
>> >>>
>> >>
>> >>
>>
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>
>>
>
>
