hadoop-yarn-dev mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: Re-swizzle 2.3
Date Fri, 07 Feb 2014 17:08:07 GMT
Vinod, I have the patches to revert most of the JIRAs (the first batch);
I'll send them to you offline.

Thanks.


On Thu, Feb 6, 2014 at 8:56 PM, Vinod Kumar Vavilapalli
<vinodkv@apache.org> wrote:

>
> Thanks. Please post your findings; Jian wrote this part of the code and
> between him and me, we can take care of those issues.
>
> +1 for going ahead with the revert on branch-2.3. I'll go do that tomorrow
> morning unless I hear otherwise from Jian.
>
> Thanks,
> +Vinod
>
>
> On Feb 6, 2014, at 8:28 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
>
> > Hi Vinod,
> >
> > Nothing confidential,
> >
> > * With unmanaged AMs I'm seeing the trace I posted a couple of days ago
> > in YARN-1577
> > (https://issues.apache.org/jira/browse/YARN-1577?focusedCommentId=13891853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13891853).
> >
> > * Also, Robert has been digging into Oozie testcases failing/getting stuck
> > with several token renewer threads. These failures happened consistently
> > at different places around the same testcases (like some file descriptors
> > leaking out); reverting YARN-1490 fixes the problem. The potential issue
> > with this is that a long-running client (Oozie) may run into this
> > situation and become unstable.
> >
> > *Robert,* mind posting to YARN-1490 the JVM thread dump taken at the time
> > the test hangs?
> >
> > After YARN-1493 & YARN-1490 we have a couple of JIRAs trying to fix
> > issues introduced by them, and we still haven't gotten them right.
> >
> > Because of this, the improvements driven by YARN-1493 & YARN-1490 seem to
> > require more work before they are stable.
> >
> > IMO, being conservative, we should do 2.3 without them and roll them with
> > 2.4. If we want to do regular releases we will have to make these kinds of
> > calls, else we will start dragging the releases.
> >
> > Sounds like a plan?
> >
> > Thanks.
> >
> >
> >
> > On Thu, Feb 6, 2014 at 6:27 PM, Vinod Kumar Vavilapalli
> > <vinodkv@apache.org> wrote:
> >
> >> Hey
> >>
> >> I am not against removing them from 2.3 if that is helpful for progress.
> >> But I want to understand what the issues are before we make that
> >> decision.
> >>
> >> There is the issue with unmanaged AM that is clearly known and that I was
> >> thinking of getting to over the past two days, but couldn't. What is this
> >> new issue that we (confidently?) pinned down to YARN-1490?
> >>
> >> Thanks
> >> +Vinod
> >>
> >> On Feb 6, 2014, at 5:07 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> >>
> >>> Thanks Robert,
> >>>
> >>> All,
> >>>
> >>> So it seems that YARN-1493 and YARN-1490 are introducing serious
> >>> regressions.
> >>>
> >>> I would propose to revert them and the follow-up JIRAs from the 2.3
> >>> branch and keep working on them on trunk/branch-2 until they are stable
> >>> (I would even prefer reverting them from branch-2 so as not to block a
> >>> 2.4 if they are not ready in time).
> >>>
> >>> As I mentioned before, the JIRAs to revert were:
> >>>
> >>> YARN-1493
> >>> YARN-1490
> >>> YARN-1166
> >>> YARN-1041
> >>> YARN-1566
> >>>
> >>> Plus 2 additional JIRAs committed since my email on this issue 2 days
> >>> ago:
> >>>
> >>> * YARN-1661
> >>> * YARN-1689 (not sure if this JIRA is related in functionality to the
> >>> previous ones but it is creating conflicts).
> >>>
> >>> I think we should hold on continuing work on top of something that is
> >>> broken until the broken stuff is fixed.
> >>>
> >>> Quoting Arun, "Committers - Henceforth, please use extreme caution
> >>> while committing to branch-2.3. Please commit *only* blockers to 2.3."
> >>>
> >>> YARN-1661 & YARN-1689 are not blockers.
> >>>
> >>> Unless there are objections, I'll revert all these JIRAs from
> >>> branch-2.3 tomorrow around noon and I'll update fixedVersion in the
> >>> JIRAs.
> >>>
> >>> I'm inclined to revert them from branch-2 as well.
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> On Thu, Feb 6, 2014 at 3:54 PM, Robert Kanter <rkanter@cloudera.com> wrote:
> >>>
> >>>> I think we should revert YARN-1490 from the Hadoop 2.3 branch. I think
> >>>> it was causing some strange behavior in the Oozie unit tests:
> >>>>
> >>>> Basically, we use a single MiniMRCluster and MiniDFSCluster across all
> >>>> unit tests in a module. With YARN-1490 we saw that, regardless of test
> >>>> order, the last few tests would time out waiting for an MR job to
> >>>> finish; on slower machines, the entire test suite would time out.
> >>>> Through some digging, I found that we were getting a ton of
> >>>> "Connection refused" exceptions on LeaseRenewer talking to the NN and
> >>>> a few on the AM talking to the RM.
> >>>>
> >>>> After a bunch of investigation, I found that the problem went away
> >>>> once YARN-1490 was removed, though I couldn't figure out the exact
> >>>> problem. Even though this occurred in unit tests, it does make me
> >>>> concerned that it could indicate some bigger issue in a long-running
> >>>> real cluster (where everything isn't running on the same machine) that
> >>>> we haven't seen yet.
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Feb 6, 2014 at 3:06 PM, Karthik Kambatla <kasha@cloudera.com>
> >>>> wrote:
> >>>>
> >>>>> I have marked MAPREDUCE-5744 a blocker for 2.3. Committing it
> >>>>> shortly. Will pull it out of branch-2.3 if anyone objects.
> >>>>>
> >>>>>
> >>>>> On Thu, Feb 6, 2014 at 2:04 PM, Arpit Agarwal <aagarwal@hortonworks.com> wrote:
> >>>>>
> >>>>>> Merged HADOOP-10273 to branch-2.3 as r1565456.
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Feb 5, 2014 at 4:49 PM, Arpit Agarwal <aagarwal@hortonworks.com> wrote:
> >>>>>>
> >>>>>>> IMO HADOOP-10273 (Fix 'mvn site') should be included in 2.3.
> >>>>>>>
> >>>>>>> I will merge it to branch-2.3 tomorrow PST if no one disagrees.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 4, 2014 at 5:03 PM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> >>>>>>>
> >>>>>>>> IMO YARN-1577 is a blocker; it is breaking unmanaged AMs in very
> >>>>>>>> odd ways (to the point that it seems non-deterministic).
> >>>>>>>>
> >>>>>>>> I'd say either YARN-1577 is fixed or we revert
> >>>>>>>> YARN-1493/YARN-1490/YARN-1166/YARN-1041/YARN-1566 (almost clean
> >>>>>>>> reverts) from the Hadoop 2.3 branch before doing the release.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I've verified that after reverting those JIRAs things work fine
> >>>>>>>> with unmanaged AMs.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, Feb 4, 2014 at 11:45 AM, Arun C Murthy <acm@hortonworks.com> wrote:
> >>>>>>>>
> >>>>>>>>> I punted YARN-1444 to 2.4 since it's a long-standing issue.
> >>>>>>>>>
> >>>>>>>>> Jian is away and I don't see YARN-1577 & YARN-1206 making much
> >>>>>>>>> progress till he is back; so I'm inclined to push both to 2.4 too.
> >>>>>>>>> Any objections?
> >>>>>>>>>
> >>>>>>>>> Looks like Daryn has both HADOOP-10301 & HDFS-4564 covered.
> >>>>>>>>>
> >>>>>>>>> Overall, I'll try to get this out in the next couple of days if we
> >>>>>>>>> can clear the list.
> >>>>>>>>>
> >>>>>>>>> thanks,
> >>>>>>>>> Arun
> >>>>>>>>>
> >>>>>>>>> On Feb 3, 2014, at 12:14 PM, Arun C Murthy <acm@hortonworks.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> An update. Per https://s.apache.org/hadoop-2.3.0-blockers we are
> >>>>>>>>>> now down to 5 blockers: 1 Common, 1 HDFS, 3 YARN.
> >>>>>>>>>>
> >>>>>>>>>> Daryn (thanks!) has both the non-YARN ones covered. Vinod is
> >>>>>>>>>> helping out with the YARN ones.
> >>>>>>>>>>
> >>>>>>>>>> thanks,
> >>>>>>>>>> Arun
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Arun C. Murthy
> >>>>>>>>> Hortonworks Inc.
> >>>>>>>>> http://hortonworks.com/
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> CONFIDENTIALITY NOTICE
> >>>>>>>>> NOTICE: This message is intended for the use of the individual or
> >>>>>>>>> entity to which it is addressed and may contain information that is
> >>>>>>>>> confidential, privileged and exempt from disclosure under applicable
> >>>>>>>>> law. If the reader of this message is not the intended recipient, you
> >>>>>>>>> are hereby notified that any printing, copying, dissemination,
> >>>>>>>>> distribution, disclosure or forwarding of this communication is
> >>>>>>>>> strictly prohibited. If you have received this communication in
> >>>>>>>>> error, please contact the sender immediately and delete it from your
> >>>>>>>>> system. Thank You.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Alejandro
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Alejandro
> >>
> >>
> >>
> >
> >
> >
> > --
> > Alejandro
>
>
>



-- 
Alejandro
