Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of tucu@cloudera.com designates
 209.85.160.41 as permitted sender)
Content-Type: multipart/alternative;
 boundary=Apple-Mail-41FD25D5-EBA3-43E5-AD1F-D42702B583C7
Content-Transfer-Encoding: 7bit
Subject: Re: Job end notification does not always work (Hadoop 2.x)
References: 
 <CAOehgTkSbX0UfPMF3AetswkScVc=3AtUtYnhJqA=VWtCe5R=QA@mail.gmail.com>
 <1371937095.36328.YahooMailNeo@web141205.mail.bf1.yahoo.com>
 <CAOehgTkZnGU7U1SxK2n3JG=tkSAmARiL62Z8b7rQ2SPwbJ6xbQ@mail.gmail.com>
 <A49BBA22-A0DA-43BE-83F9-7F55F6E10FEC@hortonworks.com>
 <CAJs-t7O11sQL08uyWttCQ+SQgWEB_zZsVSxukOwwaox7wckOTA@mail.gmail.com>
 <1371994239.84186.YahooMailNeo@web141205.mail.bf1.yahoo.com>
 <06006DDA5A27D541991944AC4117E7A96E1C83C5@szxeml560-mbx.china.huawei.com>
From: Alejandro Abdelnur <tucu@cloudera.com>
Mime-Version: 1.0 (1.0)
In-Reply-To: 
 <06006DDA5A27D541991944AC4117E7A96E1C83C5@szxeml560-mbx.china.huawei.com>
Message-Id: <3AEF07D1-53AF-4D07-AB65-7072FBD65193@cloudera.com>
Date: Mon, 24 Jun 2013 09:11:46 -0700
Cc: "user@hadoop.apache.org" <user@hadoop.apache.org>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>


--Apple-Mail-41FD25D5-EBA3-43E5-AD1F-D42702B583C7
Content-Type: text/plain;
	charset=us-ascii
Content-Transfer-Encoding: quoted-printable

If we ought to do this in a YARN service, it should be the RM or the HS.
The RM is, IMO, the natural fit. The HS would be a good choice if we are
concerned about the extra work this would cause in the RM. The problem
with the current HS is that it is MR-specific; we should generalize it
for different AM types.
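
To make the RM option a bit more concrete, here is a rough sketch of the
bookkeeping it implies, folding in Arun's point that one failed AM attempt
is not a failed job. None of this is YARN code: Registration,
EndNotificationTracker and every method name are made up for
illustration, and it assumes Java 16+ for records. The AM hands over its
callback URL when it registers, and the service only gives that URL back
once the job has really ended.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: none of these types exist in YARN.

// What an AM would hand over at registration time (assumed fields).
record Registration(String callbackUrl, int maxAttempts) {}

// Service-side bookkeeping, living in the RM or a generalized HS.
// Plain HashMaps: this sketch is not safe for concurrent callers.
record EndNotificationTracker(
        Map<String, Registration> apps,
        Map<String, Integer> failedAttempts) {

    EndNotificationTracker() {
        this(new HashMap<>(), new HashMap<>());
    }

    // Called when an AM registers; remembers where to call back.
    void register(String appId, String callbackUrl, int maxAttempts) {
        apps.put(appId, new Registration(callbackUrl, maxAttempts));
    }

    // Called when one AM attempt dies. Returns the URL to notify only
    // once the retries are used up, i.e. when the job itself failed.
    Optional<String> onAttemptFailed(String appId) {
        return Optional.ofNullable(apps.get(appId))
                .filter(r -> countFailure(appId) > r.maxAttempts() - 1)
                .map(r -> finish(appId));
    }

    // Called when the job completes normally.
    Optional<String> onJobFinished(String appId) {
        return Optional.ofNullable(apps.get(appId))
                .map(r -> finish(appId));
    }

    // Records one more failed attempt and returns the new total.
    private int countFailure(String appId) {
        return failedAttempts.merge(appId, 1, Integer::sum);
    }

    // Forgets the app so it is notified at most once; returns its URL.
    private String finish(String appId) {
        failedAttempts.remove(appId);
        return apps.remove(appId).callbackUrl();
    }
}
```

With maxAttempts set to 2, the first onAttemptFailed() call returns an
empty Optional and the second returns the callback URL. Whoever owns the
tracker would then do the actual HTTP call, retries included.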

thx

Alejandro
(phone typing)

On Jun 23, 2013, at 23:28, Devaraj k <devaraj.k@huawei.com> wrote:

> Even if we handle all the failure cases in AM for Job End Notification, we=
 may miss cases like abrupt kill of AM when it is in last retry. If we choos=
e NM to give the notification, again RM needs to identify which NM should gi=
ve the end-notification as we don't have any direct protocol between AM and N=
M.
> =20
> I feel it would be better to move the End-Notification responsibility to=
 the RM as a Yarn service, because that ensures 100% notification and is=
 also useful for other types of applications.
> =20
> =20
> Thanks
> Devaraj K
> =20
> From: Ravi Prakash [mailto:ravihoo@ymail.com]=20
> Sent: 23 June 2013 19:01
> To: user@hadoop.apache.org
> Subject: Re: Job end notification does not always work (Hadoop 2.x)
> =20
> Hi Alejandro,
>=20
> Thanks for your reply! I was thinking more along the lines Prashant sugges=
ted i.e. a failure during init() should still trigger an attempt to notify (=
by the AM). But now that you mention it, maybe we would be better off=
 including this as a YARN feature after all (especially with all the=
 new AMs being written). We could let the NM of the AM handle the=
 notification burden, so that the RM doesn't get unduly taxed. Thoughts?
>=20
> Thanks
> Ravi
> =20
> =20
> From: Alejandro Abdelnur <tucu@cloudera.com>
> To: "common-user@hadoop.apache.org" <user@hadoop.apache.org>=20
> Sent: Saturday, June 22, 2013 7:37 PM
> Subject: Re: Job end notification does not always work (Hadoop 2.x)
> =20
> If the AM fails before doing the job end notification, at any stage of=
 the execution for whatever reason, the job end notification will never=
 be delivered. There is no way to fix this unless the notification is=
 done by a Yarn service. The 2 'candidate' services for doing this would=
 be the RM and the HS. The job notification URL is in the job conf. The=
 RM never sees the job conf; that rules the RM out unless we add, at AM=
 registration time, the possibility to specify a callback URL. The HS has=
 access to the job conf, but the HS is currently a 'passive' service.
>=20
> thx
> =20
> On Sat, Jun 22, 2013 at 3:48 PM, Arun C Murthy <acm@hortonworks.com> wrote=
:
> Prashanth,=20
> =20
>  Please file a jira.
> =20
>  One thing to be aware of - AMs get restarted a certain number of times fo=
r fault-tolerance - which means we can't just assume that failure of a singl=
e AM is equivalent to failure of the job.
> =20
>  Only the ResourceManager is in the appropriate position to judge failure o=
f AM v/s failure-of-job.
> =20
> hth,
> Arun
> =20
> On Jun 22, 2013, at 2:44 PM, Prashant Kommireddi <prash1784@gmail.com> wro=
te:
>=20
>=20
> Thanks Ravi.
>=20
> Well, in this case it's a no-effort :) A failure of AM init should be=
 considered as failure of the job? I looked at the code and best-effort=
 makes sense with respect to retry logic etc. You make a good point that=
 there would be no notification in case AM OOMs, but I do feel AM init=
 failure should send a notification by other means.
>=20
> =20
>=20
> On Sat, Jun 22, 2013 at 2:38 PM, Ravi Prakash <ravihoo@ymail.com> wrote:
> Hi Prashant,
>=20
> I would tend to agree with you. Although job-end notification is only a "b=
est-effort" mechanism (i.e. we cannot always guarantee notification for exam=
ple when the AM OOMs), I agree with you that we can do more. If you feel str=
ongly about this, please create a JIRA and possibly upload a patch.
>=20
> Thanks
> Ravi
> =20
> =20
> From: Prashant Kommireddi <prash1784@gmail.com>
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>=20
> Sent: Thursday, June 20, 2013 9:45 PM
> Subject: Job end notification does not always work (Hadoop 2.x)
> =20
> Hello,
>=20
> I came across an issue that occurs with the job notification callbacks in M=
R2. It works fine if the Application master has started, but does not send a=
 callback if the initializing of AM fails.
>=20
> Here is the code from MRAppMaster.java
>=20
> .....
> .......
>       // set job classloader if configured
>       MRApps.setJobClassLoader(conf);
>       initAndStartAppMaster(appMaster, conf, jobUserName);
>     } catch (Throwable t) {
>       LOG.fatal("Error starting MRAppMaster", t);
>       System.exit(1);
>     }
>   }
>=20
> protected static void initAndStartAppMaster(final MRAppMaster appMaster,
>       final YarnConfiguration conf, String jobUserName) throws IOException=
,
>       InterruptedException {
>     UserGroupInformation.setConfiguration(conf);
>     UserGroupInformation appMasterUgi =3D UserGroupInformation
>         .createRemoteUser(jobUserName);
>     appMasterUgi.doAs(new PrivilegedExceptionAction<Object>() {
>       @Override
>       public Object run() throws Exception {
>         appMaster.init(conf);
>         appMaster.start();
>         if(appMaster.errorHappenedShutDown) {
>           throw new IOException("Was asked to shut down.");
>         }
>         return null;
>       }
>     });
>   }
> appMaster.init(conf) does not dispatch JobFinishEventHandler, which is=
 responsible for sending an HTTP callback (via shutDownJob()). If there=
 was an exception at this time, the process would simply terminate (via=
 System.exit(1)).
>=20
> appMaster.start() however rightly uses the JobFinishEventHandler and thing=
s work fine.
>=20
> Shouldn't a failure on init(..) also send a callback suggesting the job fa=
iled?
>=20
> Thanks,
> Prashant
> =20
> =20
>=20
> =20
> =20
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>=20
> =20
>=20
>=20
> =20
> --=20
> Alejandro
> =20

--Apple-Mail-41FD25D5-EBA3-43E5-AD1F-D42702B583C7
Content-Type: text/html;
	charset=utf-8
Content-Transfer-Encoding: 7bit

<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>If we ought to do this in a YARN service, it should be the RM or the HS. The RM is, IMO, the natural fit. The HS would be a good choice if we are concerned about the extra work this would cause in the RM. The problem with the current HS is that it is MR-specific; we should generalize it for different AM types.&nbsp;</div><div><br></div><div>thx</div><div><br>Alejandro<div>(phone typing)</div></div><div><br>On Jun 23, 2013, at 23:28, Devaraj k &lt;<a href="mailto:devaraj.k@huawei.com">devaraj.k@huawei.com</a>&gt; wrote:<br><br></div><blockquote type="cite"><div>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 12 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
	{font-family:Helvetica;
	panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
	{font-family:SimSun;
	panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
	{font-family:"Cambria Math";
	panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
	{font-family:Consolas;
	panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
	{font-family:SimSun;
	panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0cm;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
	{mso-style-priority:99;
	color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{mso-style-priority:99;
	color:purple;
	text-decoration:underline;}
pre
	{mso-style-priority:99;
	mso-style-link:"HTML Preformatted Char";
	margin:0cm;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
	{mso-style-priority:99;
	mso-style-link:"Balloon Text Char";
	margin:0cm;
	margin-bottom:.0001pt;
	font-size:8.0pt;
	font-family:"Tahoma","sans-serif";}
span.HTMLPreformattedChar
	{mso-style-name:"HTML Preformatted Char";
	mso-style-priority:99;
	mso-style-link:"HTML Preformatted";
	font-family:Consolas;}
span.BalloonTextChar
	{mso-style-name:"Balloon Text Char";
	mso-style-priority:99;
	mso-style-link:"Balloon Text";
	font-family:"Tahoma","sans-serif";}
span.EmailStyle21
	{mso-style-type:personal-reply;
	font-family:"Calibri","sans-serif";
	color:#1F497D;}
.MsoChpDefault
	{mso-style-type:export-only;
	font-size:10.0pt;}
@page WordSection1
	{size:612.0pt 792.0pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
	{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->


<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">Even if we handle all the failure cases in AM for Job End Notification, we may miss cases like abrupt kill of AM when it is in last retry. If we choose NM to
 give the notification, again RM needs to identify which NM should give the end-notification as we don't have any direct protocol between AM and NM.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">I feel it would be better to move the End-Notification responsibility to the RM as a Yarn service, because that ensures 100% notification and is also useful for other types
 of applications. <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">Thanks<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D">Devaraj K<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"><o:p>&nbsp;</o:p></span></p>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;">From:</span></b><span style="font-size:10.0pt;font-family:&quot;Tahoma&quot;,&quot;sans-serif&quot;"> Ravi Prakash [<a href="mailto:ravihoo@ymail.com">mailto:ravihoo@ymail.com</a>]
<br>
<b>Sent:</b> 23 June 2013 19:01<br>
<b>To:</b> <a href="mailto:user@hadoop.apache.org">user@hadoop.apache.org</a><br>
<b>Subject:</b> Re: Job end notification does not always work (Hadoop 2.x)<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black">Hi Alejandro,<br>
<br>
Thanks for your reply! I was thinking more along the lines Prashant suggested i.e. a failure during init() should still trigger an attempt to notify (by the AM). But now that you mention it, maybe we would be better off including this as a YARN feature after
 all (especially with all the new AMs being written). We could let the NM of the AM handle the notification burden, so that the RM doesn't get unduly taxed. Thoughts?<br>
<br>
Thanks<br>
Ravi<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<div>
<div>
<div class="MsoNormal" align="center" style="text-align:center;background:white">
<span style="color:black">
<hr size="1" width="100%" align="center">
</span></div>
<p class="MsoNormal" style="background:white"><b><span style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:black">From:</span></b><span style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:black"> Alejandro Abdelnur &lt;<a href="mailto:tucu@cloudera.com">tucu@cloudera.com</a>&gt;<br>
<b>To:</b> "<a href="mailto:common-user@hadoop.apache.org">common-user@hadoop.apache.org</a>" &lt;<a href="mailto:user@hadoop.apache.org">user@hadoop.apache.org</a>&gt;
<br>
<b>Sent:</b> Saturday, June 22, 2013 7:37 PM<br>
<b>Subject:</b> Re: Job end notification does not always work (Hadoop 2.x)</span><span style="color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
<div id="yiv4640035831">
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">If the AM fails before doing the job end notification, at any stage of the execution for whatever reason, the job end notification will never be delivered. There is no way to fix this unless
 the notification is done by a Yarn service. The 2 'candidate' services for doing this would be the RM and the HS. The job notification URL is in the job conf. The RM never sees the job conf; that rules the RM out unless we add, at AM registration time,
 the possibility to specify a callback URL. The HS has access to the job conf, but the HS is currently a 'passive' service.<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><br>
thx<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">On Sat, Jun 22, 2013 at 3:48 PM, Arun C Murthy &lt;<a href="mailto:acm@hortonworks.com" target="_blank">acm@hortonworks.com</a>&gt; wrote:<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">Prashanth,&nbsp;<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">&nbsp;Please file a jira.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">&nbsp;One thing to be aware of - AMs get restarted a certain number of times for fault-tolerance - which means we can't just assume that failure of a single AM is equivalent to failure of the
 job.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">&nbsp;Only the ResourceManager is in the appropriate position to judge failure of AM v/s failure-of-job.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">hth,<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">Arun<o:p></o:p></span></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
<div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">On Jun 22, 2013, at 2:44 PM, Prashant Kommireddi &lt;<a href="mailto:prash1784@gmail.com" target="_blank">prash1784@gmail.com</a>&gt; wrote:<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black"><br>
<br>
<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">Thanks Ravi.<br>
<br>
Well, in this case it's a no-effort :) A failure of AM init should be considered as failure of the job? I looked at the code and best-effort makes sense with respect to retry logic etc. You make a good point that there would be no notification in case AM OOMs,
 but I do feel AM init failure should send a notification by other means.<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black">On Sat, Jun 22, 2013 at 2:38 PM, Ravi Prakash &lt;<a href="mailto:ravihoo@ymail.com" target="_blank">ravihoo@ymail.com</a>&gt; wrote:<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black">Hi Prashant,<br>
<br>
I would tend to agree with you. Although job-end notification is only a "best-effort" mechanism (i.e. we cannot always guarantee notification for example when the AM OOMs), I agree with you that we can do more. If you feel strongly about this, please create
 a JIRA and possibly upload a patch.<br>
<br>
Thanks<br>
Ravi<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:10.0pt;color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<div>
<div>
<div>
<div class="MsoNormal" align="center" style="text-align:center;background:white">
<span style="color:black">
<hr size="1" width="100%" align="center">
</span></div>
<p class="MsoNormal" style="background:white"><b><span style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:black">From:</span></b><span style="font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:black"> Prashant Kommireddi &lt;<a href="mailto:prash1784@gmail.com" target="_blank">prash1784@gmail.com</a>&gt;<br>
<b>To:</b> "<a href="mailto:user@hadoop.apache.org" target="_blank">user@hadoop.apache.org</a>" &lt;<a href="mailto:user@hadoop.apache.org" target="_blank">user@hadoop.apache.org</a>&gt;
<br>
<b>Sent:</b> Thursday, June 20, 2013 9:45 PM<br>
<b>Subject:</b> Job end notification does not always work (Hadoop 2.x)</span><span style="color:black"><o:p></o:p></span></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">Hello,<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">I came across an issue that occurs with the job notification callbacks in MR2. It works fine if the Application master has started, but does not send a callback if
 the initializing of AM fails.<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black">Here is the code from MRAppMaster.java<br>
<br>
.....<br>
.......<o:p></o:p></span></p>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // set job classloader if configured<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MRApps.setJobClassLoader(conf);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; initAndStartAppMaster(appMaster, conf, jobUserName);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; } catch (Throwable t) {<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; LOG.fatal("Error starting MRAppMaster", t);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.exit(1);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; }<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp; }<br><br>protected static void initAndStartAppMaster(final MRAppMaster appMaster,<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; final YarnConfiguration conf, String jobUserName) throws IOException,<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; InterruptedException {<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; UserGroupInformation.setConfiguration(conf);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; UserGroupInformation appMasterUgi = UserGroupInformation<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .createRemoteUser(jobUserName);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; appMasterUgi.doAs(new PrivilegedExceptionAction&lt;Object&gt;() {<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; @Override<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; public Object run() throws Exception {<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; appMaster.init(conf);<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; appMaster.start();<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if(appMaster.errorHappenedShutDown) {<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; throw new IOException("Was asked to shut down.");<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return null;<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp;&nbsp;&nbsp; });<o:p></o:p></span></pre>
<pre style="background:white"><span style="color:black">&nbsp; }<o:p></o:p></span></pre>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">appMaster.init(conf) does not dispatch JobFinishEventHandler, which is responsible for sending an HTTP callback (via shutDownJob()). If there was an exception at this
 time, the process would simply terminate (via System.exit(1)).<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">appMaster.start() however rightly uses the JobFinishEventHandler and things work fine.<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black">Shouldn't a failure on init(..) also send a callback suggesting the job failed?<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black">Thanks,<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black">Prashant<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:13.5pt;font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;;color:black">--<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="background:white"><span style="font-size:13.5pt;font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;;color:black">Arun C. Murthy<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:13.5pt;background:white"><span style="font-size:13.5pt;font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;;color:black">Hortonworks Inc.<br>
<a href="http://hortonworks.com/" target="_blank">http://hortonworks.com/</a><o:p></o:p></span></p>
</div>
</div>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black"><br>
<br clear="all">
<o:p></o:p></span></p>
<div>
<p class="MsoNormal" style="background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="color:black">-- <br>
Alejandro <o:p></o:p></span></p>
</div>
</div>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt;background:white"><span style="color:black"><o:p>&nbsp;</o:p></span></p>
</div>
</div>
</div>
</div>
</div>


</div></blockquote></body></html>
--Apple-Mail-41FD25D5-EBA3-43E5-AD1F-D42702B583C7--