From: Chris Schneider
Subject: Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradation
Date: Tue, 18 Feb 2014 17:54:19 -0800
To: user@hadoop.apache.org

Hi John,

My AWS Elastic MapReduce NameNode is also filling its log file with messages like the following:

2014-02-18 23:56:52,344 WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 78 on 9000): No groups available for user job_201402182309_0073
2014-02-18 23:56:52,351 WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 48 on 9000): No groups available for user job_201402182309_0073
2014-02-18 23:56:52,356 WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 38 on 9000): No groups available for user job_201402182309_0073

I ran into this same issue in March 2013 and got past it by using an m1.xlarge master node (instead of my usual m1.large) whenever I double my slave count (to 32 cc2.8xlarge instances) to re-import a lot of my input data, as I'm doing right now. Using the m1.xlarge didn't prevent the NameNode from logging messages like this, but the beefier instance seemed to weather the load these messages represent better.

Unfortunately, even my m1.xlarge master node now seems overwhelmed. The cluster starts off fine, efficiently mowing through the jobs in my job flow step for a few hours, but it eventually gets into a mode where the copy phase of the reduce jobs appears to make no progress at all. At that point, the NameNode seems to be spending all of its time writing messages like the ones above.

The issue doesn't seem to be related to the NameNode JVM heap size (I tried increasing it to 4 GB before I realized it never used more than ~400 MB), nor to dfs.namenode.handler.count (which I increased from 64 to 96).
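(For reference, the handler-count change is just an hdfs-site.xml property on the master node; this is roughly what I set, though it clearly didn't address the root cause:)

{code}
<!-- hdfs-site.xml on the NameNode; 96 is simply the value I tried, not a recommendation -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>96</value>
</property>
{code}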

We're currently trying to work around the problem by hacking log4j.properties to set the logging level for org.apache.hadoop.security.UserGroupInformation to ERROR. We might have to do so for the entire package, as I've also seen the following in the NameNode logs:
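The override we're experimenting with looks roughly like this (a sketch only; which loggers actually need demoting is still an open question):

{code}
# Tentative additions to the NameNode's log4j.properties
log4j.logger.org.apache.hadoop.security.UserGroupInformation=ERROR
# ...or, if the ShellBasedUnixGroupsMapping warnings below persist, the whole package:
# log4j.logger.org.apache.hadoop.security=ERROR
{code}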

2014-02-19 01:01:24,184 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping (IPC Server handler 84 on 9000): got exception trying to get groups for user job_201402182309_0226
org.apache.hadoop.util.Shell$ExitCodeException: id: job_201402182309_0226: No such user

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
    at org.apache.hadoop.util.Shell.run(Shell.java:182)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:78)
    at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:53)
    at org.apache.hadoop.security.Groups.getGroups(Groups.java:79)
    at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1037)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:50)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:5218)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkTraverse(FSNamesystem.java:5201)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:2030)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getFileInfo(NameNode.java:850)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:573)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

I would also be very interested in hearing Jakob Homan and Deveraj Das respond to your analysis of the changes made for MAPREDUCE-1457.

Please post again with any further information you're able to glean about this problem.

Thanks,

- Chris

On Jan 8, 2014, at 1:26 PM, Jian Fang wrote:

Looked a bit deeper, and it seems this code was introduced by the following JIRA.

https://issues.apache.org/jira/browse/MAPREDUCE-1457

There is another related JIRA, i.e., https://issues.apache.org/jira/browse/MAPREDUCE-4329.

Perhaps the warning message is a side effect of MAPREDUCE-1457 when the cluster is running in non-secured mode. There must be some code path that causes the job id to be treated as the user name in the task tracker or job tracker; that job id is then passed on to the HDFS name node. This is definitely a big problem, since the heavy warning logging alone degraded system performance on a relatively big cluster.

This behavior is very easy to reproduce by simply running terasort on a cluster.
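Something like the following stand-alone sketch should trigger the same warning on a much smaller scale (a hypothetical demo class against the Hadoop 1.x UserGroupInformation API; the job id is just the one from the logs):

{code}
import org.apache.hadoop.security.UserGroupInformation;

// Resolving groups for a principal named after a job id shells out to the OS,
// finds no such Unix account, returns an empty array, and UserGroupInformation
// logs the "No groups available for user job_..." warning.
public class JobIdAsUserDemo {
  public static void main(String[] args) {
    UserGroupInformation ugi =
        UserGroupInformation.createRemoteUser("job_201401071758_0002");
    String[] groups = ugi.getGroupNames();  // expect an empty array plus a WARN in the log
    System.out.println("groups resolved: " + groups.length);
  }
}
{code}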

Any suggestions for how to fix this problem?




On Wed, Jan 8, 2014 at 11:18 AM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
Thanks Vinod for your quick response. It is running in non-secure mode.

I still don't get the purpose of using the job id in the UGI. Could you please explain a bit more?

Thanks,

John


On Wed, Jan 8, 2014 at 10:11 AM, Vinod Kumar Vavilapalli <vinodkv@hortonworks.com> wrote:
It just seems like lazy code. You can see that, later, there is = this:

{code}

        for(Token<?> token : UserGroupInformation.getCurrentUser().getTokens()) {
          childUGI.addToken(token);
        }

{code}

So eventually the JobToken is getting added to the UGI which runs task-code.
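To spell out the pattern (a hedged sketch of the intent, not the literal Child.java source; the job.getUser() lookup in particular is a stand-in for however the submitting user is actually obtained):

{code}
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TaskAttemptID;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

class ChildUgiSketch {
  static UserGroupInformation buildChildUgi(TaskAttemptID firstTaskid,
                                            Token<?> jt,
                                            JobConf job) throws Exception {
    // UGI named after the job id: only authenticates the umbilical RPC to the TaskTracker.
    UserGroupInformation taskOwner =
        UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
    taskOwner.addToken(jt);

    // UGI for the real user; the task body is meant to run inside childUGI.doAs(...).
    UserGroupInformation childUGI =
        UserGroupInformation.createRemoteUser(job.getUser());  // stand-in for the real user lookup
    for (Token<?> token : UserGroupInformation.getCurrentUser().getTokens()) {
      childUGI.addToken(token);  // the loop quoted above, so the JobToken travels with the task
    }
    return childUGI;
  }
}
{code}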

>  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 63 on 9000): No groups available for user job_201401071758_0002

This seems to be a problem. When the task tries to reach the NameNode, it should do so as the user, not the job-id. It is not just logging; I'd be surprised if jobs pass. Do you have permissions enabled on HDFS?

Oh, or is this in non-secure mode (i.e. without kerberos)?

+Vinod


On Jan 7, 2014, at 5:14 PM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:

> Hi,
>
> I looked at Hadoop 1.X source code and found some logic that I could not understand.
>
> In the org.apache.hadoop.mapred.Child class, there were two UGIs defined as follows.
>
>     UserGroupInformation current = UserGroupInformation.getCurrentUser();
>     current.addToken(jt);
>
>     UserGroupInformation taskOwner
>       = UserGroupInformation.createRemoteUser(firstTaskid.getJobID().toString());
>     taskOwner.addToken(jt);
>
> But it is the taskOwner that is actually passed as a UGI to the task tracker and then to HDFS. The first one is not referenced anywhere.
>
>     final TaskUmbilicalProtocol umbilical =
>       taskOwner.doAs(new PrivilegedExceptionAction<TaskUmbilicalProtocol>() {
>         @Override
>         public TaskUmbilicalProtocol run() throws Exception {
>           return (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
>               TaskUmbilicalProtocol.versionID,
>               address,
>               defaultConf);
>         }
>     });
>
> What puzzled me is that the job id is actually passed in as the user name to the task tracker. On the NameNode side, when it tries to map the non-existent user name (i.e., the job id) to a group, it always returns an empty array. As a result, we always see annoying warning messages such as
>
>  WARN org.apache.hadoop.security.UserGroupInformation (IPC Server handler 63 on 9000): No groups available for user job_201401071758_0002
>
> Sometimes the warning messages were thrown so fast, hundreds or even thousands per second on a big cluster, that system performance degraded dramatically.
>
> Could someone please explain why this logic was designed this way? Is there any benefit to using a non-existent user for the group mapping? Or is this a bug?
>
> Thanks in advance,
>
> John


-----------------------------------------
Chris Schneider
http://www.scaleunlimited.com
custom big data solutions
-----------------------------------------
