From: Jeff Stuckman
To: "user@hadoop.apache.org"
Subject: RE: Site-specific dfs.client.local.interfaces setting not respected for Yarn MR container
Date: Sun, 15 Dec 2013 21:01:42 +0000

Thanks for the response. I have the preferIPv4Stack option in hadoop-env.sh; however, this was not preventing the mapreduce container from enumerating the IPv6 address of the interface.

Jeff

From: Chris Mawata [mailto:chris.mawata@gmail.com]
Sent: Sunday, December 15, 2013 3:58 PM
To: user@hadoop.apache.org
Subject: Re: Site-specific dfs.client.local.interfaces setting not respected for Yarn MR container

You might have better luck with an alternative approach to avoid having IPv6, which is to add to your hadoop-env.sh:

HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

Chris



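A side note on the HADOOP_OPTS suggestion: options in hadoop-env.sh are typically picked up by the daemons and by command-line clients, but the YARN task JVMs take their flags from the job configuration. A hedged sketch of passing the same flag to the map and reduce JVMs through mapred-site.xml, assuming the stock Hadoop 2.x property names (these names are not quoted anywhere in this thread):

<!-- mapred-site.xml: JVM flags for map and reduce task containers;
     merge with any existing opts such as -Xmx settings -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Djava.net.preferIPv4Stack=true</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Djava.net.preferIPv4Stack=true</value>
</property>

The MapReduce application master has a similar setting, yarn.app.mapreduce.am.command-opts.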
On 12/14/2013 11:38 PM, Jeff Stuckman wrote:

Hello,

I have set up a two-node Hadoop cluster on Ubuntu 12.04 running streaming jobs with Hadoop 2.2.0. I am having problems with running tasks on an NM which is on a different host than the RM, and I believe that this is happening because the NM host's dfs.client.local.interfaces property is not having any effect.

I have two hosts set up as follows:

Host A (1.2.3.4):
NameNode
DataNode
ResourceManager
Job History Server

Host B (5.6.7.8):
DataNode
NodeManager

On each host, hdfs-site.xml was edited to change dfs.client.local.interfaces from an interface name ("eth0") to the IPv4 address of that host's interface ("1.2.3.4" or "5.6.7.8"). This is to prevent the HDFS client from randomly binding to the IPv6 side of the interface (it randomly swaps between the IPv4 and IPv6 addresses, due to the random bind IP selection in the DFS client), which was causing other problems.

However, I am observing that the Yarn container on the NM appears to inherit the property from the copy of hdfs-site.xml on the RM, rather than reading it from the local configuration file. In other words, setting the dfs.client.local.interfaces property in Host A's configuration file causes the Yarn containers on Host B to use the same value of the property. This causes the map task to fail, as the container cannot establish a TCP connection to HDFS. However, on Host B, other commands that access HDFS (such as "hadoop fs") do work, as they respect the local value of the property.
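For reference, the per-host entry described above would look roughly like this; a sketch using the example addresses from this message, not the poster's actual files:

<!-- hdfs-site.xml on Host A; on Host B the value would be 5.6.7.8 -->
<property>
  <name>dfs.client.local.interfaces</name>
  <value>1.2.3.4</value>
</property>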

To illustrate with an example, I start a streaming job from the command line on Host A:

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar -input hdfs://hosta/linesin/ -output hdfs://hosta/linesout -mapper /home/hadoop/toRecords.pl -reducer /bin/cat

The NodeManager on Host B notes that there was an error starting the container:

13/12/14 19:38:45 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1387067177654_0002_01_000001 and exit code: 1
org.apache.hadoop.util.Shell$ExitCodeException:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
        at org.apache.hadoop.util.Shell.run(Shell.java:379)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

On Host B, I open userlogs/application_1387067177654_0002/container_1387067177654_0002_01_000001/syslog and find the following messages (note the DEBUG-level messages which I manually enabled for the DFS client):

2013-12-14 19:38:32,439 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: Using local interfaces [1.2.3.4] with addresses [/1.2.3.4:0]
<cut>
2013-12-14 19:38:33,085 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: newInfo = LocatedBlocks{
  fileLength=537
  underConstruction=false
  blocks=[LocatedBlock{BP-1911846690-1.2.3.4-1386999495143:blk_1073742317_1493; getBlockSize()=537; corrupt=false; offset=0; locs=[5.6.7.8:50010, 1.2.3.4:50010]}]
  lastLocatedBlock=LocatedBlock{BP-1911846690-1.2.3.4-1386999495143:blk_1073742317_1493; getBlockSize()=537; corrupt=false; offset=0; locs=[5.6.7.8:50010, 1.2.3.4:50010]}
  isLastBlockComplete=true}
2013-12-14 19:38:33,088 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: Connecting to datanode 5.6.7.8:50010
2013-12-14 19:38:33,090 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: Using local interface /1.2.3.4:0
2013-12-14 19:38:33,095 WARN [main] org.apache.hadoop.hdfs.DFSClient: Failed to connect to /5.6.7.8:50010 for block, add to deadNodes and continue. java.net.BindException: Cannot assign requested address

Note the failure to bind to 1.2.3.4, as the IP for Host B's local interface is actually 5.6.7.8.

Note that when running other HDFS commands on Host B, Host B's setting for dfs.client.local.interfaces is respected. On Host B:

hadoop@nodeb:~$ hadoop fs -ls hdfs://hosta/
13/12/14 19:45:10 DEBUG hdfs.DFSClient: Using local interfaces [5.6.7.8] with addresses [/5.6.7.8:0]
Found 3 items
drwxr-xr-x   - hadoop supergroup          0 2013-12-14 00:40 hdfs://hosta/linesin
drwxr-xr-x   - hadoop supergroup          0 2013-12-14 02:01 hdfs://hosta/system
drwx------   - hadoop supergroup          0 2013-12-14 10:31 hdfs://hosta/tmp

If I change dfs.client.local.interfaces on Host A to eth0 (without touching the setting on Host B), the syslog mentioned above instead shows the following:

2013-12-14 22:32:19,686 DEBUG [main] org.apache.hadoop.hdfs.DFSClient: Using local interfaces [eth0] with addresses [/<some IPv6 address>:0,/5.6.7.8:0]

The job then sometimes completes successfully, but both Host A and Host B will then randomly alternate between the IPv4 and IPv6 sides of their eth0 interfaces, which causes other issues. In other words, changing the dfs.client.local.interfaces setting on Host A to a named adapter caused the Yarn container on Host B to bind to an identically named adapter.

Any ideas on how I can reconfigure the cluster so every container will try to bind to its own interface? I successfully worked around this issue by doing a custom build of HDFS which hardcodes my IP address in the DFSClient, but I am looking for a better long-term solution.
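As an aside, a job-scoped value can also be supplied at submission time through the generic -D option; a sketch, not taken from the original thread. Because generic options are folded into the job configuration, this still ships one value to every container, so an interface name that each host resolves locally travels better than a literal address:

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.2.0.jar \
    -D dfs.client.local.interfaces=eth0 \
    -input hdfs://hosta/linesin/ -output hdfs://hosta/linesout \
    -mapper /home/hadoop/toRecords.pl -reducer /bin/cat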

Thanks,
Jeff
