From: java8964 java8964 <java8964@hotmail.com>
To: user@hadoop.apache.org
Subject: RE: running map tasks in remote node
Date: Fri, 23 Aug 2013 10:11:14 -0400

It is possible to do what you are trying to do, but it only makes sense if your MR job is very CPU intensive and you want to use the CPU resources of your cluster rather than its IO.

You may want to do some research on the role HDFS plays in Hadoop. First and foremost, it provides central storage for all the files that will be processed by MR jobs. If you don't want to use HDFS, you need to identify a shared storage that all the nodes in your cluster can reach. HDFS is NOT required, but a shared storage is.

To keep your question simple, let's just use NFS in place of HDFS. That is good enough for a POC and will help you understand how to set things up.

Assume you have a cluster with 3 nodes (one NN and two DNs; the JT runs on the NN and a TT runs on each DN). Instead of using HDFS, you can use NFS like this:

1) Mount /share_data on both of your data nodes. They need to have the same mount, so /share_data on each data node points to the same NFS location. It doesn't matter where you host this NFS share; just make sure every data node mounts it at the same /share_data.
2) Create a folder under /share_data and put all your data into that folder.
3) When you kick off your MR job, give a full URL for the input path, like 'file:///share_data/myfolder', and a full URL for the output path, like 'file:///share_data/output'. This way each mapper knows it will read the data from the local file system rather than from HDFS. That is why each task node must have the same mount path: 'file:///share_data/myfolder' has to resolve on every task node. Check that /share_data/myfolder points to the same data on each of your task nodes. (A driver sketch follows this list.)
4) You want each mapper to process one file, so instead of the default 'TextInputFormat', use a 'WholeFileInputFormat'. This ensures that each file under '/share_data/myfolder' is not split and goes entirely to a single mapper.
5) With the setup above, I don't think you need to start the NameNode or DataNode processes any more; you only use the JobTracker and the TaskTrackers.
6) Obviously, when your data gets big, the NFS share will be your bottleneck. You could replace it with shared network storage later, but the setup above gives you a starting point.
7) Keep in mind that with this setup you lose data replication, data locality, etc. That is why I said it ONLY makes sense if your MR job is CPU intensive: you simply want to use the mapper/reducer tasks to process your data, not the IO scalability.
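
To illustrate steps 3 and 4, a minimal driver sketch along these lines should work. It assumes the old mapred API (Hadoop 1.x with JobTracker/TaskTracker); IdentityMapper is only a placeholder for your real processing mapper, and WholeFileInputFormat is the class sketched further down the thread.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class NfsJobDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(NfsJobDriver.class);
        conf.setJobName("process-files-from-nfs");

        // Point both input and output at the shared NFS mount using file:// URLs,
        // so every TaskTracker resolves the same local path instead of HDFS.
        FileInputFormat.setInputPaths(conf, new Path("file:///share_data/myfolder"));
        FileOutputFormat.setOutputPath(conf, new Path("file:///share_data/output"));

        // One whole, unsplit file per mapper (see the WholeFileInputFormat sketch below).
        conf.setInputFormat(WholeFileInputFormat.class);

        // Placeholder: swap IdentityMapper for the mapper that does your real processing.
        conf.setMapperClass(IdentityMapper.class);
        conf.setNumReduceTasks(0);                      // map-only job in this sketch
        conf.setOutputKeyClass(NullWritable.class);     // matches WholeFileInputFormat's key type
        conf.setOutputValueClass(BytesWritable.class);  // matches WholeFileInputFormat's value type

        JobClient.runJob(conf);
    }
}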

Make sense?

Yong


Date: Fri, 23 Aug 2013 15:43:38 +0530
Subject: Re: running map tasks in remote node
From: rabmdu@gmail.com
To: user@hadoop.apache.org

Thanks for the reply.

I am basically exploring possible ways to work with the Hadoop framework for one of my use cases. I have my limitations in using HDFS, but I agree that using MapReduce in conjunction with HDFS makes sense.

I successfully tested a WholeFileInputFormat after some googling.

Now, coming to my use case. I would like to keep some files on my master node and do some processing on the cloud nodes. Policy does not allow us to configure and use the cloud nodes as HDFS. However, I would like to spawn map processes on those nodes. Hence, I set the input path to the local file system, for example, $HOME/inputs. I have a file listing filenames (10 lines) in this input directory. I use NLineInputFormat and spawn 10 map processes; each map process gets one line, transfers the file it names, and processes it. However, in the map I get a FileNotFoundException for $HOME/inputs. I am sure this directory is present on my master but not on the slave nodes. When I copy this input directory to the slave nodes, it works fine. I cannot figure out how to fix this or the reason for the error. I do not understand why it complains that the input directory is not present. As far as I know, a slave node just receives a map task, and the map method receives the contents of the input file; that should be enough for the map logic to work.


with regards
rabmdu
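
A minimal sketch of the NLineInputFormat setup described above, again assuming the old mapred API; the paths are hypothetical stand-ins, and IdentityMapper is a placeholder for the mapper that would fetch and process each listed file.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.NLineInputFormat;

public class FileListDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(FileListDriver.class);
        conf.setJobName("one-map-per-listed-file");

        // Each split carries one line of the listing file,
        // so a 10-line listing yields 10 map tasks.
        conf.setInputFormat(NLineInputFormat.class);
        conf.setInt("mapred.line.input.format.linespermap", 1);

        // A local (file://) input path must exist on every task node,
        // which is exactly the FileNotFoundException described above.
        // These paths are hypothetical examples.
        FileInputFormat.setInputPaths(conf, new Path("file:///home/hadoop/inputs/filelist.txt"));
        FileOutputFormat.setOutputPath(conf, new Path("file:///home/hadoop/outputs"));

        // Placeholder: the real mapper would fetch and process the file named on each line.
        conf.setMapperClass(IdentityMapper.class);
        conf.setNumReduceTasks(0);
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        JobClient.runJob(conf);
    }
}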

On Thu, Aug 22, 2013 at 4:40 PM, java8964 java8964 <java8964@hotmail.com> wrote:

If you don't plan to use HDFS, what kind of shared file system are you going to use across the cluster? NFS?
For what you want to do, even though it doesn't make too much sense, you first need to solve that problem: the shared file system.

Second, if you want to process the files file by file, instead of block by block as in HDFS, you need to use a WholeFileInputFormat (google how to write one). Then you don't need a file listing all the files to be processed; just put them into one folder on the shared file system and send that folder to your MR job. As long as each node can access it through some file system URL, each file will be processed in its own mapper.
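
One common way to write such a WholeFileInputFormat against the old mapred API is sketched below; this is only an illustrative sketch, not code from the original thread. It reads each file as a single record, with the whole file contents in a BytesWritable value.

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// One whole file per record: the key is unused, the value is the full file contents.
public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

    @Override
    protected boolean isSplitable(FileSystem fs, Path filename) {
        return false;   // never split, so one mapper sees the entire file
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> getRecordReader(
            InputSplit split, JobConf job, Reporter reporter) throws IOException {
        return new WholeFileRecordReader((FileSplit) split, job);
    }

    static class WholeFileRecordReader implements RecordReader<NullWritable, BytesWritable> {
        private final FileSplit split;
        private final JobConf job;
        private boolean processed = false;

        WholeFileRecordReader(FileSplit split, JobConf job) {
            this.split = split;
            this.job = job;
        }

        @Override
        public boolean next(NullWritable key, BytesWritable value) throws IOException {
            if (processed) {
                return false;
            }
            // Read the entire file into the value in one shot.
            byte[] contents = new byte[(int) split.getLength()];
            Path file = split.getPath();
            FileSystem fs = file.getFileSystem(job);
            FSDataInputStream in = null;
            try {
                in = fs.open(file);
                IOUtils.readFully(in, contents, 0, contents.length);
                value.set(contents, 0, contents.length);
            } finally {
                IOUtils.closeStream(in);
            }
            processed = true;
            return true;
        }

        @Override
        public NullWritable createKey() { return NullWritable.get(); }

        @Override
        public BytesWritable createValue() { return new BytesWritable(); }

        @Override
        public long getPos() throws IOException { return processed ? split.getLength() : 0; }

        @Override
        public void close() throws IOException { }

        @Override
        public float getProgress() throws IOException { return processed ? 1.0f : 0.0f; }
    }
}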

Yong


Date: Wed, 21 Aug 2013 17:39:10 +0530
Subject: running map tasks in remote node
From: rabmdu@gmail.com
To: user@hadoop.apache.org


Hello,

Here is the newbie question of the day.

For one of my use cases, I want to use Hadoop MapReduce without HDFS. Here, I will have a text file containing a list of file names to process. Assume that I have 10 lines (10 files to process) in the input text file, and I wish to generate 10 map tasks and execute them in parallel on 10 nodes. I started with the basic Hadoop tutorial, was able to set up a single-node Hadoop cluster, and successfully tested the wordcount code.

Now, I took two machines, A (master) and B (slave), and did the configuration below on these machines to set up a two-node cluster.

hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/tmp/hadoop-bala/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/tmp/hadoop-bala/dfs/data</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>A:9001</value>
  </property>
</configuration>

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>A:9001</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>

core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://A:9000</value>
  </property>
</configuration>

On both A and B, I have a file named 'slaves' with an entry 'B' in it, and another file called 'masters' in which there is an entry 'A'.

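For clarity, those two files would contain just the host names listed above; the conf/ location shown here is an assumption based on a standard Hadoop 1.x layout, not something stated in the original message.

conf/masters:
A

conf/slaves:
B
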
I have kept my input file on A. I see the map method process the input file line by line, but it is all processed on A. Ideally, I would expect that processing to take place on B.

Can anyone highlight where I am going wrong?

regards
rab
