From: Raj K Singh <rajkrrsingh@gmail.com>
Date: Wed, 7 Jan 2015 13:39:07 +0530
Subject: Re: Write and Read file through map reduce
To: user@hadoop.apache.org

You can configure your third MapReduce job with MultipleInputs and read both files into the job. If one of the files is small, consider the DistributedCache, which will give you optimal performance when you are joining the datasets of file1 and file2. I would also recommend a job-scheduling API such as Oozie to make sure the third job kicks off only once file1 and file2 are available on HDFS (the same can be done with a shell script or a JobControl implementation).

::::::::::::::::::::::::::::::::::::::::
Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Tue, Jan 6, 2015 at 2:25 AM, hitarth trivedi <t.hitarth@gmail.com> wrote:
> Hi,
>
> I have a 6-node cluster, and the scenario is as follows:
>
> I have one MapReduce job which writes file1 to HDFS.
> I have another MapReduce job which writes file2 to HDFS.
> In the third MapReduce job I need to use file1 and file2 to do some
> computation and output the value.
>
> What is the best way to store file1 and file2 in HDFS so that they can
> be used in the third MapReduce job?
>
> Thanks,
> Hitarth
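The MultipleInputs approach above can be sketched as a driver for the third job. This is a minimal sketch, not a drop-in solution: the paths (`/data/file1`, `/data/file2`, `/data/out`), the tab-separated record format, and the mapper/reducer logic are all hypothetical placeholders; only the `MultipleInputs.addInputPath` wiring is the point.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ThirdJobDriver {

    // Hypothetical mapper for file1: tag each value with its origin so the
    // reducer can tell the two sides apart. Assumes tab-separated records.
    public static class File1Mapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] f = value.toString().split("\t", 2);
            ctx.write(new Text(f[0]), new Text("F1\t" + (f.length > 1 ? f[1] : "")));
        }
    }

    // Hypothetical mapper for file2, same idea with a different tag.
    public static class File2Mapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] f = value.toString().split("\t", 2);
            ctx.write(new Text(f[0]), new Text("F2\t" + (f.length > 1 ? f[1] : "")));
        }
    }

    // The reducer sees file1 and file2 values grouped under the same key
    // and can do the actual computation; here it just concatenates them.
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context ctx)
                throws IOException, InterruptedException {
            StringBuilder sb = new StringBuilder();
            for (Text v : values) sb.append(v).append('|');
            ctx.write(key, new Text(sb.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "join-file1-file2");
        job.setJarByClass(ThirdJobDriver.class);

        // One mapper per input path; both emit the same key/value types.
        MultipleInputs.addInputPath(job, new Path("/data/file1"),
                TextInputFormat.class, File1Mapper.class);
        MultipleInputs.addInputPath(job, new Path("/data/file2"),
                TextInputFormat.class, File2Mapper.class);

        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path("/data/out"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Running it requires a Hadoop cluster (or local runner) and the job jar, so it is shown here for the wiring only.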
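The DistributedCache suggestion amounts to a map-side join: the small file (say file2) is loaded into an in-memory table in each mapper's `setup()`, and every record of the large file is joined against it by hash lookup. Stripped of Hadoop types, the join logic is just this (a sketch with made-up keys and values, no Hadoop dependency):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapSideJoinSketch {

    // In a real mapper this table would be filled in setup() from the
    // DistributedCache copy of file2; here it is built inline.
    static Map<String, String> loadSmallSide() {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("u1", "delhi");
        lookup.put("u2", "mumbai");
        return lookup;
    }

    // One map() call per record of the large side (file1): emit the joined
    // record when the key is present in the in-memory table, else drop it.
    static List<String> join(List<String[]> largeSide, Map<String, String> small) {
        List<String> out = new ArrayList<>();
        for (String[] rec : largeSide) {
            String city = small.get(rec[0]);
            if (city != null) {
                out.add(rec[0] + "\t" + rec[1] + "\t" + city);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> large = new ArrayList<>();
        large.add(new String[]{"u1", "42"});
        large.add(new String[]{"u3", "7"});   // no match in file2, dropped
        for (String line : join(large, loadSmallSide())) {
            System.out.println(line);         // prints: u1	42	delhi
        }
    }
}
```

This is why the advice is conditional on file size: the whole small side must fit in each mapper's heap, in exchange for which the join needs no shuffle at all.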
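For the shell-script alternative mentioned at the end, the availability check is a poll on `hdfs dfs -test -e`. A sketch, with hypothetical paths and jar name; it keys off the `_SUCCESS` marker that a completed MapReduce job writes to its output directory, so a half-written file1 or file2 is never picked up:

```shell
#!/bin/sh
# Block until both upstream jobs have finished writing to HDFS,
# then launch the third job. Paths and jar name are placeholders.
while ! hdfs dfs -test -e /data/file1/_SUCCESS \
   || ! hdfs dfs -test -e /data/file2/_SUCCESS; do
  sleep 60   # poll once a minute
done
hadoop jar third-job.jar ThirdJobDriver /data/out
```

Oozie's coordinator data dependencies and `JobControl`'s `addDependingJob` express the same ordering without hand-rolled polling, which is why they are the better choice once the pipeline grows.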