From: Daniel Schulz <danielschulz2005@hotmail.com>
To: "user@hadoop.apache.org"
Subject: RE: Job that just runs the reduce tasks
Date: Fri, 9 Oct 2015 14:32:27 +0200

Hi,

Yes, this is possible. Just configure the first MR job's output path as the second job's input path. Identity mappers will run -- as opposed to no mappers at all -- but they ship with Hadoop; they are just a technical necessity.

To avoid this overhead, Tez, Spark, Flink, and other execution engines were built: they let you express your whole algorithm as a DAG and execute it directly.

Kind regards, Daniel.

> To: user@hadoop.apache.org
> From: xeonmailinglist@gmail.com
> Subject: Job that just runs the reduce tasks
> Date: Fri, 9 Oct 2015 10:46:49 +0100
>
> Hi,
>
> If we run a job without reduce tasks, the map output is going to be
> saved into HDFS. Now, I would like to launch another job that reads the
> map output and computes the reduce phase. Is it possible to execute a job
> that reads the map output from HDFS and just runs the reduce phase?
>
> Thanks,
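To make this concrete, here is a minimal sketch of such a second-job driver. It uses Hadoop's default `Mapper` class, which is the identity mapper, so only your reducer does real work. The paths, the `SumReducer` (a word-count-style reducer), the `Text`/`IntWritable` types, and the assumption that the first job wrote its map output as SequenceFiles are all illustrative placeholders -- adapt them to your first job's actual output format and types.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ReduceOnlyJob {

    // Placeholder reducer: sums the values seen for each key.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                              Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reduce-only phase");
        job.setJarByClass(ReduceOnlyJob.class);

        // The base Mapper class passes every record through unchanged,
        // so the "map" stage is a no-op identity pass.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(SumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Assumes the first (map-only) job wrote SequenceFiles; use the
        // input format matching whatever your first job actually produced.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/user/you/job1-output"));
        FileOutputFormat.setOutputPath(job, new Path("/user/you/job2-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that the identity map pass still rereads, reshuffles, and resorts the data, which is exactly the overhead the DAG engines mentioned above avoid.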