From: Daniel Schulz <danielschulz2005@hotmail.com>
To: "user@hadoop.apache.org"
Subject: RE: Job that just runs the reduce tasks
Date: Fri, 9 Oct 2015 14:32:27 +0200

Hi,

Yes, this is possible. Just configure the first MR job's output path as the second job's input path. Identity mappers will run -- as opposed to no mappers at all -- but they ship with Hadoop; they are just a technical necessity.

To avoid this overhead, Tez, Spark, Flink, and other execution engines were built: they let you express your whole algorithm as a DAG and execute it directly.

Kind regards, Daniel.

> To: user@hadoop.apache.org
> From: xeonmailinglist@gmail.com
> Subject: Job that just runs the reduce tasks
> Date: Fri, 9 Oct 2015 10:46:49 +0100
>
> Hi,
>
> If we run a job without reduce tasks, the map output is going to be
> saved into HDFS. Now, I would like to launch another job that reads the
> map output and computes the reduce phase. Is it possible to execute a job
> that reads the map output from HDFS and just runs the reduce phase?
>
> Thanks,
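To make this concrete, here is a minimal sketch of such a second-job driver. It uses Hadoop's default `Mapper` class, which is the identity mapper, so only your reducer does real work. The paths, the `SumReducer` (a word-count-style reducer), the `Text`/`IntWritable` types, and the assumption that the first job wrote its map output as SequenceFiles are all illustrative placeholders -- adapt them to your first job's actual output format and types.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ReduceOnlyJob {

    // Placeholder reducer: sums the values seen for each key.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                              Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reduce-only phase");
        job.setJarByClass(ReduceOnlyJob.class);

        // The base Mapper class passes every record through unchanged,
        // so the "map" stage is a no-op identity pass.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(SumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Assumes the first (map-only) job wrote SequenceFiles; use the
        // input format matching whatever your first job actually produced.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        FileInputFormat.addInputPath(job, new Path("/user/you/job1-output"));
        FileOutputFormat.setOutputPath(job, new Path("/user/you/job2-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that the identity map pass still rereads, reshuffles, and resorts the data, which is exactly the overhead the DAG engines mentioned above avoid.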