From: Zoltán Tóth-Czifra
To: user@hadoop.apache.org
Subject: Complex MapReduce applications with the streaming API
Date: Tue, 27 Nov 2012 12:03:41 +0000

Hi everyone,

Thanks in advance for the support. My problem is the following:

I'm trying to develop a fairly complex MapReduce application using the streaming API (for demonstration purposes, so unfortunately the "use Java" answer doesn't work :-( ). I can get a single MapReduce phase running from the command line with no problem. The problem arises when I want to add more MapReduce phases that use each other's output, and maybe even do a recursion (feed a phase's output back into the same phase), conditioned on a counter.

The solution in Java MapReduce is trivial (i.e. creating multiple Job instances and monitoring counters), but with the streaming API it is not. What is the correct way to manage my application from its native code (Python, PHP, Perl...)? Calling shell commands from a "controller" script? How should I obtain counters?
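
To make the "controller script" idea concrete, here is a minimal Python sketch of such a driver. It is only an illustration under assumptions: the jar path, the function names and the iterN directory layout are all hypothetical, and read_counter is deliberately left as a placeholder, since how to obtain the counter value is exactly the open question.

```python
import subprocess

# Hypothetical location of the streaming jar; it differs between installations.
STREAMING_JAR = "/path/to/hadoop-streaming.jar"


def streaming_command(input_path, output_path, mapper, reducer):
    """Build the argument list for one streaming phase."""
    return [
        "hadoop", "jar", STREAMING_JAR,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


def run_phase(input_path, output_path, mapper, reducer):
    """Run one phase and wait for it; raises CalledProcessError on failure."""
    cmd = streaming_command(input_path, output_path, mapper, reducer)
    subprocess.run(cmd, check=True)


def iterate(input_path, work_dir, run_one, read_counter, max_iterations=10):
    """Feed each phase's output into the next until the counter reaches zero.

    run_one(input_path, output_path) runs one phase.
    read_counter(output_path) returns the counter value for that phase;
    how to implement it is left open here (placeholder supplied by the caller).
    Returns the path of the last output directory written.
    """
    current = input_path
    for i in range(1, max_iterations + 1):
        next_output = "%s/iter%d" % (work_dir, i)
        run_one(current, next_output)
        current = next_output
        if read_counter(next_output) == 0:
            break
    return current
```

The loop would be driven with something like iterate("/data/in", "/data/work", lambda i, o: run_phase(i, o, "map.py", "reduce.py"), read_counter), where the paths and script names are again made up for the example. The max_iterations cap is there so a counter that never reaches zero cannot loop forever.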

Using Oozie seems to be overkill for this application; besides, it doesn't support "loops", so the recursion can't really be implemented.

Thanks a lot!
Zoltan