Subject: Re: How to call Hadoop job from a web service in a non-blocking fashion?
From: Chris Nauroth
To: user
Date: Mon, 27 Apr 2015 16:42:19 +0000

Hello Marko,

Job#waitForCompletion is implemented as a polling loop around Job#isComplete and Job#isSuccessful. Both of those calls are non-blocking.

http://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/mapreduce/Job.html#isComplete()

http://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/mapreduce/Job.html#isSuccessful()

(Technically they do block on I/O due to RPC calls to check job status, but the idea is that they don't block waiting for the entire job to complete the way waitForCompletion does.)

Do you think you could build the solution that you need around these 2 methods?

I believe Oozie implements it in a similar way. Even though you aren't using Oozie, you might consider looking at the Oozie codebase for inspiration. I think the relevant class in Oozie would be JavaActionExecutor. Disclaimer: I really don't know Oozie very well. :-)

--Chris Nauroth

On 4/27/15, 5:35 AM, "Marko Dinic" wrote:

>Hello,
>
>I have a sequence of jobs that depend on each other, output of one job
>is input for the next one. Also, there is a loop in one part of the
>sequence, containing two jobs executing in a row.
>
>Until now I was able to run this job by simply creating Job objects and
>using waitForCompletion(true). In that way, I was forwarding output of
>one job as input to the next one.
>
>The problem is, waitForCompletion(true) will block the web service I'm
>trying to use, so I need a way to run this sequence of dependent jobs,
>but not to get stuck waiting for the result of the whole sequence. So, I
>want the following model - user uploads some files, starts the job and gets
>the response that the job has been started. After the sequence has
>finished, the user should be notified in some way.
>
>I wouldn't like to use Oozie, since the jobs are more low-level
>(it's actually an algorithm similar to those implemented in Mahout), and
>I don't know if I may use JobControl, since there is a loop, and how to
>do it.
>
>Any help would be highly appreciated.
>
>Regards,
>Marko
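
For illustration only, here is a rough sketch of how the polling idea from the reply could be moved off the web service's request thread. It is not taken from Hadoop or Oozie. PollableJob is a hypothetical interface that mirrors submit(), isComplete() and isSuccessful() on org.apache.hadoop.mapreduce.Job so that the sketch compiles without Hadoop on the classpath; JobSequenceRunner, runAsync and the poll interval are likewise made-up names and values. In a real service, a thin adapter around Job would presumably implement the interface.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the status calls on org.apache.hadoop.mapreduce.Job. */
interface PollableJob {
    void submit() throws Exception;          // starts the job and returns without waiting for it
    boolean isComplete() throws Exception;   // one status check, does not wait for the job
    boolean isSuccessful() throws Exception; // meaningful once isComplete() returned true
}

/** Hypothetical helper: runs dependent jobs one after another on a background thread. */
final class JobSequenceRunner {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final long pollMillis;

    JobSequenceRunner(long pollMillis) {
        this.pollMillis = pollMillis;
    }

    /**
     * Returns immediately. The future completes with true if every job
     * succeeded, or false as soon as one job fails (later jobs are not submitted).
     */
    CompletableFuture<Boolean> runAsync(List<PollableJob> jobs) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                for (PollableJob job : jobs) {
                    job.submit();
                    // Same idea as waitForCompletion, but on the worker thread.
                    while (!job.isComplete()) {
                        Thread.sleep(pollMillis);
                    }
                    if (!job.isSuccessful()) {
                        return false;
                    }
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the executor
                throw new CompletionException(e);
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, worker);
    }

    void shutdown() {
        worker.shutdown();
    }
}
```

The request handler would call runAsync(...), reply right away that the sequence has been started, and attach something like thenAccept(...) to the returned future to notify the user later. Because the driver is plain Java on a background thread, the loop over the two repeated jobs could be an ordinary while loop inside that task instead of a fixed list.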