Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <21303798.1173950109480.JavaMail.jira@brutus>
Date: Thu, 15 Mar 2007 02:15:09 -0700 (PDT)
From: "Tom White (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1111) Job completion notification to a
 job configured URL
In-Reply-To: <4859887.1173692049584.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12481086 ] 

Tom White commented on HADOOP-1111:
-----------------------------------

In HDFS, by design NameNodes never initiate connections. However, I'm not sure the same design principle applies to JobTracker. For a start, the MapReduce paper says (section 3.1, point 7)

  "When all map tasks and reduce tasks have been completed, the master wakes up the user program."

which implies the master initiates a connection. What do others think?

However, perhaps we should question the use of HTTP - would IPC and a callback interface not be both more consistent with the rest of MapReduce in Hadoop, and a nicer API for the user?
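
To make the callback idea concrete, here is a minimal sketch of what such an interface could look like on the client side. The names JobCompletionCallback and WaitingClient are hypothetical and are not part of Hadoop or of the attached patch; the IPC transport that would carry the call from the JobTracker to the client is left out entirely.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical callback a client would implement; not an existing Hadoop interface. */
interface JobCompletionCallback {
    void jobCompleted(String jobId, String jobStatus);
}

/** Client-side helper that blocks until the callback fires instead of polling. */
class WaitingClient implements JobCompletionCallback {
    private final CountDownLatch done = new CountDownLatch(1);
    private volatile String status;

    @Override
    public void jobCompleted(String jobId, String jobStatus) {
        status = jobStatus;   // remember the final status reported by the caller
        done.countDown();     // release any thread blocked in awaitStatus()
    }

    /** Blocks until jobCompleted() has been called, then returns the reported status. */
    String awaitStatus() throws InterruptedException {
        done.await();
        return status;
    }
}
```

With something like this, the user program would simply block in awaitStatus() until the master "wakes it up", which is close to the wording in the paper.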

> Job completion notification to a job configured URL
> ---------------------------------------------------
>
>                 Key: HADOOP-1111
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1111
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.12.0
>         Environment: all
>            Reporter: Alejandro Abdelnur
>         Attachments: patch-1111.txt, patch-1111.txt, patch-1111.txt
>
>
> Currently clients have to poll the JobTracker to find out whether a job has completed.
> When invoking Hadoop from other systems, it is desirable to have a notification mechanism for job completion.
> The notification approach simplifies the client waiting for completion and removes load from the JobTracker as polling can be avoided. 
> Proposed solution:
> When the JobTracker processes the completion of a job (success or failure), if the job configuration has a jobEnd.notificationUrl property, it will make an HTTP GET request to the specified URL.
> The jobEnd.notificationUrl property may include two variables, '${jobId}' and '${jobStatus}'. If they are present, they will be replaced with the job ID and status of the job before the URL is invoked.
> Two additional properties, 'jobEnd.retries' and 'jobEnd.retryInterval', will indicate retry behavior.
> To avoid delaying JobTracker processing while notifications are sent, a producer-consumer queue will be used to queue up job notifications upon completion.
> A daemon thread will consume job notifications from the above queue and will make the URL invocation.
> On notification failure, the job notification is queued again on the notification queue.
> The queue will be a java.util.concurrent.DelayQueue. This will make job notifications (on retries) available on the consumer side only when the retry interval has elapsed.
> The changes will be done in the JobTracker and in the LocalJobRunner.
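
The mechanism described in the issue can be sketched roughly as follows. This is an illustration only and is not taken from the attached patch: the class names JobEndNotification and JobEndNotifier and the UrlInvoker stand-in for the HTTP GET are hypothetical, and treating 'jobEnd.retries' as "retries in addition to the first attempt" is an assumption.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** One pending notification (hypothetical class, not from the patch). */
class JobEndNotification implements Delayed {
    final String url;
    final long retryIntervalMs;
    int attemptsLeft;
    private long readyAtMs;

    JobEndNotification(String urlTemplate, String jobId, String jobStatus,
                       int retries, long retryIntervalMs) {
        this.url = expand(urlTemplate, jobId, jobStatus);
        this.attemptsLeft = retries + 1;              // assumption: first attempt plus retries
        this.retryIntervalMs = retryIntervalMs;
        this.readyAtMs = System.currentTimeMillis();  // first attempt is due immediately
    }

    /** Substitutes the literal ${jobId} and ${jobStatus} markers, if present. */
    static String expand(String template, String jobId, String jobStatus) {
        return template.replace("${jobId}", jobId).replace("${jobStatus}", jobStatus);
    }

    /** Pushes the ready time forward by one retry interval. */
    void scheduleRetry() {
        readyAtMs = System.currentTimeMillis() + retryIntervalMs;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(readyAtMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            other.getDelay(TimeUnit.MILLISECONDS));
    }
}

/** Consumer side: takes due notifications and re-queues them on failure. */
class JobEndNotifier {
    /** Stand-in for the HTTP GET so the sketch runs without a network. */
    interface UrlInvoker { boolean invoke(String url); }

    private final DelayQueue<JobEndNotification> queue = new DelayQueue<>();
    private final UrlInvoker invoker;

    JobEndNotifier(UrlInvoker invoker) { this.invoker = invoker; }

    /** Producer side: called when a job completes; does not block. */
    void register(JobEndNotification n) { queue.put(n); }

    int pending() { return queue.size(); }

    /** Blocks until a notification is due, invokes its URL, re-queues it on failure. */
    void processOne() throws InterruptedException {
        JobEndNotification n = queue.take();
        n.attemptsLeft--;
        if (!invoker.invoke(n.url) && n.attemptsLeft > 0) {
            n.scheduleRetry();
            queue.put(n);
        }
    }

    /** Starts the daemon thread that drains the queue until interrupted. */
    Thread startDaemon() {
        Thread t = new Thread(() -> {
            try {
                while (true) processOne();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "job-end-notifier");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Because DelayQueue.take() only returns elements whose delay has expired, a failed notification that is put back on the queue is not seen by the consumer again until its retry interval has passed, and the thread that registers the notification never waits on the URL invocation.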
