hadoop-common-dev mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1111) Job completion notification to a job configured URL
Date Wed, 14 Mar 2007 01:37:09 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated HADOOP-1111:

    Attachment: patch-1111.txt

Answering David's comments:

The rationale for using DelayQueue is that an element becomes available to the consumer
not when it is added to the queue but when its delay time is over. For the consumer thread this
means that it does not have to peek at the head element of the queue to see if it is time
to process the element (the notification); if an element is available, it is time to process
it.
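
To illustrate the point above, here is a minimal sketch of how java.util.concurrent.DelayQueue
hands out elements only once their delay has expired. The class names, URLs and delays are
made up for the example and are not taken from the patch.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical queue element; the real patch may look different.
class DelayedNotification implements Delayed {
    private final String url;
    private final long readyAtMillis;

    DelayedNotification(String url, long delayMillis) {
        this.url = url;
        this.readyAtMillis = System.currentTimeMillis() + delayMillis;
    }

    String getUrl() {
        return url;
    }

    // Remaining delay; zero or negative means the element may be taken.
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(readyAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

class DelayQueueSketch {
    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedNotification> queue = new DelayQueue<>();
        queue.put(new DelayedNotification("http://example.invalid/later", 200));
        queue.put(new DelayedNotification("http://example.invalid/now", 0));

        // take() blocks until some element's delay has expired, so the consumer
        // never has to inspect the head and decide whether it is due.
        System.out.println(queue.take().getUrl());
        System.out.println(queue.take().getUrl());
    }
}
```

The consumer loop only calls take(); the "is it time yet?" check lives in getDelay().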

On using GET instead of POST, the idea was to make this notification as lightweight as
possible for both the JobTracker and the receiver of the notification. It is just a ping;
if the receiver is interested in more data about the finished job, it can use the JobClient to
gather detailed info about the run.

On the notification thread ending, if running is set to false (shutdown condition) then the
'Thread has ended unexpectedly' message is not printed. Otherwise it is.

On making 'running' volatile, yes, I missed that one, will do (change made in the new patch).
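
As a rough sketch of the two points above (the shutdown check and the volatile flag), assuming
a static start/stop pair and a plain blocking queue of URLs. All names here are hypothetical,
and the 'endedUnexpectedly' field exists only so the behavior can be observed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class NotifierThreadSketch {
    // volatile so that the write in stop() is visible to the consumer thread.
    private static volatile boolean running;
    static volatile boolean endedUnexpectedly;
    private static Thread thread;
    private static final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    static void start() {
        running = true;
        endedUnexpectedly = false;
        thread = new Thread(() -> {
            try {
                while (running) {
                    String url = queue.take();
                    System.out.println("notify " + url);
                }
            } catch (InterruptedException e) {
                // Interrupted while waiting; fall through to the check below.
            } finally {
                // Only complain if nobody asked the thread to stop.
                if (running) {
                    endedUnexpectedly = true;
                    System.err.println("Thread has ended unexpectedly");
                }
            }
        });
        thread.setDaemon(true);
        thread.start();
    }

    static void stop() {
        running = false;
        thread.interrupt();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        start();
        stop();
        System.out.println("ended unexpectedly: " + endedUnexpectedly);
    }
}
```

Without volatile, the consumer thread would not be guaranteed to see the update made by stop().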

On using a singleton instead of static members, yes, I agree with you. I was just following the
style of JobTracker (the start and stop methods are static). Plus it was simpler, with fewer
changes to the JobTracker (no instance variable, no getter method to expose it, etc). I think we
should consider (not now) adding hooks in key sections of the JobTracker (job start, job end, job
kill, job task start, job task end, etc) to enable adding this kind of logic without having
to modify the JobTracker and other classes.

Answering Tom's comments:

On retries for job notification, at first I thought about doing it without retries. When
Ruchir came up with the implementation using DelayQueue, I thought it was simple enough
and that it would provide significant value (notification robustness) if desired (the default
number of retries is 0).

On using the HttpClient retry mechanism: like the retry policies, it is meant to be synchronous.
In this case, while the notification retries would not tie up the JobTracker, as they run
in a side thread, they would tie up other notifications. To avoid that we introduced the
DelayQueue.
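
A sketch of the idea, under assumptions: a failed notification is put back on the DelayQueue
with a new ready time instead of being retried in place, so other notifications are not held
up. The method name configureForRetry comes from this thread, but its body, the class names
and the 'sender' callback are guesses made for illustration.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical element that knows how many retries it has left.
class RetryingNotification implements Delayed {
    final String url;
    private int retriesLeft;
    private final long retryIntervalMillis;
    private long readyAtMillis;

    RetryingNotification(String url, int retries, long retryIntervalMillis) {
        this.url = url;
        this.retriesLeft = retries;
        this.retryIntervalMillis = retryIntervalMillis;
        this.readyAtMillis = System.currentTimeMillis(); // first attempt is immediate
    }

    // Returns true if another attempt is allowed and schedules it.
    boolean configureForRetry() {
        if (retriesLeft <= 0) {
            return false;
        }
        retriesLeft--;
        readyAtMillis = System.currentTimeMillis() + retryIntervalMillis;
        return true;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(readyAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }
}

class RetryQueueSketch {
    // One consumer step. Returns false if no notification is due right now.
    // 'sender' stands in for the HTTP call and returns true on success.
    static boolean processOne(DelayQueue<RetryingNotification> queue, Predicate<String> sender) {
        RetryingNotification n = queue.poll();
        if (n == null) {
            return false;
        }
        if (!sender.test(n.url) && n.configureForRetry()) {
            queue.put(n); // becomes visible again only after the retry interval
        }
        return true;
    }

    public static void main(String[] args) {
        DelayQueue<RetryingNotification> queue = new DelayQueue<>();
        queue.put(new RetryingNotification("http://example.invalid/down", 1, 60000));
        queue.put(new RetryingNotification("http://example.invalid/up", 0, 60000));
        Predicate<String> sender = url -> url.endsWith("/up");
        while (processOne(queue, sender)) {
            // keep going while something is due
        }
        System.out.println("waiting for retry: " + queue.size());
    }
}
```

The failing notification waits in the queue for its interval while the consumer stays free
to deliver the others.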

On the local runner notification, we could use the same mechanism, but then we would have
to put logic in place to wait for the notifications to be delivered before the local runner
ends; I thought this would be simpler. The other possibility is not to have notifications
when using the local runner (as it is local, the executor is in process). IMO, the benefit of
having it in the local runner is that it enables (simplifies) integration testing.

On the infinite loop in the local runner notification, yes, thanks for the catch; the loop
condition should be 'while (notification.configureForRetry())' (change made in the latest patch).

> Job completion notification to a job configured URL
> ---------------------------------------------------
>                 Key: HADOOP-1111
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1111
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.12.0
>         Environment: all
>            Reporter: Alejandro Abdelnur
>         Attachments: patch-1111.txt, patch-1111.txt, patch-1111.txt
> Currently clients have to poll the JobTracker to find out whether a job has completed.
> When invoking Hadoop from other systems it is desirable to have a notification mechanism
> on job completion.
> The notification approach simplifies the client waiting for completion and removes load
> from the JobTracker, as polling can be avoided.
> Proposed solution:
> When the JobTracker processes the completion of a job (success or failure), if the job
> configuration has a jobEnd.notificationUrl property it will make an HTTP GET request to the
> specified URL.
> The jobEnd.notificationUrl property may include 2 variables, '${jobId}' and '${jobStatus}'.
> If they are present, they will be replaced with the job ID and status of the job and the
> URL will be invoked.
> Two additional properties, 'jobEnd.retries' and 'jobEnd.retryInterval', will indicate
> retry behavior.
> So as not to delay the JobTracker processing while doing notifications, a ConsumerProducer
> Queue will be used to queue up job notifications upon completion.
> A daemon thread will consume job notifications from the above Queue and will make the
> URL invocation.
> On notification failure, the job notification is queued up again on the notification queue.
> The queue will be a java.util.concurrent.DelayQueue. This will make job notifications
> (on retries) available on the consumer side only when the retry time is up.
> The changes will be done in the JobTracker and in the LocalJobRunner.
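
As a small illustration of the '${jobId}' and '${jobStatus}' substitution described in the
issue, here is a sketch using plain string replacement. The helper name, the example URL and
the job ID and status values are invented for the example, and URL-encoding of the values is
not handled.

```java
class NotificationUrlSketch {
    // Replaces the two documented placeholders. String.replace is a literal
    // (non-regex) replacement, so '$', '{' and '}' need no escaping.
    static String expand(String template, String jobId, String jobStatus) {
        return template.replace("${jobId}", jobId).replace("${jobStatus}", jobStatus);
    }

    public static void main(String[] args) {
        // Hypothetical value for the jobEnd.notificationUrl property.
        String template = "http://example.invalid/notify?id=${jobId}&status=${jobStatus}";
        System.out.println(expand(template, "job_0001", "SUCCEEDED"));
    }
}
```

A template without either variable passes through unchanged, which matches the "may include"
wording of the description.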

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
