Who can tell me how to unsubscribe from this mailing list?
On Mon, Dec 24, 2012 at 12:08 AM, Marcin Mejran <firstname.lastname@example.org> wrote:
Yeah, Oozie sounds like the best approach. I think “timeout” in Oozie refers to something different (stopping a coordinator action if it hasn’t started within X minutes), but the SLA mechanism should do what’s asked for.
Also, I think that Oozie allows for timeouts in job submission. That might answer your need.
On Sat, Dec 22, 2012 at 2:08 PM, Ted Dunning <email@example.com> wrote:
You can write a script to parse the Hadoop job list and send an alert.
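A minimal sketch of such a script, in Python. It assumes the Hadoop 1.x layout of "hadoop job -list", where each job row is whitespace-separated and carries the job ID in the first column and the start time (epoch milliseconds) in the third; check this against the output of your own version. The function names, the 5-hour threshold, and the alert (a line on stderr) are all hypothetical placeholders.

```python
import subprocess
import sys
import time


def find_long_running(job_list_output, max_hours, now_ms=None):
    """Return (job_id, hours_running) for every job older than max_hours.

    Assumes rows look like:
    job_201212221230_0001  1  1356208200000  user  NORMAL  NA
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    overdue = []
    for line in job_list_output.splitlines():
        fields = line.split()
        # Skip the summary and header lines; job rows start with "job_".
        if len(fields) < 3 or not fields[0].startswith("job_"):
            continue
        try:
            start_ms = int(fields[2])
        except ValueError:
            continue  # unexpected column layout, ignore this row
        hours = (now_ms - start_ms) / 3600000.0
        if hours > max_hours:
            overdue.append((fields[0], hours))
    return overdue


def check_cluster(max_hours=5):
    """Run 'hadoop job -list' and report long-running jobs on stderr."""
    raw = subprocess.check_output(["hadoop", "job", "-list"])
    output = raw.decode("utf-8", "replace")
    for job_id, hours in find_long_running(output, max_hours):
        # Placeholder alert: swap in mail, a pager, or whatever you use.
        sys.stderr.write("ALERT: %s running for %.1f hours\n" % (job_id, hours))
```

Calling check_cluster() from cron every few minutes would be one way to use it. Counting failures would need a different source, since the list output only shows running jobs.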
The trick of putting a retry into your workflow system is a nice one. If your program won't allow multiple copies to run at the same time and you re-invoke it every hour, say, then 5 refused retries imply that the previous invocation has been running for about 5 hours.
On Sat, Dec 22, 2012 at 12:49 PM, Mohit Anchlia <firstname.lastname@example.org> wrote:
On Sat, Dec 22, 2012 at 12:44 PM, Mohammad Tariq <email@example.com> wrote:
MR web UI? Although we can't trigger anything from it, it provides all the info related to the jobs. I mean it would be easier to just go there and have a look at everything rather than opening the shell and typing the command.
I'm a bit lazy ;)
On Sun, Dec 23, 2012 at 2:09 AM, Mohit Anchlia <firstname.lastname@example.org> wrote:
The best I can find so far is hadoop job -list.
On Sat, Dec 22, 2012 at 12:30 PM, Mohit Anchlia <email@example.com> wrote:
What's the best way to trigger an alert when jobs run for too long or have many failures? Is there a hadoop command that can be used to do this?