incubator-oozie-users mailing list archives

From Mohammad Islam <misla...@yahoo.com>
Subject Re: Issues in oozie 3.0.0
Date Mon, 17 Oct 2011 06:47:54 GMT
Hi Giri,
Sorry for the inconvenience.

Hadoop should throw an exception and, as a result, Oozie should put the job in an error state.

Could you please share the part of your log file written just after the job submission? The log
you posted looks a little odd. Does the system have a lot of active jobs?

You could use http://pastebin.com/ to share a large log file.

Regards,
Mohammad





________________________________
From: Giri Prasad Reddy <d.giriprasad@gmail.com>
To: oozie-users@incubator.apache.org
Sent: Sunday, October 16, 2011 11:32 PM
Subject: Issues in oozie 3.0.0

Hi,

I am using Oozie 3.0.0. While running the map-reduce example, I mistakenly
gave an incorrect job tracker IP address in
'examples/apps/map-reduce/job.properties'.
When I ran the job, it started successfully (info says 'running').
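
A typo like this can be caught before submission with a quick reachability check. The following is a minimal sketch and not part of Oozie: the helper names are hypothetical, and it assumes the job tracker address is stored under the `jobTracker` key as `host:port`. It only confirms that something accepts TCP connections at that address, not that it is actually a job tracker.

```python
import socket


def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def endpoint_reachable(host_port, timeout=3.0):
    """Return True if a TCP connection to 'host:port' succeeds within timeout."""
    host, _, port = host_port.rpartition(":")
    if not host or not port.isdigit():
        return False  # not in host:port form
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, OverflowError, ValueError):
        # Covers refused connections, timeouts, DNS failures, bad port numbers.
        return False


# Usage (path taken from the message above; 'jobTracker' key is an assumption):
#   with open("examples/apps/map-reduce/job.properties") as f:
#       props = read_properties(f.read())
#   print(endpoint_reachable(props.get("jobTracker", "")))
```

If the check returns False, fixing the address before running the job avoids the stuck 'running' state described here.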

But when I look at oozie.log, it has the following messages:
----
2011-10-13 02:25:40,528 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
2011-10-13 02:25:40,528 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
2011-10-13 02:25:45,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
2011-10-13 02:25:45,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
----

And there are thousands of messages like the following:

2011-10-13 02:26:35,651 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
2011-10-13 02:26:35,651 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
2011-10-13 02:26:40,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@36fb2f8], timed out [5,000]ms, and requeue itself [action.start]
2011-10-13 02:26:40,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
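
With thousands of such entries, a per-command count is easier to share than the raw log. The sketch below is a hypothetical helper, not an Oozie tool. It assumes the message text matches the excerpts above and counts lock timeouts per command instance. It strips newlines first, so it also works on log text that has been hard-wrapped by a mail client.

```python
import re
from collections import Counter

# Matches the lock-timeout message format seen in the excerpts above.
LOCK_MSG = re.compile(
    r"Could not get lock \[(?P<cmd>[^\]]+)\], timed out \[(?P<ms>[\d,]+)\]ms"
)


def count_lock_timeouts(log_text):
    """Count 'Could not get lock' messages per command instance."""
    # Dropping newlines rejoins tokens split by fixed-width wrapping;
    # entries run together afterwards, which the pattern tolerates.
    joined = log_text.replace("\r", "").replace("\n", "")
    return Counter(m.group("cmd") for m in LOCK_MSG.finditer(joined))


# Usage (file name taken from the message above):
#   with open("oozie.log") as f:
#       for cmd, n in count_lock_timeouts(f.read()).most_common():
#           print(n, cmd)
```

A single command instance with a very high count suggests one action is being requeued repeatedly rather than many actions contending for locks.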

And I am not able to kill the job either.
I looked at the code a little; if the jobtracker is not valid, then
'ActionStartXCommand' should receive an exception. But it is not
getting one.

Is this a known issue?

thanks,
giri