incubator-oozie-users mailing list archives

From Mohammad Islam <misla...@yahoo.com>
Subject Re: Issues in oozie 3.0.0
Date Tue, 18 Oct 2011 20:59:17 GMT
Hi Giri,
This user list is the right place.
I replied to your email the same day, asking for some more information.
I am pasting the reply here again.
Regards,
Mohammad

Hi Giri,
Sorry for the inconvenience.

Hadoop should throw an exception and, as a result, Oozie should put the job in an error state.

Could you please share the part of your log file written just after the job submission? The log
you posted looks a little weird. Does the system have a lot of active jobs?

You could use http://pastebin.com/ to exchange the large log file.

Regards,
Mohammad
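
A minimal sketch for getting a quick overview of a large oozie.log before sharing it: it counts the "Could not get lock" messages per command instance, which gives a rough idea of how many distinct commands are stuck. The function name `summarize_lock_timeouts` is hypothetical, and the regular expression is an assumption based only on the message format in the log excerpt quoted below. It expects each log entry on a single line, as in the log file itself, not hard-wrapped and prefixed with `>` as in an email quote.

```python
import re
from collections import Counter

# Assumed format, taken from the quoted excerpt:
# "<date> <time>,<ms> DEBUG ... Could not get lock [<class>@<hex id>], timed out ..."
LOCK_TIMEOUT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*"
    r"Could not get lock \[(?P<cmd>[\w.$]+@[0-9a-f]+)\]"
)


def summarize_lock_timeouts(lines):
    """Return {command instance: (count, first timestamp, last timestamp)}."""
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for line in lines:
        match = LOCK_TIMEOUT.search(line)
        if match is None:
            continue  # not a lock-timeout message (e.g. a "Queuing" line)
        cmd, ts = match.group("cmd"), match.group("ts")
        counts[cmd] += 1
        first_seen.setdefault(cmd, ts)  # keep only the earliest timestamp
        last_seen[cmd] = ts             # overwritten until the last occurrence
    return {cmd: (n, first_seen[cmd], last_seen[cmd]) for cmd, n in counts.items()}


def print_summary(path):
    """Print one summary line per command instance found in the file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for cmd, (n, first, last) in summarize_lock_timeouts(handle).items():
            print(f"{cmd}: {n} lock timeouts between {first} and {last}")
```

Calling `print_summary("oozie.log")` on a local copy of the log prints one line per command instance. Many distinct instances would suggest several actions are contending for the lock, while a single instance repeating would point to one command being requeued over and over.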





________________________________
From: Giri Prasad Reddy <d.giriprasad@gmail.com>
To: oozie-users@incubator.apache.org
Sent: Tuesday, October 18, 2011 10:11 AM
Subject: Re: Issues in oozie 3.0.0

Is this the right alias for this question?
Do I need to send it to oozie-dev?

Can someone who can debug Oozie help me with this?

thanks,
giri

On Mon, Oct 17, 2011 at 12:02 PM, Giri Prasad Reddy
<d.giriprasad@gmail.com> wrote:
> Hi,
>
> I am using Oozie 3.0.0. While running the map-reduce example, I mistakenly
> gave an incorrect job tracker IP address in
> 'examples/apps/map-reduce/job.properties'.
> When I ran the job, it started successfully (info says 'running').
>
> But when I look at oozie.log, it has the following messages:
> ----
> 2011-10-13 02:25:40,528 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
> 2011-10-13 02:25:40,528 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
> 2011-10-13 02:25:45,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
> 2011-10-13 02:25:45,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
> ----
>
> And there are thousands of the following messages:
>
> 2011-10-13 02:26:35,651 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@5513dd59], timed out [5,000]ms, and requeue itself [action.start]
> 2011-10-13 02:26:35,651 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
> 2011-10-13 02:26:40,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Could not get lock [org.apache.oozie.command.wf.ActionStartXCommand@36fb2f8], timed out [5,000]ms, and requeue itself [action.start]
> 2011-10-13 02:26:40,539 DEBUG ActionStartXCommand:531 - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Queuing [1] commands with delay [10]ms
>
> I am also unable to kill the job.
> I looked at the code a little bit: if the jobtracker is not valid, then
> 'ActionStartXCommand' should receive an exception. But it is not
> getting one.
>
> Is this a known issue?
>
> thanks,
> giri
>
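
The misconfiguration described above (a wrong job tracker address in job.properties) can be caught before submission with a simple pre-flight check. The following is a minimal sketch, not part of Oozie: the helper names are hypothetical, and it assumes the address is stored under a `jobTracker` key in `host:port` form, as in the example job.properties files. A successful TCP connection only shows that something is listening at that address, not that it is actually a JobTracker.

```python
import socket


def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def split_host_port(address):
    """Split 'host:port' into (host, int(port)); raise ValueError if malformed."""
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_job_tracker(properties_path, key="jobTracker"):
    """Read the properties file and report whether the configured address accepts connections."""
    with open(properties_path, encoding="utf-8") as handle:
        props = read_properties(handle.read())
    host, port = split_host_port(props[key])
    return is_reachable(host, port)
```

For example, `check_job_tracker("examples/apps/map-reduce/job.properties")` returning `False` would be a reason to fix the address before running the job.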