hadoop-mapreduce-issues mailing list archives

From "Attila Pados (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
Date Tue, 17 Dec 2013 10:08:07 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Attila Pados updated MAPREDUCE-5655:
------------------------------------

    Description: 
I was trying to run a Java class from my client, a Windows 7 developer environment, which submits
a job to a remote Hadoop cluster, runs a MapReduce job there, and then downloads the results
back to the local machine.

The general use case is using Hadoop services from a web application installed on a non-cluster
machine, or as part of a developer environment.

The problem was that the ApplicationMaster's startup shell script (launch_container.sh) was
generated with a wrong CLASSPATH entry. Both this entry and the java process call at the bottom of
the file were generated in Windows style, using % as the shell variable marker
and ; as the CLASSPATH delimiter.

I tracked down the root cause and found that the MRApps.java and YARNRunner.java classes
create these entries, which are then passed forward to the ApplicationMaster, on the assumption
that the OS running these classes matches the one running the ApplicationMaster. That is not the
case: they run in two different JVMs, possibly on different OSes, yet the strings are generated
based on the client/submitter side's OS.

I made some workaround changes to these two files so I could launch my job; however, there may
be more problems ahead.
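The root cause can be sketched in plain Java (illustrative only: joinClasspath is a hypothetical stand-in for the classpath assembly in MRApps/YARNRunner, not the actual Hadoop code). File.pathSeparator reflects the submitting JVM's OS, so a Windows client emits ;-delimited entries that a Linux NodeManager cannot parse:

```java
import java.io.File;

public class ClasspathSeparatorDemo {
    // Hypothetical helper standing in for the classpath assembly done in
    // MRApps/YARNRunner; the real code is considerably more involved.
    static String joinClasspath(String separator, String... entries) {
        return String.join(separator, entries);
    }

    public static void main(String[] args) {
        String[] entries = { "$HADOOP_CONF_DIR", "$HADOOP_COMMON_HOME/share/hadoop/common/*" };

        // Buggy pattern: the separator comes from the *client* JVM's OS.
        // On a Windows client, File.pathSeparator is ";" -- wrong for a Linux cluster.
        String clientOsClasspath = joinClasspath(File.pathSeparator, entries);

        // Workaround pattern: choose the separator for the *remote* OS instead.
        String remoteOs = "Linux"; // e.g. taken from configuration
        String remoteOsClasspath =
                joinClasspath(remoteOs.equals("Linux") ? ":" : ";", entries);

        System.out.println(clientOsClasspath);
        System.out.println(remoteOsClasspath);
    }
}
```

The attached patches presumably move from the first pattern to the second: the delimiter and variable syntax are chosen for the target cluster's OS rather than the submitter's.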

Update:
 error message:
13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with state FAILED
due to: Application application_1386170530016_0001 failed 2 times due to AM Container for
appattempt_1386170530016_0001_000002 exited with  exitCode: 1 due to: Exception from container-launch:

org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
	at org.apache.hadoop.util.Shell.run(Shell.java:379)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)

Update 2:
 It also requires adding the following property to
 mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job launcher knows
that the job runner will be Linux:

  <property>
    <name>mapred.remote.os</name>
    <value>Linux</value>
    <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
  </property>
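A minimal sketch of how the submitter side could act on such a property. Note the assumptions: mapred.remote.os is the property proposed in this issue, not part of stock Hadoop 2.2.0, and java.util.Properties stands in here for Hadoop's Configuration class:

```java
import java.util.Properties;

public class RemoteOsDemo {
    // Classpath delimiter appropriate for the *remote* OS,
    // falling back to the local OS when the property is unset.
    static String classpathSeparator(Properties conf) {
        return remoteOs(conf).startsWith("Windows") ? ";" : ":";
    }

    // Windows cmd expands %VAR%; POSIX shells expand $VAR.
    static String shellVar(Properties conf, String name) {
        return remoteOs(conf).startsWith("Windows") ? "%" + name + "%" : "$" + name;
    }

    private static String remoteOs(Properties conf) {
        // "mapred.remote.os" is the property name proposed in this issue.
        return conf.getProperty("mapred.remote.os", System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.remote.os", "Linux");
        System.out.println(shellVar(conf, "HADOOP_CONF_DIR")
                + " entries joined with '" + classpathSeparator(conf) + "'");
        // prints: $HADOOP_CONF_DIR entries joined with ':'
    }
}
```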



> Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath
> -------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5655
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, job submission
>    Affects Versions: 2.2.0
>         Environment: Client machine is a Windows 7 box with Eclipse
> Remote: a multi-node Hadoop cluster installed on Ubuntu boxes (any Linux)
>            Reporter: Attila Pados
>         Attachments: MRApps.patch, YARNRunner.patch
>



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
