hadoop-hdfs-user mailing list archives

From Kim Chew <kchew...@gmail.com>
Subject Submit a Hadoop 1.1.1 job remotely to a Hadoop 2 cluster
Date Wed, 16 Apr 2014 21:27:43 GMT
I have a cluster running Hadoop 2, but it is not running YARN, i.e.
"mapreduce.framework.name" is set to "classic", so the ResourceManager is
not running.

On the client side, I want to submit a job compiled with Hadoop-1.1.1 to
the above cluster. Here is what my Hadoop-1.1.1 mapred-site.xml looks like:

<property>
    <!-- Pointed to the remote JobTracker -->
    <name>mapred.job.tracker</name>
    <value>172.31.3.150:8021</value>
</property>

Not surprisingly, I got a version mismatch when I submitted my job using
the Hadoop-1.1.1 jars:

org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1107)

So I recompiled my job with Hadoop 2 and submitted it using the Hadoop 2
jars. Here is what my Hadoop 2 mapred-site.xml looks like:

<property>
    <!-- Pointed to the remote JobTracker -->
    <name>mapreduce.job.tracker.address</name>
    <value>172.31.3.150:8021</value>
</property>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

Note that I have to set "mapreduce.framework.name" to "yarn"; otherwise the
job runs locally instead of on the target cluster. But my target cluster is
not running YARN, as stated above, so the client cannot reach a
ResourceManager:

14/04/16 13:35:47 INFO client.RMProxy: Connecting to ResourceManager at /172.31.3.150:8032
14/04/16 13:35:49 INFO ipc.Client: Retrying connect to server: hadoop-host1.eng.narus.com/172.31.3.150:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

(Yes, I have set "yarn.resourcemanager.hostname" to "172.31.3.150" in
yarn-site.xml on my client.)

So it seems to me that it does not matter whether I recompile my job with
Hadoop 2 or not. The question is: what should I do to be able to submit my
job remotely to the Hadoop 2 cluster? What configuration do I need to set
on the client side?

The only solution I can think of is to enable YARN on the Hadoop 2 cluster,
but is that necessary?
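
For reference, the alternative to enabling YARN would be a Hadoop 2 client
configured to match the cluster's "classic" framework. A minimal, untested
sketch follows. It assumes the client-side Hadoop 2 distribution still ships
a JobTracker (MR1) submission client, which may not be true of every build,
and that "mapreduce.jobtracker.address" is the Hadoop 2 key corresponding to
"mapred.job.tracker". The address is the one used above.

```xml
<!-- Sketch only: client-side mapred-site.xml mirroring the cluster's
     "classic" setting. Assumes the client jars include a JobTracker
     (MR1) client; not verified against this cluster. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>classic</value>
    </property>
    <property>
        <!-- Assumed Hadoop 2 name for mapred.job.tracker -->
        <name>mapreduce.jobtracker.address</name>
        <value>172.31.3.150:8021</value>
    </property>
</configuration>
```

If the client cannot initialize a connection with this setting, that would
suggest its jars lack a JobTracker client, in which case client jars matching
the cluster's own build would likely be needed.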

I am running out of pointers and stuck 8-(

TIA

Kim
