From: Francis.Hu <francis.hu@reachjunction.com>
To: user@hadoop.apache.org
Subject: cannot submit a job via java client in hadoop-2.0.5-alpha
Date: Wed, 10 Jul 2013 16:33:46 +0800

Hi, All

I have a hadoop-2.0.5-alpha cluster with 3 data nodes. The ResourceManager and all data nodes are started, and I can access the ResourceManager's web UI.

I wrote a Java client to submit a job (the TestJob class below), but the job is never submitted successfully; it throws an exception every time.

My configurations are attached. Can anyone help me? Thanks.
---------my-java client

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class TestJob {

    public void execute() {
        // Load the cluster configuration shipped with the client.
        Configuration conf1 = new Configuration();
        conf1.addResource("resources/core-site.xml");
        conf1.addResource("resources/hdfs-site.xml");
        conf1.addResource("resources/yarn-site.xml");
        conf1.addResource("resources/mapred-site.xml");

        JobConf conf = new JobConf(conf1);
        conf.setJar("/home/francis/hadoop-jobs/MapReduceJob.jar");
        conf.setJobName("Test");

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // DisplayRequestMapper and DisplayRequestReducer are my own classes (not shown).
        conf.setMapperClass(DisplayRequestMapper.class);
        conf.setReducerClass(DisplayRequestReducer.class);

        FileInputFormat.setInputPaths(conf, new Path("/home/francis/hadoop-jobs/2013070907.FNODE.2.txt"));
        FileOutputFormat.setOutputPath(conf, new Path("/home/francis/hadoop-jobs/result/"));

        try {
            JobClient client = new JobClient(conf);
            RunningJob job = client.submitJob(conf);
            job.waitForCompletion();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

----------Exception

jvm 1    | java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
jvm 1    |      at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:119)
jvm 1    |      at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:81)
jvm 1    |      at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:74)
jvm 1    |      at org.apache.hadoop.mapred.JobClient.init(JobClient.java:482)
jvm 1    |      at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:461)
jvm 1    |      at com.rh.elastic.hadoop.job.TestJob.execute(TestJob.java:59)
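For reference, a minimal check of what actually got loaded (ConfCheck is just a throwaway name): Configuration.addResource(String) resolves the name against the classpath, and as far as I can tell a resource that is not found there is skipped without an error, so printing the effective values shows whether the four XML files were really picked up:

import org.apache.hadoop.conf.Configuration;

// Throwaway check: print the effective values after loading the resources.
public class ConfCheck {
    public static void main(String[] args) {
        Configuration conf1 = new Configuration();
        conf1.addResource("resources/core-site.xml");
        conf1.addResource("resources/hdfs-site.xml");
        conf1.addResource("resources/yarn-site.xml");
        conf1.addResource("resources/mapred-site.xml");
        // If these still print defaults (e.g. "local" and "file:///"),
        // the files were not found on the classpath.
        System.out.println("mapreduce.framework.name = " + conf1.get("mapreduce.framework.name"));
        System.out.println("yarn.resourcemanager.address = " + conf1.get("yarn.resourcemanager.address"));
        System.out.println("fs.defaultFS = " + conf1.get("fs.defaultFS"));
    }
}

If the files did load, mapreduce.framework.name should print as yarn.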

Thanks,
Francis.Hu
----------Attachment: yarn-site.xml

yarn.acl.enable = false
    Enable ACLs? Defaults to false.
yarn.resourcemanager.address = 192.168.219.129:9001
    ResourceManager host:port for clients to submit jobs.
yarn.resourcemanager.scheduler.address = 192.168.219.129:8030
    ResourceManager host:port for ApplicationMasters to talk to the Scheduler to obtain resources.
yarn.resourcemanager.resource-tracker.address = 192.168.219.129:8031
    ResourceManager host:port for NodeManagers.
yarn.resourcemanager.admin.address = 192.168.219.129:8033
    ResourceManager host:port for administrative commands.
yarn.resourcemanager.webapp.address = 192.168.219.129:8088
    ResourceManager web UI host:port.
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
    ResourceManager Scheduler class.
yarn.scheduler.minimum-allocation-mb = 1024
    Minimum limit of memory to allocate to each container request at the ResourceManager.
yarn.scheduler.maximum-allocation-mb = 8192
    Maximum limit of memory to allocate to each container request at the ResourceManager.
yarn.nodemanager.resource.memory-mb = 8192
    Available physical memory, in MB, for the given NodeManager; defines the total resources made available to running containers.
yarn.nodemanager.vmem-pmem-ratio = 2.1
    Maximum ratio by which virtual memory usage of tasks may exceed physical memory.
yarn.nodemanager.local-dirs = /home/francis/hadoop2-hdfs/yarn
    Comma-separated list of local-filesystem paths where intermediate data is written; multiple paths help spread disk i/o.
yarn.nodemanager.log-dirs = /home/francis/hadoop2-hdfs/yarn-log
    Comma-separated list of local-filesystem paths where logs are written; multiple paths help spread disk i/o.
yarn.nodemanager.log.retain-seconds = 10800
    Default time (in seconds) to retain log files on the NodeManager; only applicable if log-aggregation is disabled.
yarn.nodemanager.remote-app-log-dir = /logs
    HDFS directory where application logs are moved on application completion; needs appropriate permissions; only applicable if log-aggregation is enabled.
yarn.nodemanager.remote-app-log-dir-suffix = logs
    Suffix appended to the remote log dir; logs are aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}; only applicable if log-aggregation is enabled.
yarn.nodemanager.aux-services = mapreduce.shuffle
    Shuffle service that needs to be set for MapReduce applications.
yarn.log-aggregation.retain-seconds = -1
    How long to keep aggregated logs before deleting them; -1 disables. Set this too small and you will spam the name node.
yarn.log-aggregation.retain-check-interval-seconds = -1
    Time between checks for aggregated log retention; if set to 0 or a negative value, it is computed as one-tenth of the aggregated log retention time. Set this too small and you will spam the name node.

----------Attachment: mapred-site.xml

mapreduce.framework.name = yarn
    Execution framework set to Hadoop YARN.
mapreduce.map.memory.mb = 1536
    Larger resource limit for maps.
mapreduce.map.java.opts = -Xmx1024M
    Larger heap-size for child jvms of maps.
mapreduce.reduce.memory.mb = 3072
    Larger resource limit for reduces.
mapreduce.reduce.java.opts = -Xmx2560M
    Larger heap-size for child jvms of reduces.
mapreduce.task.io.sort.mb = 512
    Higher memory limit while sorting data, for efficiency.
mapreduce.task.io.sort.factor = 100
    More streams merged at once while sorting files.
mapreduce.reduce.shuffle.parallelcopies = 50
    Higher number of parallel copies run by reduces to fetch outputs from a very large number of maps.
mapreduce.jobhistory.address = 192.168.219.129:10020
    MapReduce JobHistory Server host:port; default port is 10020.
mapreduce.jobhistory.webapp.address = 192.168.219.129:19888
    MapReduce JobHistory Server web UI host:port; default port is 19888.
mapreduce.jobhistory.intermediate-done-dir = /mr-history/tmp
    Directory where history files are written by MapReduce jobs.
mapreduce.jobhistory.done-dir = /mr-history/done
    Directory where history files are managed by the MR JobHistory Server.

----------Attachment: core-site.xml

fs.defaultFS = hdfs://RhCluster
io.file.buffer.size = 4096
    Size of read/write buffer used in SequenceFiles.
hadoop.tmp.dir = /home/francis/hadoop2-hdfs/tmp
    A base for other temporary directories.
ha.zookeeper.quorum = 192.168.219.129:2181,192.168.219.130:2181,192.168.219.132:2181

----------Attachment: hdfs-site.xml

dfs.namenode.name.dir = /home/francis/hadoop2-hdfs/name
    Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.
dfs.datanode.data.dir = /home/francis/hadoop2-hdfs/data
    Comma-separated list of local-filesystem paths where a DataNode stores its blocks.
dfs.blocksize = 67108864
    HDFS blocksize; 268435456 (256MB) for large file-systems.
dfs.namenode.handler.count = 10
    More NameNode server threads to handle RPCs from a large number of DataNodes.
dfs.replication = 2
dfs.permissions = false
dfs.hosts = /home/francis/hadoop-2.0.5-alpha/etc/hadoop/slaves
dfs.support.append = true
dfs.client.block.write.replace-datanode-on-failure.enable = true
    NOTE: this cannot be disabled if you need to APPEND to a file.
dfs.client.block.write.replace-datanode-on-failure.policy = DEFAULT
    NEVER: never add a new datanode. When the cluster size is extremely small, e.g. 3 nodes or less, administrators may want to set the policy to NEVER or disable this feature.
dfs.nameservices = RhCluster
    The logical name for this new nameservice.
dfs.ha.namenodes.RhCluster = nn1,nn2
dfs.namenode.rpc-address.RhCluster.nn1 = 192.168.219.129:8020
dfs.namenode.rpc-address.RhCluster.nn2 = 192.168.219.132:8020
dfs.namenode.http-address.RhCluster.nn1 = 192.168.219.129:50070
dfs.namenode.http-address.RhCluster.nn2 = 192.168.219.132:50070
dfs.namenode.shared.edits.dir = qjournal://192.168.219.129:8485;192.168.219.132:8485;192.168.219.130:8485/RhCluster
dfs.client.failover.proxy.provider.RhCluster = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.ha.fencing.methods = sshfence
dfs.ha.fencing.ssh.connect-timeout = 30000
dfs.ha.fencing.ssh.private-key-files = /home/francis/.ssh/id_rsa
dfs.journalnode.edits.dir = /home/francis/hadoop2-hdfs/journalnode/data
dfs.ha.automatic-failover.enabled.RhCluster = true
dfs.datanode.balance.bandwidthPerSec = 10485760
    10 MB per second when transferring data for balancing.
dfs.namenode.avoid.read.stale.datanode = true
dfs.namenode.avoid.write.stale.datanode = true
dfs.namenode.stale.datanode.interval = 30000
    In milliseconds.
dfs.namenode.write.stale.datanode.ratio = 0.5f
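PS: In case the four XML files are not on the client classpath, here is a minimal sketch (JobConfFromCode is a made-up name for illustration) that sets the same client-side values from the attached files directly on the Configuration instead of relying on addResource:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;

// Illustration only: set the client-side values from the attached files
// directly, instead of loading the XML resources from the classpath.
public class JobConfFromCode {
    public static JobConf build() {
        Configuration conf1 = new Configuration();
        conf1.set("fs.defaultFS", "hdfs://RhCluster");                      // core-site.xml
        conf1.set("mapreduce.framework.name", "yarn");                      // mapred-site.xml
        conf1.set("yarn.resourcemanager.address", "192.168.219.129:9001"); // yarn-site.xml
        // For the HA nameservice hdfs://RhCluster to resolve, the client
        // would also need the dfs.nameservices, rpc-address, and failover
        // proxy provider keys from the attached hdfs-site.xml.
        return new JobConf(conf1);
    }
}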