From: "Kevin Burton" <rkevinburton@charter.net>
To: user@hadoop.apache.org
Cc: "'Mohammad Tariq'" <dontariq@gmail.com>
Subject: RE: Can't initialize cluster
Date: Tue, 30 Apr 2013 14:12:01 -0500

Tariq,

Thank you.
I tried this and the summary of the MapReduce job looks like:

13/04/30 14:02:35 INFO mapred.JobClient: Job complete: job_201304301251_0004
13/04/30 14:02:35 INFO mapred.JobClient: Counters: 7
13/04/30 14:02:35 INFO mapred.JobClient:   Job Counters
13/04/30 14:02:35 INFO mapred.JobClient:     Failed map tasks=1
13/04/30 14:02:35 INFO mapred.JobClient:     Launched map tasks=27
13/04/30 14:02:35 INFO mapred.JobClient:     Rack-local map tasks=27
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all maps in occupied slots (ms)=151904
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all reduces in occupied slots (ms)=0
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/30 14:02:35 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

But there were a number of exceptions thrown, and it seemed to take longer than just running it standalone (I should have at least 4 machines working on this). The exceptions are my main concern now (there were quite a few):

. . . . .
13/04/30 14:02:27 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000005_1, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-core-2.0.0-mr1-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000006_1, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/guava-11.0.2.jar does not exist
. . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000008_0, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/zookeeper-3.4.5-cdh4.2.1.jar does not exist
. . . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000001_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/tools.jar does not exist
. . . . .
13/04/30 14:02:28 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000000_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/Websters.txt does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000002_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-hdfs-2.0.0-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000004_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/hadoop-common-2.0.0-cdh4.2.1.jar does not exist
. . . .
13/04/30 14:02:33 INFO mapred.JobClient: Task Id : attempt_201304301251_0004_m_000003_2, Status : FAILED
java.io.FileNotFoundException: File file:/home/kevin/WordCount/input/core-3.1.1.jar does not exist

No output folder was created (probably because of the numerous errors).

Kevin

From: Mohammad Tariq [mailto:dontariq@gmail.com]
Sent: Tuesday, April 30, 2013 1:32 PM
To: Kevin Burton
Subject: Re: Can't initialize cluster

Hello again Kevin,

Good that you are making progress. This is happening because when you run it as a hadoop job, it looks for the files in HDFS, and when you run it as a plain java program it looks into the local FS. Use this as your input in your code and see if it helps: file:///home/kevin/input

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Tue, Apr 30, 2013 at 11:36 PM, Kevin Burton <rkevinburton@charter.net> wrote:

We/I are/am making progress. Now I get the error:

13/04/30 12:59:40 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/04/30 12:59:40 INFO mapred.JobClient: Cleaning up the staging area hdfs://devubuntu05:9000/data/hadoop/tmp/hadoop-mapred/mapred/staging/kevin/.staging/job_201304301251_0003
13/04/30 12:59:40 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input

When I run it with java -jar, the input and output are the local folders. When running it with hadoop jar, it seems to expect the folders (input and output) to be on the HDFS file system. I am not sure why these two methods of invocation don't make the same file system assumptions.

It is

hadoop jar WordCount.jar input output (which gives the above exception)

versus

java -jar WordCount.jar input output (which outputs the wordcount statistics to the output folder)

This is run in the local /home/kevin/WordCount folder.

Kevin

From: Mohammad Tariq [mailto:dontariq@gmail.com]
Sent: Tuesday, April 30, 2013 12:33 PM
To: user@hadoop.apache.org
Subject: Re: Can't initialize cluster

Set "HADOOP_MAPRED_HOME" in your hadoop-env.sh file and re-run the job. See if it helps.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

On Tue, Apr 30, 2013 at 10:10 PM, Kevin Burton <rkevinburton@charter.net> wrote:

To be clear, when this code is run with 'java -jar' it runs without exception. The exception occurs when I run it with 'hadoop jar'.

From: Kevin Burton [mailto:rkevinburton@charter.net]
Sent: Tuesday, April 30, 2013 11:36 AM
To: user@hadoop.apache.org
Subject: Can't initialize cluster

I have a simple MapReduce job that I am trying to get to run on my cluster.
When I run it I get:

13/04/30 11:27:45 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "devubuntu05:9001"
13/04/30 11:27:45 ERROR security.UserGroupInformation: PriviledgedActionException as:kevin (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

My core-site.xml looks like:

<property>
  <name>fs.default.name</name>
  <value>hdfs://devubuntu05:9000</value>
  <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.</description>
</property>

So I am unclear as to why it is looking at devubuntu05:9001?

Here is the code:

    public static void WordCount( String[] args ) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCount.TokenizerMapper.class);
        job.setCombinerClass(WordCount.IntSumReducer.class);
        job.setReducerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

Ideas?
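[Editor's note: the devubuntu05:9001 value above would not come from core-site.xml; in MRv1 the JobTracker address normally lives in mapred-site.xml. A minimal sketch of such an entry, assuming the host/port shown in the error message (the exact property name differs between MRv1 and MRv2-style builds, so check your own mapred-site.xml):]

```xml
<!-- mapred-site.xml (sketch; host and port assumed from the error message) -->
<configuration>
  <property>
    <!-- MRv1 property name; MRv2-style code reports it as mapreduce.jobtracker.address -->
    <name>mapred.job.tracker</name>
    <value>devubuntu05:9001</value>
  </property>
</configuration>
```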

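[Editor's note: a likely fix for the "Input path does not exist: hdfs://devubuntu05:9000/user/kevin/input" errors in the thread above is to copy the local input into HDFS before submitting with hadoop jar. A sketch, assuming the paths mentioned in the thread and a hadoop client on the PATH:]

```shell
# Create the HDFS input directory the job is looking for
# (paths assumed from the thread; adjust to your layout).
hadoop fs -mkdir -p /user/kevin/input

# Copy the local input files into HDFS.
hadoop fs -put /home/kevin/WordCount/input/* /user/kevin/input

# Verify the upload, then re-run: hadoop jar WordCount.jar input output
hadoop fs -ls /user/kevin/input
```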