incubator-crunch-user mailing list archives

From Deepak Vohra <dvohr...@yahoo.com>
Subject Re: CRUNCH-140
Date Fri, 11 Jan 2013 21:09:45 GMT


The statement:

"The library is not compatible with versions of Hadoop prior to 1.0.x or 2.0.x,
such as version 0.20.x."

does mention the Hadoop version requirement, but should be modified to:

The library is not compatible with versions of Hadoop prior to 1.0.1 or 2.0.x,
such as versions 1.0.0 and 0.20.x.



________________________________
 From: Josh Wills <jwills@cloudera.com>
To: Deepak Vohra <dvohra10@yahoo.com> 
Cc: "crunch-user@incubator.apache.org" <crunch-user@incubator.apache.org> 
Sent: Friday, January 11, 2013 10:18:43 AM
Subject: Re: CRUNCH-140
 

Okay-- in that kind of environment, it's best to turn speculative execution off.
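
For reference, a minimal sketch of one way to turn speculative execution off on a Hadoop 1.x cluster. It assumes the standard Hadoop 1.x property names and that the settings are placed in the cluster's mapred-site.xml; the thread itself does not say which mechanism was used.

```xml
<!-- mapred-site.xml fragment (file location is an assumption); Hadoop 1.x property names -->
<property>
  <!-- disable speculative attempts for map tasks -->
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <!-- disable speculative attempts for reduce tasks -->
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

Both properties belong inside the file's <configuration> element. The attempt IDs in the logs quoted below contain `_r_`, which marks reduce tasks, so the reduce-side property is the one that matters most for this job.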



On Fri, Jan 11, 2013 at 10:17 AM, Deepak Vohra <dvohra10@yahoo.com> wrote:


>
>Yes. It is a single node that also runs an HBase cluster. Speculative execution is on by default; I haven't set it to false.
>
>
>________________________________
> From: Josh Wills <jwills@cloudera.com>
>
>To: crunch-user@incubator.apache.org; Deepak Vohra <dvohra10@yahoo.com> 
>Sent: Friday, January 11, 2013 9:43:34 AM
>Subject: Re: CRUNCH-140
> 
>
>
>Are you running on a single node w/speculative execution turned on?
>
>
>
>On Fri, Jan 11, 2013 at 9:29 AM, Deepak Vohra <dvohra10@yahoo.com> wrote:
>
>
>>
>>Hadoop 1.0.0, which supports multiple outputs, also generates the same error:
>>
>>
>>2013-01-11 17:21:20,368 INFO org.apache.crunch.impl.mr.run.RTNode: Crunch exception in 'Text(crunch)' for input: [,61]
>>org.apache.crunch.impl.mr.run.CrunchRuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /tmp/crunch-140947123/p1/output/_temporary/_attempt_201301111630_0001_r_000000_3/part-r-00000 for DFSClient_attempt_201301111630_0001_r_000000_3 on client 10.254.110.80 because current leaseholder is trying to recreate file.
>>
>>
>>
>>________________________________
>> From: Josh Wills <josh.wills@gmail.com>
>>To: Deepak Vohra <dvohra10@yahoo.com> 
>>Cc: "crunch-user@incubator.apache.org" <crunch-user@incubator.apache.org> 
>>Sent: Thursday, January 10, 2013 5:13:17 PM
>>Subject: Re: CRUNCH-140
>> 
>>
>>
>>Most likely, yes.
>>
>>
>>
>>On Thu, Jan 10, 2013 at 5:12 PM, Deepak Vohra <dvohra10@yahoo.com> wrote:
>>
>>
>>>
>>>Would the HBase error in CRUNCH-141 also be caused by the Hadoop version being pre-1.0.x?
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>> From: Josh Wills <josh.wills@gmail.com>
>>>To: Deepak Vohra <dvohra10@yahoo.com> 
>>>Cc: "crunch-user@incubator.apache.org" <crunch-user@incubator.apache.org>
>>>Sent: Thursday, January 10, 2013 5:10:04 PM
>>>Subject: Re: CRUNCH-140
>>> 
>>>
>>>
>>>Hrm-- I think that's because you're using an earlier version of Hadoop (pre-1.0.x) that doesn't support multiple outputs, which Crunch relies on.
>>>
>>>
>>>
>>>On Thu, Jan 10, 2013 at 5:08 PM, Deepak Vohra <dvohra10@yahoo.com> wrote:
>>>
>>>>From userlogs:
>>>>
>>>>
>>>>WARN org.apache.hadoop.mapred.Child: Error running child
>>>>org.apache.crunch.impl.mr.run.CrunchRuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /tmp/crunch-531439899/p1/output/_temporary/_attempt_201301102312_0002_r_000000_3/part-r-00000 for DFSClient_attempt_201301102312_0002_r_000000_3 on client 10.210.42.32 because current leaseholder is trying to recreate file.
>>>>
>>>>
>>>>
>>>>________________________________
>>>> From: Josh Wills <josh.wills@gmail.com>
>>>>To: Deepak Vohra <dvohra10@yahoo.com> 
>>>>Sent: Thursday, January 10, 2013 4:36:36 PM
>>>>Subject: Re: CRUNCH-140
>>>> 
>>>>
>>>>
>>>>K-- any info in the logs for the job?
>>>>
>>>>
>>>>
>>>>On Thu, Jan 10, 2013 at 4:35 PM, Deepak Vohra <dvohra10@yahoo.com> wrote:
>>>>
>>>>>Yes. The input is required to be on HDFS.
>>>>>
>>>>>
>>>>>But the Crunch job still fails with the following output:
>>>>>13/01/11 00:10:45 INFO collect.PGroupedTableImpl: Setting num reduce tasks to 1
>>>>>13/01/11 00:10:48 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>13/01/11 00:10:48 INFO exec.CrunchJob: Running job "org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)"
>>>>>13/01/11 00:10:48 INFO exec.CrunchJob: Job status available at: http://ec2-50-19-55-40.compute-1.amazonaws.com:50030/jobdetails.jsp?jobid=job_201301102312_0002
>>>>>1 job failure(s) occurred:
>>>>>org.apache.crunch.examples.WordCount: Text(LICENSE.txt)+S0+Aggregate.count+GBK+combine+asText+Text(crunch)(class org.apache.crunch.examples.WordCount0): Job failed!
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>> From: Josh Wills <josh.wills@gmail.com>
>>>>>To: crunch-user@incubator.apache.org; Deepak Vohra <dvohra10@yahoo.com>
>>>>>Sent: Thursday, January 10, 2013 1:26:35 PM
>>>>>Subject: Re: CRUNCH-140
>>>>> 
>>>>>
>>>>>
>>>>>Does running:
>>>>>
>>>>>
>>>>>hadoop fs -put LICENSE.txt .
>>>>>
>>>>>
>>>>>followed by hadoop jar ... fix it?
>>>>>
>>>>>
>>>>>
>>>>>On Thu, Jan 10, 2013 at 12:07 PM, Deepak Vohra <dvohra10@yahoo.com> wrote:
>>>>>
>>>>>>The commands to test the bug:
>>>>>>https://issues.apache.org/jira/browse/CRUNCH-140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549900#comment-13549900
>>>>>>
>>>>>>>cd /usr/local/apache-crunch-0.4.0-incubating-bin
>>>>>>>cp LICENSE LICENSE.txt
>>>>>>>sudo mkdir crunch
>>>>>>
>>>>>>>hadoop jar crunch-examples-0.4.0-incubating-job.jar org.apache.crunch.examples.WordCount LICENSE.txt crunch
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>-- 
>
>Director of Data Science
>Cloudera
>Twitter: @josh_wills
>
>


-- 

Director of Data Science
Cloudera
Twitter: @josh_wills