hadoop-common-user mailing list archives

From Chris Douglas <chri...@yahoo-inc.com>
Subject Re: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
Date Sat, 06 Sep 2008 21:32:57 GMT
FWIW: HADOOP-3940 is merged into the 0.18 branch and should be part of  
0.18.1. -C
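
For anyone hitting the same symptom, the bug boils down to a condition
wait that can return without blocking, so reduce tasks spin at full CPU
before there is anything to shuffle. A minimal sketch of the guarded-wait
pattern such a fix relies on (hypothetical class and field names, not the
actual HADOOP-3940 patch):

    // Hypothetical sketch: a merge thread blocks until enough in-memory
    // map outputs have accumulated. Names and threshold are illustrative.
    public class RamManagerSketch {
      private int numClosedFiles = 0;     // map outputs fully in memory
      private boolean dataAvailable = false;

      private static final int MERGE_THRESHOLD = 10;

      // Called by the merge thread. The while loop is essential: wait()
      // may return spuriously, and without re-checking the condition the
      // caller returns immediately and spins instead of blocking.
      public synchronized void waitForDataToMerge() throws InterruptedException {
        while (!dataAvailable) {
          wait();
        }
        dataAvailable = false;
      }

      // Called by fetcher threads as each map output lands in memory.
      public synchronized void closeInMemoryFile() {
        numClosedFiles++;
        if (numClosedFiles >= MERGE_THRESHOLD) {
          dataAvailable = true;
          numClosedFiles = 0;
          notifyAll();  // wake the merge thread only when there is work
        }
      }
    }

If the condition is checked only once outside a loop (or the wait is
skipped entirely), waitForDataToMerge() falls straight through, which
matches the profile described below.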

On Sep 4, 2008, at 6:33 AM, Devaraj Das wrote:

>> I started a profile of the reduce task. I've attached the profiling
>> output. It seems from the samples that ramManager.waitForDataToMerge()
>> doesn't actually wait.
>> Has anybody seen this behavior?
>
> This has been fixed in HADOOP-3940
>
>
> On 9/4/08 6:36 PM, "Espen Amble Kolstad" <espen@trank.no> wrote:
>
>> I have the same problem on our cluster.
>>
>> It seems the reducer tasks are using all CPU long before there's
>> anything to shuffle.
>>
>> I started a profile of the reduce task. I've attached the profiling
>> output. It seems from the samples that ramManager.waitForDataToMerge()
>> doesn't actually wait.
>> Has anybody seen this behavior?
>>
>> Espen
>>
>> On Thursday 28 August 2008 06:11:42 wangxu wrote:
>>> Hi,all
>>> I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar,
>>> and running hadoop on one namenode and 4 slaves.
>>> attached is my hadoop-site.xml, and I didn't change the file
>>> hadoop-default.xml
>>>
>>> when data in segments are large,this kind of errors occure:
>>>
>>> java.io.IOException: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1462)
>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1312)
>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1417)
>>>   at java.io.DataInputStream.readFully(DataInputStream.java:178)
>>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
>>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
>>>   at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1646)
>>>   at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1712)
>>>   at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1787)
>>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:104)
>>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordReader.java:112)
>>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.accept(WrappedRecordReader.java:130)
>>>   at org.apache.hadoop.mapred.join.CompositeRecordReader.fillJoinCollector(CompositeRecordReader.java:398)
>>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:56)
>>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:33)
>>>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>>>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
>>>
>>>
>>> How can I correct this?
>>> Thanks,
>>> Xu
>>
>
>
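
For the original "Could not obtain block" error, a reasonable first step
(assuming the 0.18 command set) is to ask the namenode whether the blocks
of the affected file are actually healthy:

    bin/hadoop fsck /user/root/crawl_debug/segments/20080825053518/content/part-00002/data -files -blocks -locations

If fsck reports missing or corrupt blocks, look at the datanodes; if the
file is healthy, the client is most likely exhausting its read retries
because the datanodes are overloaded. Raising dfs.datanode.max.xcievers
in hadoop-site.xml is a commonly cited remedy for the latter, though the
right value depends on the workload.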

