hadoop-common-user mailing list archives

From "Dhruba Borthakur" <dhr...@gmail.com>
Subject Re: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
Date Sun, 07 Sep 2008 07:42:43 GMT
The DFS errors might have been caused by

http://issues.apache.org/jira/browse/HADOOP-4040
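
To check whether that block is actually missing, rather than just
temporarily unreachable, one quick diagnostic (a sketch, assuming a
stock 0.18 install) is to run fsck against the file from the subject
line:

  bin/hadoop fsck /user/root/crawl_debug/segments/20080825053518/content/part-00002/data \
      -files -blocks -locations

If fsck reports the file healthy with all replicas present, the failed
reads are more likely the client-side issue above than actual data loss.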

thanks,
dhruba

On Sat, Sep 6, 2008 at 6:59 AM, Devaraj Das <ddas@yahoo-inc.com> wrote:
> These exceptions are apparently coming from the dfs side of things. Could
> someone from the dfs side please look at these?
>
>
> On 9/5/08 3:04 PM, "Espen Amble Kolstad" <espen@trank.no> wrote:
>
>> Hi,
>>
>> Thanks!
>> The patch applies without change to hadoop-0.18.0, and should be
>> included in 0.18.1.
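>>
>> (For anyone else trying it: Hadoop patches from JIRA are normally
>> applied from the source root with -p0; assuming the attachment is
>> saved as HADOOP-3940.patch, something like this should do it:
>>
>>   cd hadoop-0.18.0
>>   patch -p0 < HADOOP-3940.patch
>>
>> The patch file name here is illustrative.)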
>>
>> However, I'm still seeing:
>> in hadoop.log:
>> 2008-09-05 11:13:54,805 WARN  dfs.DFSClient - Exception while reading
>> from blk_3428404120239503595_2664 of
>> /user/trank/segments/20080905102650/crawl_generate/part-00010 from
>> somehost:50010: java.io.IOException: Premeture EOF from inputStream
>>
>> in datanode.log:
>> 2008-09-05 11:15:09,554 WARN  dfs.DataNode -
>> DatanodeRegistration(somehost:50010,
>> storageID=DS-751763840-somehost-50010-1219931304453, infoPort=50075,
>> ipcPort=50020):Got exception while serving
>> blk_-4682098638573619471_2662 to
>> /somehost:
>> java.net.SocketTimeoutException: 480000 millis timeout while waiting
>> for channel to be ready for write. ch :
>> java.nio.channels.SocketChannel[connected local=/somehost:50010
>> remote=/somehost:45244]
>>
>> These entries in datanode.log appear a few minutes apart, repeatedly.
>> I've reduced the number of map tasks, so the load on this node is below
>> 1.0 with 5 GB of free memory (so it's not resource starvation).
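>>
>> (The 480000 ms in that message matches the datanode's default socket
>> write timeout, so one thing worth trying, as a sketch rather than a
>> confirmed fix, is raising it in hadoop-site.xml:
>>
>>   <property>
>>     <name>dfs.datanode.socket.write.timeout</name>
>>     <!-- default is 480000 (8 min); 0 disables the timeout entirely.
>>          960000 is just an example value. -->
>>     <value>960000</value>
>>   </property>
>>
>> That only papers over whatever is keeping the client from reading in
>> time, though, so it's a workaround, not a root-cause fix.)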
>>
>> Espen
>>
>> On Thu, Sep 4, 2008 at 3:33 PM, Devaraj Das <ddas@yahoo-inc.com> wrote:
>>>> I started a profile of the reduce task. I've attached the profiling output.
>>>> It seems from the samples that ramManager.waitForDataToMerge() doesn't
>>>> actually wait.
>>>> Has anybody seen this behavior?
>>>
>>> This has been fixed in HADOOP-3940
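>>>
>>> (For context on the symptom: a wait method that shows up hot in a CPU
>>> profile is what a broken guarded wait looks like. A minimal Java
>>> sketch of the pattern a fix like this restores, using a hypothetical
>>> simplified RamManager rather than the actual Hadoop code:
>>>
>>>   // Hypothetical sketch, not the HADOOP-3940 patch itself.
>>>   class RamManagerSketch {
>>>     private long bufferedBytes;
>>>     private final long threshold = 1L << 20;  // assumed 1 MB trigger
>>>
>>>     synchronized void waitForDataToMerge() throws InterruptedException {
>>>       // wait() releases the monitor and parks the thread, so the
>>>       // reducer burns no CPU until map output actually arrives.
>>>       while (bufferedBytes < threshold) {
>>>         wait();
>>>       }
>>>     }
>>>
>>>     synchronized void add(long nBytes) {
>>>       bufferedBytes += nBytes;
>>>       notifyAll();  // wake threads parked in waitForDataToMerge()
>>>     }
>>>   }
>>>
>>> If the while loop polls without wait(), or waits on the wrong monitor,
>>> the thread spins exactly as the profile shows.)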
>>>
>>>
>>> On 9/4/08 6:36 PM, "Espen Amble Kolstad" <espen@trank.no> wrote:
>>>
>>>> I have the same problem on our cluster.
>>>>
>>>> It seems the reduce tasks are using all the CPU long before there's
>>>> anything to shuffle.
>>>>
>>>> I started a profile of the reduce task. I've attached the profiling output.
>>>> It seems from the samples that ramManager.waitForDataToMerge() doesn't
>>>> actually wait.
>>>> Has anybody seen this behavior?
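>>>>
>>>> (For anyone who wants to reproduce the profile: one way, sketched
>>>> here and not necessarily how it was captured above, is to attach
>>>> HPROF to the task JVMs via mapred.child.java.opts in hadoop-site.xml:
>>>>
>>>>   <property>
>>>>     <name>mapred.child.java.opts</name>
>>>>     <!-- cpu=samples gives a sampling CPU profile; note that every
>>>>          child JVM on a node will write to the same file path -->
>>>>     <value>-Xmx512m -agentlib:hprof=cpu=samples,depth=8,file=/tmp/task.hprof.txt</value>
>>>>   </property>
>>>>
>>>> Remember to remove it afterwards, since HPROF slows the tasks down
>>>> considerably.)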
>>>>
>>>> Espen
>>>>
>>>> On Thursday 28 August 2008 06:11:42 wangxu wrote:
>>>>> Hi all,
>>>>> I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar,
>>>>> running Hadoop with one namenode and 4 slaves.
>>>>> Attached is my hadoop-site.xml; I didn't change hadoop-default.xml.
>>>>>
>>>>> When the data in the segments is large, this kind of error occurs:
>>>>>
>>>>> java.io.IOException: Could not obtain block: blk_-2634319951074439134_1129
>>>>> file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1462)
>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1312)
>>>>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1417)
>>>>>   at java.io.DataInputStream.readFully(DataInputStream.java:178)
>>>>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
>>>>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1646)
>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1712)
>>>>>   at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1787)
>>>>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:104)
>>>>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>>>>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordReader.java:112)
>>>>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.accept(WrappedRecordReader.java:130)
>>>>>   at org.apache.hadoop.mapred.join.CompositeRecordReader.fillJoinCollector(CompositeRecordReader.java:398)
>>>>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:56)
>>>>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:33)
>>>>>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>>>>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>>>>>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
>>>>>
>>>>>
>>>>> how can I correct this?
>>>>> thanks.
>>>>> Xu
