hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: I/O errors reading task output on 20.205.0
Date Mon, 02 Jan 2012 10:08:50 GMT
Markus,

Yeah, unfortunately 1.x/0.20.x does not have a default XML entry/documentation for this.
But it is definitely present, under the same string I posted before (note the typo in it;
it is intentional).

Please log a JIRA for it to be added to the hdfs-default.xml file as well (patch against branch-1
is welcome too). For higher versions, this has already been documented (and renamed to a better
name).
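
For reference, a minimal hdfs-site.xml entry would look roughly like the sketch below. The
4096 value is only a commonly used suggestion, not something from this thread, and the new
name in later releases is, if I remember correctly, dfs.datanode.max.transfer.threads:

  <!-- hdfs-site.xml on each DataNode; restart the DataNodes after changing it -->
  <property>
    <!-- note the intentional misspelling: "xcievers", not "xceivers" -->
    <name>dfs.datanode.max.xcievers</name>
    <!-- the built-in default of 256 is the limit your DN log is hitting; 4096 is a commonly used value -->
    <value>4096</value>
  </property>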

On 02-Jan-2012, at 2:59 PM, Markus Jelsma wrote:

> Harsh,
> 
> Are you really sure it's there? When I check the job configuration through the 
> web GUI I don't see it; it doesn't seem to be assigned a default value.
> 
> Thanks
> 
> On Friday 30 December 2011 14:12:49 Harsh J wrote:
>> Yes your .205 release should have it. It should fix your issue!
>> 
>> On Fri, Dec 30, 2011 at 6:24 PM, Markus Jelsma
>> 
>> <markus.jelsma@openindex.io> wrote:
>>> Hi, (didn't reply to list before)
>>> 
>>>> Does your DN log show up any form of errors when you run into this?
>>> 
>>> Actually, I checked again to be sure and noticed errors that I
>>> hadn't noticed before:
>>> 
>>> 2011-12-29 19:51:01,799 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(141.105.120.152:50010,
>>> storageID=DS-454617998-141.105.120.152-50010-1324646606851,
>>> infoPort=50075, ipcPort=50020):DataXceiver
>>> java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
>>>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:92)
>>>       at java.lang.Thread.run(Thread.java:662)
>>> 
>>> but also this one:
>>> 
>>> 2011-12-29 19:51:00,675 ERROR
>>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>>> DatanodeRegistration(141.105.120.152:50010,
>>> storageID=DS-454617998-141.105.120.152-50010-1324646606851,
>>> infoPort=50075, ipcPort=50020):DataXceiver
>>> java.io.EOFException
>>>       at java.io.DataInputStream.readShort(DataInputStream.java:298)
>>>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:351)
>>>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
>>>       at java.lang.Thread.run(Thread.java:662)
>>> 
>>>> This happens with just two jobs reading how many files? And how many
>>>> DNs are these spread across?
>>> 
>>> One file, 15 parts spread across five machines.
>>> 
>>>> I'm thinking it's probably something to do with your ulimits for the
>>>> running DN processes, but I can't say for sure without taking a look
>>>> at the logs.
>>> 
>>> The ulimit for open files is set to 16k on all machines.
>>> 
>>>> Some other stuff I can think of, a little blindly:
>>>> - What's your dfs.datanode.max.xcievers setting?
>>> 
>>> I don't know. I increased it for a 0.22.0 test cluster, but this is
>>> 0.20.205.0 and I haven't seen that configuration directive in the manual
>>> for this version. At least not in the hdfs-, core-, or mapred-default
>>> files.
>>> 
>>>> - Can you ensure 'hadoop classpath' on all nodes reflects the same
>>>> output, and no accidental jar mixups?
>>> 
>>> They are identical. All machines were installed and configured
>>> automatically, and looking at it I don't see any differences.
>>> 
>>> Is there such a max.xceivers setting in the 0.20.x branch? Judging from
>>> the exception, it looks like that's the problem.
>>> 
>>> Thanks!
>>> 
>>>> 
>>>> On Thu, Dec 29, 2011 at 11:48 PM, Markus Jelsma
>>>> 
>>>> <markus.jelsma@openindex.io> wrote:
>>>>> We just reproduced it (somehow) by running jobs concurrently reading
>>>>> the same data. Two out of three similar jobs died early in the map
>>>>> phase with "Could not obtain block" errors; one finished completely.
>>>>> 
>>>>> java.io.IOException: Could not obtain block:
>>>>> blk_119146860335302651_13067
>>>>> file=/user/systems/crawl/crawldb/current/part-00000/data
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2093)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1897)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2048)
>>>>>       at java.io.DataInputStream.readFully(DataInputStream.java:178)
>>>>>       at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>>>>>       at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>>>>>       at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1937)
>>>>>       at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2069)
>>>>>       at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.nextKeyValue(SequenceFileRecordReader.java:68)
>>>>>       at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>>>>>       at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>>>>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>>>>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>       at java.security.AccessController.doPrivileged(Native Method)
>>>>>       at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>>>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>> 
>>>>> Another job (different but reading the same data) finished the map
>>>>> phase but died partially (half of the reducers failed) and then froze
>>>>> completely.
>>>>> 
>>>>> 2011-12-29 18:07:58,899 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Exception in createBlockOutputStream java.io.EOFException
>>>>> 2011-12-29 18:07:58,899 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Abandoning block blk_4748641522370871094_13532
>>>>> 2011-12-29 18:07:58,900 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Excluding datanode 141.105.120.154:50010
>>>>> 2011-12-29 18:07:58,902 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Exception in createBlockOutputStream java.io.EOFException
>>>>> 2011-12-29 18:07:58,902 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Abandoning block blk_-1454920600140944030_13532
>>>>> 2011-12-29 18:07:58,903 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Excluding datanode 141.105.120.152:50010
>>>>> 2011-12-29 18:07:58,907 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Exception in createBlockOutputStream java.io.IOException: Bad connect
>>>>> ack with firstBadLink as 141.105.120.153:50010
>>>>> 2011-12-29 18:07:58,907 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Abandoning block blk_3551418605384221738_13532
>>>>> 2011-12-29 18:07:58,908 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Excluding datanode 141.105.120.153:50010
>>>>> 2011-12-29 18:07:58,910 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Exception in createBlockOutputStream java.io.EOFException
>>>>> 2011-12-29 18:07:58,910 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Abandoning block blk_-1826030182013954555_13532
>>>>> 2011-12-29 18:07:58,911 INFO org.apache.hadoop.hdfs.DFSClient:
>>>>> Excluding datanode 141.105.120.150:50010
>>>>> 2011-12-29 18:07:58,911 WARN org.apache.hadoop.hdfs.DFSClient:
>>>>> DataStreamer Exception: java.io.IOException: Unable to create new block.
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3213)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2300(DFSClient.java:2406)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2646)
>>>>> 
>>>>> 2011-12-29 18:07:58,912 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-1826030182013954555_13532 bad datanode[0] nodes == null
>>>>> 2011-12-29 18:07:58,912 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/user/systems/generate-temp-1325180944829/_temporary/_attempt_201112290956_0012_r_000004_0/fetchlist-13/part-00004" - Aborting...
>>>>> 2011-12-29 18:07:59,049 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>>>>> 2011-12-29 18:07:59,062 WARN org.apache.hadoop.mapred.Child: Error running child
>>>>> java.io.EOFException
>>>>>       at java.io.DataInputStream.readShort(DataInputStream.java:298)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3272)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3196)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2300(DFSClient.java:2406)
>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2646)
>>>>> 2011-12-29 18:07:59,064 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>>>>> 
>>>>> It smells like the datanodes in 20.205.0 don't deal well with
>>>>> concurrent jobs, especially when handling the same data.
>>>>> 
>>>>> Is there any advice for this? Again, this does not happen on 20.203.0.
>>>>> Many thanks
>>>>> 
>>>>>> I should add that the failing tasks that ran concurrently all read
>>>>>> the same map files from HDFS.
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> We just ran large scale Apache Nutch jobs in our evaluation of
>>>>>>> 20.205.0 and they all failed. Some of these jobs ran concurrently
>>>>>>> with the fair scheduler enabled. These were simple jobs consuming
>>>>>>> little RAM. I double checked and there were certainly no RAM
>>>>>>> issues.
>>>>>>> 
>>>>>>> All jobs failed and most tasks had a less than descriptive message.
>>>>>>> A few reported that they dealt with I/O errors reading task output.
>>>>>>> However, the data they read is fine. When we ran the same jobs
>>>>>>> manually (and some concurrently), some did fine and others died
>>>>>>> with I/O errors reading task output again!
>>>>>>> 
>>>>>>> The heap allocation for the reducers is not high, but no OOMs were
>>>>>>> reported. Besides the occasional I/O error, which I think is
>>>>>>> strange enough, most tasks did not write anything to the logs that
>>>>>>> I can link to this problem.
>>>>>>> 
>>>>>>> We do not see this happening on our 20.203.0 cluster, although
>>>>>>> resources and settings are different. 205 is a new high-end cluster
>>>>>>> with similarly conservative settings, only with more mappers/reducers
>>>>>>> per node. Resource settings are almost identical. The 203 cluster
>>>>>>> has three times as many machines, so also more open file
>>>>>>> descriptors and threads in total.
>>>>>>> 
>>>>>>> Any thoughts to share?
>>>>>>> Thanks,
> 
> -- 
> Markus Jelsma - CTO - Openindex

