hadoop-mapreduce-user mailing list archives

From Arpit Wanchoo <Arpit.Wanc...@guavus.com>
Subject Spill Failed when io.sort.mb is increased
Date Tue, 07 Aug 2012 05:31:11 GMT
Hi

I am facing a "Spill failed" error when I increase io.sort.mb to 1500 or 2000.
The job runs fine with 500 or 1000, though I still get some spilled records (780 million
spilled out of 5.3 billion total map output records, i.e. about 15% of the map output).

I configured 9 GB of VM for each mapper, with 4 mappers on each node (each node has
48 GB of RAM). There was no heap-space issue.
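For reference, this is roughly how the relevant settings are applied in the driver (a
minimal sketch against the Hadoop 1.x JobConf API; the driver class name is a placeholder
and the values just restate what I described above, not our actual code):

    // Sketch of the job configuration described above (Hadoop 1.x property names).
    // MyDriver is a placeholder class for illustration only.
    import org.apache.hadoop.mapred.JobConf;

    public class MyDriver {
        public static JobConf configure() {
            JobConf conf = new JobConf(MyDriver.class);
            // Map-side sort buffer: fails with "Spill failed" at 1500/2000,
            // works at 500/1000.
            conf.setInt("io.sort.mb", 2000);
            // Number of streams merged at once during sort/merge
            // (raised from the default of 10).
            conf.setInt("io.sort.factor", 500);
            // 9 GB of VM per map task; 4 map slots per 48 GB node.
            conf.set("mapred.child.java.opts", "-Xmx9g");
            return conf;
        }
    }

With this configuration I got the following error: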

java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1028)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:690)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at com.guavus.mapred.bizreflex.job.BaseJob.Mapper.gleaning_cube(Mapper.java:450)
	at com.guavus.mapred.bizreflex.job.BaseJob.Mapper.netflow_mapper(Mapper.java:317)
	at com.guavus.mapred.bizreflex.job.BaseJob.Mapper.map(Mapper.java:387)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Unknown Source)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(Unknown Source)
	at com.guavus.mapred.common.collection.ValueCollection.readFields(ValueCollection.java:24)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1420)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1435)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:852)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1343)
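From the trace, the EOFException is raised while the combiner (which Hadoop runs as part
of sortAndSpill) is deserializing our custom ValueCollection Writable. For context,
readFields must consume exactly the bytes that write produced; below is a minimal sketch
of that symmetric pattern (the class and field names are illustrative only, not our
actual ValueCollection):

    // Illustrative only: the write/readFields pair must be symmetric,
    // otherwise deserialization during the combine/spill phase can hit EOF.
    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class ExampleValueCollection implements Writable {
        private int count;        // illustrative field
        private long[] values;    // illustrative field

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(count);                 // 4 bytes
            for (int i = 0; i < count; i++) {
                out.writeLong(values[i]);        // 8 bytes each
            }
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            count = in.readInt();                // must mirror write() exactly
            values = new long[count];
            for (int i = 0; i < count; i++) {
                values[i] = in.readLong();
            }
        }
    }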
I also increased io.sort.factor from the default (10) to 500, but the error still occurs.

Can someone comment on what could be causing this issue, given that it does not occur
for lower values of io.sort.mb?

Regards,
Arpit Wanchoo | Sr. Software Engineer
Guavus Network Systems.
6th Floor, Enkay Towers, Tower B & B1, Vanijya Nikunj, Udyog Vihar Phase - V, Gurgaon, Haryana.
Mobile Number +91-9899949788

