hadoop-common-user mailing list archives

From 柳松 <lamfeel...@126.com>
Subject How to skip bad records in .19.1
Date Thu, 12 Mar 2009 09:15:52 GMT


Dear all:
    I have called "SkipBadRecords.setMapperMaxSkipRecords(conf, 1)"
and also "SkipBadRecords.setAttemptsToStartSkipping(conf, 2)".
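
For reference, the two calls above would sit in the job setup roughly as sketched below. This is a minimal sketch against the old `org.apache.hadoop.mapred` API; the class name `SkipConfigSketch` and the helper method `configure` are hypothetical and only serve as a place to put the calls.

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SkipBadRecords;

// Hypothetical holder class; only the two SkipBadRecords calls come from the message.
public class SkipConfigSketch {

    // Applies the skip settings described in the message to an existing JobConf.
    public static JobConf configure(JobConf conf) {
        // Number of task attempts after which skipping mode is switched on.
        SkipBadRecords.setAttemptsToStartSkipping(conf, 2);
        // Acceptable number of skipped records surrounding each bad record in the mapper.
        SkipBadRecords.setMapperMaxSkipRecords(conf, 1);
        return conf;
    }
}
```

The helper would be called on the job's `JobConf` before the job is submitted.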
 
    However, after 3 failed attempts, it gave me this exception message:
 
   java.lang.NullPointerException
 at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
 at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:910)
 at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
 at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:306)
 at org.apache.hadoop.mapred.MapTask$SkippingRecordReader.writeSkippedRec(MapTask.java:265)
 at org.apache.hadoop.mapred.MapTask$SkippingRecordReader.next(MapTask.java:237)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
 at org.apache.hadoop.mapred.Child.main(Child.java:158)
 
   The last line of the syslog shows:
   2009-03-12 16:44:11,218 WARN org.apache.hadoop.mapred.SortedRanges: Skipping index 1-2
 
   I have two questions:
   1. Shouldn't it skip the bad record automatically after 2 failed attempts? Why does skipping start only after 3?
 
   2. Why does the skip fail?
 
Regards
Song Liu from Suzhou University
 