hadoop-mapreduce-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5728) Check NPE for serializer/deserializer in MapTask
Date Wed, 22 Jan 2014 00:38:19 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878073#comment-13878073 ]

Jerry He commented on MAPREDUCE-5728:
-------------------------------------

Hi, [~qwertymaniac]

Looking at MAPREDUCE-2584, this is what you had:
1. added a warning in SerializationFactory,
2. checked for a null return from getSerializer in MapTask,
3. added checkSerializerSpecs in JobSubmitter to check for the serializer and deserializer.

Do you still want to keep all these extras?

I searched the MapReduce project. Here are all the callers of SerializationFactory.getSerializer(),
excluding tests:
org.apache.hadoop.mapred.MapTask.MapOutputBuffer.init(Context)
org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.write(DataOutput)
org.apache.hadoop.mapreduce.lib.join.CompositeInputSplit.write(DataOutput)
org.apache.hadoop.mapreduce.task.ReduceContextImpl.ValueIterator.writeFirstKeyValueBytes(DataOutputStream)
org.apache.hadoop.mapreduce.split.JobSplitWriter.writeNewSplits(Configuration, T[], FSDataOutputStream)
org.apache.hadoop.mapred.IFile.Writer.Writer(Configuration, FSDataOutputStream, Class<K>, Class<V>, CompressionCodec, Counter)

Callers of SerializationFactory.getDeserializer(), excluding tests:
org.apache.hadoop.mapred.MapTask.getSplitDetails(Path, long)
org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(DataInput)
org.apache.hadoop.mapreduce.lib.join.CompositeInputSplit.readFields(DataInput)
org.apache.hadoop.mapreduce.task.ReduceContextImpl.ReduceContextImpl(Configuration, TaskAttemptID, RawKeyValueIterator, Counter, Counter, RecordWriter<KEYOUT, VALUEOUT>, OutputCommitter, StatusReporter, RawComparator<KEYIN>, Class<KEYIN>, Class<VALUEIN>)
org.apache.hadoop.mapred.Task.ValuesIterator.ValuesIterator(RawKeyValueIterator, RawComparator<KEY>, Class<KEY>, Class<VALUE>, Configuration, Progressable)

Do you think we need to add a null check in all of these places?
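
One alternative to repeating the check at every caller, as a minimal sketch only: route the lookup through a single helper that turns a null return into a descriptive exception. Everything named below (SerializerChecks, FakeSerializationFactory, getSerializerOrFail) is hypothetical and stands in for the real Hadoop classes so the example runs on its own. It is not what the MAPREDUCE-2584 or MAPREDUCE-5728 patches do. The only Hadoop detail assumed is that serializations are configured under the 'io.serializations' key.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class SerializerChecks {

    /**
     * Hypothetical stand-in for SerializationFactory. Like the behaviour
     * described in this issue, it returns null when nothing is registered
     * for the requested class.
     */
    static class FakeSerializationFactory {
        private final Map<Class<?>, Object> serializers = new HashMap<>();

        void register(Class<?> c, Object serializer) {
            serializers.put(c, serializer);
        }

        Object getSerializer(Class<?> c) {
            return serializers.get(c); // null if not configured
        }
    }

    /**
     * Hypothetical helper: returns the serializer for c, or throws an
     * IOException that names the class instead of letting a later
     * dereference fail with a bare NullPointerException.
     */
    static Object getSerializerOrFail(FakeSerializationFactory factory, Class<?> c)
            throws IOException {
        Object serializer = factory.getSerializer(c);
        if (serializer == null) {
            throw new IOException("Could not find a serializer for class '"
                + c.getName() + "'. Check that 'io.serializations' is configured correctly.");
        }
        return serializer;
    }

    public static void main(String[] args) {
        FakeSerializationFactory factory = new FakeSerializationFactory();
        factory.register(String.class, "string-serializer");
        try {
            System.out.println(getSerializerOrFail(factory, String.class));
            getSerializerOrFail(factory, Integer.class); // nothing registered
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a helper like this, each caller listed above would change one call instead of gaining its own null check, and the error text would stay consistent across MapTask, IFile.Writer and the split classes.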


> Check NPE for serializer/deserializer in MapTask
> ------------------------------------------------
>
>                 Key: MAPREDUCE-5728
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5728
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 2.2.0
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Minor
>             Fix For: 2.3.0, 2.2.1
>
>         Attachments: MAPREDUCE-5728-trunk.patch
>
>
> Currently we will get NPE if the serializer/deserializer is not configured correctly.
> {code}
> 14/01/14 11:52:35 INFO mapred.JobClient: Task Id : attempt_201401072154_0027_m_000002_2, Status : FAILED
> java.lang.NullPointerException
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:944)
>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:672)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:740)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:368)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(AccessController.java:362)
>         at javax.security.auth.Subject.doAs(Subject.java:573)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> serializationFactory.getSerializer and serializationFactory.getDeserializer return null in this case.
> Let's check for a null serializer/deserializer in MapTask so that we don't get a meaningless NPE.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
