Date: Tue, 19 Mar 2013 12:15:15 +0000 (UTC)
From: "David Parks (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-9295) AbstractMapWritable throws exception when calling readFields() multiple times when the maps contain different class types

    [ https://issues.apache.org/jira/browse/HADOOP-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606271#comment-13606271 ]

David Parks commented on HADOOP-9295:
-------------------------------------

Your test missed the problem. You added two Text objects; you need to add 2 custom MapWritable objects to the map to trigger the problem.

The reason explained: notice this code in AbstractMapWritable:

    addToMap(ArrayWritable.class, Byte.valueOf(Integer.valueOf(-127).byteValue()));
    addToMap(BooleanWritable.class, Byte.valueOf(Integer.valueOf(-126).byteValue()));
    addToMap(BytesWritable.class, Byte.valueOf(Integer.valueOf(-125).byteValue()));
    addToMap(FloatWritable.class, Byte.valueOf(Integer.valueOf(-124).byteValue()));
    addToMap(IntWritable.class, Byte.valueOf(Integer.valueOf(-123).byteValue()));
    addToMap(LongWritable.class, Byte.valueOf(Integer.valueOf(-122).byteValue()));
    addToMap(MapWritable.class, Byte.valueOf(Integer.valueOf(-121).byteValue()));
    addToMap(MD5Hash.class, Byte.valueOf(Integer.valueOf(-120).byteValue()));
    addToMap(NullWritable.class, Byte.valueOf(Integer.valueOf(-119).byteValue()));
    addToMap(ObjectWritable.class, Byte.valueOf(Integer.valueOf(-118).byteValue()));
    addToMap(SortedMapWritable.class, Byte.valueOf(Integer.valueOf(-117).byteValue()));
    addToMap(Text.class, Byte.valueOf(Integer.valueOf(-116).byteValue()));
    addToMap(TwoDArrayWritable.class, Byte.valueOf(Integer.valueOf(-115).byteValue()));
    // UTF8 is deprecated so we don't support it
    addToMap(VIntWritable.class, Byte.valueOf(Integer.valueOf(-114).byteValue()));
    addToMap(VLongWritable.class, Byte.valueOf(Integer.valueOf(-113).byteValue()));

It adds the "typical" Writables to the class map by default, so any of these classes always maps correctly. This is probably why the problem was never noticed before now. Only when you add a Writable object that isn't already in this list does it have to add the class to the map, and thus encounter the bug.
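
To make the mechanism concrete, here is a minimal sketch of the id bookkeeping described above. It is a simplified model written for illustration, not Hadoop's code: the class ClassIdRegistrySketch, its methods, and the clearFirst flag are all hypothetical, String and Integer stand in for Text and IntWritable, and the assumption that newly seen classes receive positive ids starting at 1 is inferred from the "Id 1 exists" message in the report.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical model of the class<->id maps; not Hadoop's implementation. */
class ClassIdRegistrySketch {
    // Stand-ins for two custom Writable classes that are not pre-registered.
    static class ClassTypeOne {}
    static class ClassTypeTwo {}

    private final Map<Class<?>, Byte> classToId = new HashMap<>();
    private final Map<Byte, Class<?>> idToClass = new HashMap<>();
    private byte newClasses = 0;

    ClassIdRegistrySketch() {
        // Stand-ins for the built-in Writables registered with fixed negative ids.
        register(String.class, (byte) -116);
        register(Integer.class, (byte) -123);
    }

    /** An id may map to only one class; a conflicting registration is rejected. */
    void register(Class<?> clazz, byte id) {
        Class<?> existing = idToClass.get(id);
        if (existing != null && !existing.equals(clazz)) {
            throw new IllegalArgumentException("Id " + id + " exists but maps to "
                    + existing.getName() + " and not " + clazz.getName());
        }
        classToId.put(clazz, id);
        idToClass.put(id, clazz);
    }

    /** Writer side: a class that is not pre-registered gets the next positive id. */
    byte idFor(Class<?> clazz) {
        Byte id = classToId.get(clazz);
        if (id == null) {
            id = ++newClasses;
            register(clazz, id);
        }
        return id;
    }

    /** Reader side: install the class table carried by one serialized record. */
    void readClassTable(Map<Byte, Class<?>> table, boolean clearFirst) {
        if (clearFirst) {
            // Forget entries learned from earlier records; keep the built-in ones.
            idToClass.keySet().removeIf(id -> id > 0);
            classToId.values().removeIf(id -> id > 0);
            newClasses = 0;
        }
        for (Map.Entry<Byte, Class<?>> e : table.entrySet()) {
            register(e.getValue(), e.getKey());
        }
    }

    private static Map<Byte, Class<?>> tableOf(byte id, Class<?> clazz) {
        Map<Byte, Class<?>> table = new HashMap<>();
        table.put(id, clazz);
        return table;
    }

    /** Returns true if reading two records into one reused reader raises the conflict. */
    public static boolean collides(boolean clearFirst) {
        // Two independent writers each see one custom class, so each assigns it id 1.
        Map<Byte, Class<?>> recordA =
                tableOf(new ClassIdRegistrySketch().idFor(ClassTypeOne.class), ClassTypeOne.class);
        Map<Byte, Class<?>> recordB =
                tableOf(new ClassIdRegistrySketch().idFor(ClassTypeTwo.class), ClassTypeTwo.class);
        ClassIdRegistrySketch reader = new ClassIdRegistrySketch();
        try {
            reader.readClassTable(recordA, clearFirst);
            reader.readClassTable(recordB, clearFirst);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("conflict without clearing: " + collides(false));
        System.out.println("conflict with clearing:    " + collides(true));
    }
}
```

In this model, pre-registered classes never enter a record's class table, which is why a test built only from Text values cannot reach the conflicting registration.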

> AbstractMapWritable throws exception when calling readFields() multiple times when the maps contain different class types
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9295
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9295
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 1.0.3
>            Reporter: David Parks
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: MapWritableBugTest.java, test-hadoop-9295.patch
>
>
> Verified the trunk looks the same as 1.0.3 for this issue.
> When mappers output MapWritables with different class types, then they are read in on the Reducer via an iterator (multiple calls to readFields without instantiating a new object) you'll get this:
> java.lang.IllegalArgumentException: Id 1 exists but maps to org.me.ClassTypeOne and not org.me.ClassTypeTwo
>         at org.apache.hadoop.io.AbstractMapWritable.addToMap(AbstractMapWritable.java:73)
>         at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:201)
> It happens because AbstractMapWritable accumulates class type entries in its ClassType to ID (and vice versa) hashmaps.
> Those accumulating classtype-to-id hashmaps need to be cleared to support multiple calls to readFields().
> I've attached a JUnit test that both demonstrates the problem and contains an embedded, fixed version of MapWritable and ArrayMapWritable (note the //TODO comments in the code where it was fixed in 2 places).
> If there's a better way to submit this recommended bug fix, someone please feel free to let me know.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira