hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5571) TupleWritable can return incorrect results if it contains more than 32 values
Date Thu, 26 Mar 2009 20:56:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689653#action_12689653 ]

Chris Douglas commented on HADOOP-5571:
---------------------------------------

Sorry, I don't know what I was thinking about the unit test. +1 on the patch.

bq. I was also thinking of raising a separate JIRA on replacing the written field in
TupleWritable with a java.util.BitSet so that you can do joins over 64 datasets - do you
have an opinion on this?

It might motivate some long-deferred work on memory consumption as well, but I think that's
a good idea.

> TupleWritable can return incorrect results if it contains more than 32 values
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-5571
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5571
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.1
>            Reporter: Jingkei Ly
>            Assignee: Jingkei Ly
>         Attachments: HADOOP-5571-1.patch
>
>
> When attempting to do an outer join on 45 files with the CompositeInputFormat, I've been
> encountering unexpected results in the TupleWritable returned by the record reader. On closer
> inspection, it seems to be because TupleWritable.setWritten(int) is incorrectly setting some
> tuple positions as written, i.e. when you call setWritten(42), it also sets position 10.
> The following JUnit test demonstrates the problem:
> {code}
>   public void testWideTuple() throws Exception {
>     Text emptyText = new Text("Should be empty");
>     Writable[] values = new Writable[64];
>     Arrays.fill(values,emptyText);
>     values[42] = new Text("Number 42");
>                                      
>     TupleWritable tuple = new TupleWritable(values);
>     tuple.setWritten(42);
>     
>     for (int pos=0; pos<tuple.size();pos++) {
>       boolean has = tuple.has(pos);
>       if (pos == 42) {
>         assertTrue(has);
>       }
>       else {
>         assertFalse("Tuple position is incorrectly labelled as set: " + pos, has);
>       }
>     }
>   }
> {code}
> Similarly, TupleWritable.setWritten(9) also causes TupleWritable.has(41) to incorrectly
> return true.
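
The pairs reported above (42 with 10, 9 with 41) differ by exactly 32, which is what Java
produces when a bit mask is built by shifting an int: for int operands only the low five
bits of the shift distance are used. The sketch below reproduces that effect under the
assumption that the mask is computed from an int literal; the setWritten implementation is
not shown in this message, and ShiftWrapDemo, intMask and longMask are hypothetical names.

```java
class ShiftWrapDemo {
    // Assumed faulty form: 1 is an int, so the shift distance is taken modulo 32
    // before the result is widened to long.
    static long intMask(int pos) {
        return 1 << pos;
    }

    // Long literal: the shift distance is taken modulo 64, so positions 0..63 are distinct.
    static long longMask(int pos) {
        return 1L << pos;
    }

    public static void main(String[] args) {
        long written = 0L;
        written |= intMask(42);                              // actually sets bit 10
        System.out.println((written & intMask(10)) != 0);    // true

        written = 0L;
        written |= longMask(42);                             // sets bit 42 only
        System.out.println((written & longMask(10)) != 0);   // false
        System.out.println((written & longMask(42)) != 0);   // true
    }
}
```
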

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.