hadoop-mapreduce-issues mailing list archives

From "Michael White (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-2597) Map-side joins always empty when an input has NullWritable value
Date Wed, 15 Jun 2011 18:36:48 GMT
Map-side joins always empty when an input has NullWritable value
----------------------------------------------------------------

                 Key: MAPREDUCE-2597
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2597
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.2
         Environment: Linux cluster
            Reporter: Michael White


It is common to feed a map-side join a sorted input whose keys have no associated value,
e.g. as an exact-match filter.  Such an input would typically use NullWritable as its value
class.  However, when performing a map-side join in Hadoop 0.20.2, we have found that if any
input has a value class of NullWritable, the Mapper is never called and the join output is
empty.  I found this with a 3-way map-side join, and my colleague tells me he ran into the
same issue.  I have not specifically tested a 2-way join, so the bug may be specific to n-way
joins for n>2 (though I suspect not).

The current workaround is to use some other value type (e.g. IntWritable) and stuff an arbitrary
value into it.

For a join, the value class should have no bearing on the set of keys that are considered
matching.
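
To make the expected behaviour concrete, here is a minimal sketch in plain Java of an n-way
inner join over sorted inputs.  It is not Hadoop's join code and does not use the Hadoop API;
the names SortedKeyJoin, Rec and NoValue are hypothetical, and NoValue only stands in for the
idea of a NullWritable value.  It assumes each input is sorted by key and has no duplicate
keys.  The point is that matching compares keys only and never looks at the value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedKeyJoin {
    /** Hypothetical stand-in for a "no value" record (the role NullWritable plays). */
    enum NoValue { INSTANCE }

    /** One record of a sorted input: a key plus a value the join never inspects. */
    static final class Rec {
        final String key;
        final Object value;
        Rec(String key, Object value) { this.key = key; this.value = value; }
    }

    /**
     * Returns the keys present in every input.
     * Assumes each input is sorted ascending by key with no duplicate keys.
     */
    static List<String> innerJoinKeys(List<List<Rec>> inputs) {
        List<String> matched = new ArrayList<>();
        if (inputs.isEmpty()) return matched;
        int[] pos = new int[inputs.size()];   // current read position per input
        while (true) {
            // Find the largest key among the current heads; stop if any input is exhausted.
            String max = null;
            for (int i = 0; i < inputs.size(); i++) {
                if (pos[i] >= inputs.get(i).size()) return matched;
                String k = inputs.get(i).get(pos[i]).key;
                if (max == null || k.compareTo(max) > 0) max = k;
            }
            // Advance every input until its head is >= that key.
            boolean allEqual = true;
            for (int i = 0; i < inputs.size(); i++) {
                List<Rec> in = inputs.get(i);
                while (pos[i] < in.size() && in.get(pos[i]).key.compareTo(max) < 0) pos[i]++;
                if (pos[i] >= in.size()) return matched;
                if (!in.get(pos[i]).key.equals(max)) allEqual = false;
            }
            // Only keys decide a match; values (including NoValue) are ignored.
            if (allEqual) {
                matched.add(max);
                for (int i = 0; i < pos.length; i++) pos[i]++;
            }
        }
    }

    public static void main(String[] args) {
        List<Rec> a = Arrays.asList(new Rec("a", 1), new Rec("b", 2), new Rec("c", 3));
        // Filter-style input: keys only, no meaningful value.
        List<Rec> filter = Arrays.asList(new Rec("b", NoValue.INSTANCE),
                                         new Rec("c", NoValue.INSTANCE),
                                         new Rec("d", NoValue.INSTANCE));
        List<Rec> c = Arrays.asList(new Rec("b", "x"), new Rec("c", "y"), new Rec("e", "z"));
        System.out.println(innerJoinKeys(Arrays.asList(a, filter, c)));  // prints [b, c]
    }
}
```

In this sketch, replacing the NoValue values with arbitrary integers (the workaround described
above) leaves the result unchanged, which is the behaviour I would expect from the map-side
join as well.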

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
