hadoop-common-dev mailing list archives

From "Amir Youssefi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-3425) Partitioner happily accepts negative int number and data gets lost in Hadoop framework
Date Wed, 21 May 2008 00:10:55 GMT
Partitioner happily accepts negative int number and data gets lost in Hadoop framework
--------------------------------------------------------------------------------------

                 Key: HADOOP-3425
                 URL: https://issues.apache.org/jira/browse/HADOOP-3425
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Amir Youssefi


When a Partitioner returns a negative partition number, the framework happily accepts it.
The data goes to the wrong location and (many) reducers get zero data. Suggested resolutions:

 1) Prevent the problem from the start: the partitioner checks the range and throws an
exception if the partition number is out of range.

 2) Have a more generic check: compare counters to see whether all data gets past the Shuffle
stage, i.e. nothing leaks. Per feedback we got from Owen, this idea gets a bit complicated once
combiners are taken into account.
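
A minimal sketch of resolution 1, assuming the check runs on the value a partitioner returns
before it is used to pick a reducer. The class and method names (PartitionRangeCheck,
checkedPartition) are hypothetical and not part of Hadoop; where such a check would live in
the framework is left open.

```java
// Hypothetical helper, not part of Hadoop: validates a partition number
// before it would be used to select a reducer.
final class PartitionRangeCheck {

    // Returns the partition unchanged if it lies in [0, numPartitions);
    // otherwise throws, so the task fails loudly instead of losing data.
    static int checkedPartition(int partition, int numPartitions) {
        if (partition < 0 || partition >= numPartitions) {
            throw new IllegalArgumentException(
                "Illegal partition " + partition
                + " (expected 0 <= partition < " + numPartitions + ")");
        }
        return partition;
    }

    public static void main(String[] args) {
        // A valid partition passes through untouched.
        System.out.println(checkedPartition(3, 10));
        // A negative partition is rejected with an exception.
        try {
            checkedPartition(-7, 10);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing the task on an out-of-range value trades a silent data loss for a visible error, which
is the intent of resolution 1.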

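 Example: my_id.hashCode() % numPartitions produces negative numbers whenever hashCode() is
negative, because the Java % operator keeps the sign of its left operand, and the data gets
lost in the framework. Reducers get zero rows (while the data is actually in partitions
indexed with negative numbers).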
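
The sign behaviour in the example can be reproduced without Hadoop. The sketch below is an
illustration only: the class and method names are hypothetical, and Integer.valueOf(-7) stands
in for a key whose hashCode() is negative. It contrasts the naive expression with two ways of
keeping the result non-negative. Masking the sign bit with Integer.MAX_VALUE is, to my
knowledge, what Hadoop's own HashPartitioner does; Math.floorMod (Java 8+) is an alternative.

```java
// Hypothetical demo class, not part of Hadoop.
final class NonNegativePartition {

    // Naive version from the report: negative whenever hashCode() is negative.
    static int naive(Object key, int numPartitions) {
        return key.hashCode() % numPartitions;
    }

    // Clears the sign bit before taking the remainder, so the result is
    // always in [0, numPartitions). Math.abs is not a safe substitute:
    // Math.abs(Integer.MIN_VALUE) is still negative.
    static int masked(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Floor modulus: the result takes the sign of the (positive) divisor.
    static int floorMod(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        Object key = Integer.valueOf(-7);   // Integer.hashCode() is the int value itself
        System.out.println(naive(key, 10));     // prints -7
        System.out.println(masked(key, 10));    // prints 1
        System.out.println(floorMod(key, 10));  // prints 3
    }
}
```

The two safe variants may map the same key to different partitions, which is fine as long as
one of them is used consistently within a job.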


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

