hadoop-common-issues mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-7183) WritableComparator.get should not cache comparator objects
Date Fri, 11 Mar 2011 19:25:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-7183:
------------------------------

    Attachment: HADOOP-7183.patch

The problem is that WritableComparator has a mutable field (DataInputBuffer buffer), which
is used only for WritableComparable implementations that *don't* override the optimized binary
compare method. IntWritable, Text, etc. override this method, so there is no thread-safety
issue for these.
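
To illustrate the distinction, here is a minimal sketch that does not use the Hadoop classes.
GenericIntComparator and RawIntComparator are hypothetical names, and a ByteBuffer stands in
for the DataInputBuffer field. The first comparator deserializes through a per-instance scratch
buffer, so sharing one instance between threads is unsafe; the second compares the serialized
bytes directly and holds no state.

```java
import java.nio.ByteBuffer;

public class ComparatorStateSketch {

    /** Generic path (hypothetical): reads both keys through one mutable buffer. */
    static class GenericIntComparator {
        // Per-instance scratch state, standing in for the DataInputBuffer field.
        private final ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);

        private int read(byte[] bytes, int offset) {
            buffer.clear();
            buffer.put(bytes, offset, Integer.BYTES);
            buffer.flip();
            return buffer.getInt();
        }

        int compare(byte[] b1, int s1, byte[] b2, int s2) {
            // Unsafe if two threads share this instance: both write to `buffer`.
            int left = read(b1, s1);
            int right = read(b2, s2);
            return Integer.compare(left, right);
        }
    }

    /** Optimized path (hypothetical): compares serialized bytes, no instance state. */
    static class RawIntComparator {
        int compare(byte[] b1, int s1, byte[] b2, int s2) {
            int left = ByteBuffer.wrap(b1, s1, Integer.BYTES).getInt();
            int right = ByteBuffer.wrap(b2, s2, Integer.BYTES).getInt();
            return Integer.compare(left, right);
        }
    }

    /** Serializes an int as four big-endian bytes. */
    static byte[] encode(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    public static void main(String[] args) {
        byte[] three = encode(3);
        byte[] seven = encode(7);
        // Single-threaded, both comparators agree; only the first is unsafe to share.
        System.out.println(new GenericIntComparator().compare(three, 0, seven, 0));
        System.out.println(new RawIntComparator().compare(three, 0, seven, 0));
    }
}
```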

The remedy is to only register comparators explicitly, i.e. not the "generic" ones, since
they may not be thread-safe. This is actually the behaviour that was in place before HADOOP-6881.
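
The remedy can be sketched roughly as follows. This is an illustration of the idea, not the
attached patch: the class, its define/get methods and GenericComparator are hypothetical
stand-ins written against plain java.util.Comparator. Only explicitly registered comparators
live in the shared map; the generic fallback is created fresh on every lookup and never stored.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ComparatorRegistrySketch {

    // Holds only comparators that were registered explicitly.
    private static final Map<Class<?>, Comparator<?>> REGISTERED = new ConcurrentHashMap<>();

    /** Registers a comparator; the caller promises that it is thread-safe. */
    static <T> void define(Class<T> type, Comparator<? super T> comparator) {
        REGISTERED.put(type, comparator);
    }

    /** Generic fallback (hypothetical) with per-instance state, so it is never shared. */
    static final class GenericComparator<T extends Comparable<T>> implements Comparator<T> {
        private int calls; // mutable state, standing in for a scratch buffer

        @Override
        public int compare(T a, T b) {
            calls++;
            return a.compareTo(b);
        }
    }

    /** Returns the registered comparator, or a new generic one that is not cached. */
    @SuppressWarnings("unchecked")
    static <T extends Comparable<T>> Comparator<T> get(Class<T> type) {
        Comparator<T> registered = (Comparator<T>) REGISTERED.get(type);
        if (registered != null) {
            return registered;
        }
        return new GenericComparator<T>(); // deliberately not saved back into the map
    }

    public static void main(String[] args) {
        define(Integer.class, Comparator.<Integer>naturalOrder());
        // Registered: the same shared instance comes back each time.
        System.out.println(get(Integer.class) == get(Integer.class));
        // Generic: each caller receives its own instance.
        System.out.println(get(String.class) == get(String.class));
    }
}
```

The cost of this choice is one extra allocation per lookup for unregistered types, in exchange
for never handing the same stateful object to two threads.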

I've also updated the javadoc for WritableComparator.define to clarify that it should only
be called for thread-safe classes.


> WritableComparator.get should not cache comparator objects
> ----------------------------------------------------------
>
>                 Key: HADOOP-7183
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7183
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.20.3, 0.21.1, 0.22.0
>
>         Attachments: HADOOP-7183.patch
>
>
> HADOOP-6881 modified WritableComparator.get such that the constructed WritableComparator
> gets saved back into the static map. This is fine for stateless comparators, but some comparators
> have per-instance state, and thus this becomes thread-unsafe and causes errors in the shuffle
> where multiple threads are doing comparisons. An example of a Comparator with per-instance
> state is WritableComparator itself.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
