hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-7183) WritableComparator.get should not cache comparator objects
Date Tue, 15 Mar 2011 20:50:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007172#comment-13007172

Todd Lipcon commented on HADOOP-7183:

bq. This behavior wasn't changed by HADOOP-6881

It looks to me like it was. Prior to 6881, if nothing was in the cache, it did not "write
back" to the cache in WritableComparator.get(). 6881 changed it to add {{comparators.put}}

bq. Have you actually verified that the shuffle's sorts actually do a get per a thread?

Not in the latest shuffle code, but it looked that way to me previously.

Using ThreadLocals seems like a reasonable alternate implementation. But we should still make
the contract explicit that cached comparators must be threadsafe (yes, it's fairly obvious,
but we seem to have missed it ourselves!)

> WritableComparator.get should not cache comparator objects
> ----------------------------------------------------------
>                 Key: HADOOP-7183
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7183
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Tom White
>            Priority: Blocker
>             Fix For: 0.20.3, 0.21.1, 0.22.0
>         Attachments: HADOOP-7183.patch
> HADOOP-6881 modified WritableComparator.get such that the constructed WritableComparator
gets saved back into the static map. This is fine for stateless comparators, but some comparators
have per-instance state, and thus this becomes thread-unsafe and causes errors in the shuffle
where multiple threads are doing comparisons. An example of a Comparator with per-instance
state is WritableComparator itself.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message