hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3380) need comparators in serializer framework
Date Tue, 13 May 2008 18:27:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596484#action_12596484 ]

Doug Cutting commented on HADOOP-3380:

> What's the relationship between this proposal and JobConf#getOutputValueGroupingComparator()
> and JobConf#getOutputKeyComparator()?

Those are ways to override the "natural" (or default) comparator under MapReduce.  This proposal
is about defining the natural comparator.  If we had a good configurable comparator, then
we perhaps wouldn't need those methods, but I'm not sure...  The framework might set io.comparator.context=grouping,
and then the configurable comparator implementation could use this to decide to use the user-specified
value of io.record.compare.grouping or somesuch.  Yuck!
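
To make the idea above concrete, here is a minimal sketch of such a configurable comparator. It is an illustration under stated assumptions, not the proposed implementation: the RawComparator interface below is a local stand-in for Hadoop's, a plain Map stands in for the job configuration, and the value of io.record.compare.grouping is assumed (purely for the example) to be a key-prefix length in bytes. The two property names are the hypothetical ones mentioned above.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Local stand-in for Hadoop's RawComparator: compares serialized bytes. */
interface RawComparator<T> extends Comparator<T> {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}

/** "Natural" order for this sketch: unsigned lexicographic byte order. */
class BytesComparator implements RawComparator<byte[]> {
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) return a - b;
        }
        return l1 - l2;
    }

    public int compare(byte[] x, byte[] y) {
        return compare(x, 0, x.length, y, 0, y.length);
    }
}

/** Hypothetical grouping order: only the first `prefix` bytes are compared. */
class PrefixComparator extends BytesComparator {
    private final int prefix;

    PrefixComparator(int prefix) { this.prefix = prefix; }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return super.compare(b1, s1, Math.min(l1, prefix),
                             b2, s2, Math.min(l2, prefix));
    }
}

/** Chooses its behaviour from configuration, as speculated above. */
class ConfigurableComparator implements RawComparator<byte[]> {
    private final RawComparator<byte[]> delegate;

    ConfigurableComparator(Map<String, String> conf) {
        String context = conf.getOrDefault("io.comparator.context", "natural");
        String grouping = conf.get("io.record.compare.grouping");
        if ("grouping".equals(context) && grouping != null) {
            // Assumption: the user-specified value is a prefix length in bytes.
            delegate = new PrefixComparator(Integer.parseInt(grouping));
        } else {
            delegate = new BytesComparator();
        }
    }

    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return delegate.compare(b1, s1, l1, b2, s2, l2);
    }

    public int compare(byte[] x, byte[] y) {
        return delegate.compare(x, y);
    }
}
```

With an empty configuration the comparator behaves like the natural one. Once the framework sets io.comparator.context=grouping and the user supplies a prefix length, keys that share that prefix compare as equal. The sketch also shows why this is awkward: the comparator's meaning now depends on state that the framework sets behind the user's back.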

BTW, those methods should both be altered to return RawComparator, not WritableComparator.

> need comparators in serializer framework
> ----------------------------------------
>                 Key: HADOOP-3380
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3380
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Doug Cutting
> The new serialization framework permits Hadoop to incorporate different serialization
> systems, including Hadoop's Writable, Thrift, Java Serialization, etc.  It provides a generic,
> extensible means (SerializationFactory) to create serializers and deserializers for arbitrary
> Java classes.  However, it does not include a generic means to create comparators for these
> classes.  Comparators are required for MapReduce keys and many other computations.  Thus we
> should enhance the serialization framework to provide comparators too.
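
One way to picture the request is a factory that hands out comparators the same way SerializationFactory hands out serializers: ask each registered source whether it accepts a class, and use the first that does. The sketch below only illustrates that shape. ComparatorSource, NaturalOrderSource and ComparatorFactory are hypothetical names, not part of Hadoop, and java.util.Comparator stands in for a raw (byte-level) comparator to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical plug-in point: something that can supply comparators for some classes. */
interface ComparatorSource {
    boolean accept(Class<?> c);
    <T> Comparator<T> getComparator(Class<T> c);
}

/** Example source: handles any class that implements Comparable. */
class NaturalOrderSource implements ComparatorSource {
    public boolean accept(Class<?> c) {
        return Comparable.class.isAssignableFrom(c);
    }

    @SuppressWarnings("unchecked")
    public <T> Comparator<T> getComparator(Class<T> c) {
        // Safe in practice because accept() checked that c is Comparable.
        return (a, b) -> ((Comparable<Object>) a).compareTo(b);
    }
}

/** Hypothetical factory: returns a comparator from the first source that accepts the class. */
class ComparatorFactory {
    private final List<ComparatorSource> sources = new ArrayList<>();

    ComparatorFactory add(ComparatorSource source) {
        sources.add(source);
        return this;
    }

    <T> Comparator<T> getComparator(Class<T> c) {
        for (ComparatorSource source : sources) {
            if (source.accept(c)) {
                return source.getComparator(c);
            }
        }
        throw new IllegalArgumentException("No comparator for " + c.getName());
    }
}
```

A caller would register one source per serialization system and then ask the factory for a comparator by key class, without knowing which system the class belongs to.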

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
