hadoop-common-dev mailing list archives

From "Chris K Wensel (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3380) need comparators in serializer framework
Date Tue, 13 May 2008 18:49:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596492#action_12596492 ]

Chris K Wensel commented on HADOOP-3380:

> BTW, those methods should both be altered to return RawComparator, not a WritableComparator,

I expect so.

Consider a key of type Tuple (a WritableComparable type) that holds an arbitrary list of
WritableComparable instances.

If I want the fine-grained ability to compare/sort these keys based on runtime configuration,
I think I would be happy providing a Configurable RawComparator class to the JobConf
during job setup.
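
A minimal sketch of that first option, under stated assumptions: the RawComparator interface below is a local stand-in that only mirrors the shape of Hadoop's RawComparator (comparing serialized bytes in place), setConf takes a plain Map instead of a Hadoop Configuration, the tuple is assumed to be encoded as consecutive 4-byte big-endian ints, and the key name "tuple.sort.fields" is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.Map;

// Local stand-in mirroring the shape of Hadoop's RawComparator:
// compare two serialized records without deserializing them.
interface RawComparator<T> extends Comparator<T> {
    int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
}

// Hypothetical comparator for tuples encoded as consecutive 4-byte big-endian ints.
class ConfigurableTupleComparator implements RawComparator<int[]> {
    // Hypothetical configuration key: comma-separated field indexes, most significant first.
    static final String SORT_FIELDS_KEY = "tuple.sort.fields";

    private int[] sortFields = new int[0];

    // Stand-in for Configurable#setConf; a Map replaces Hadoop's Configuration here.
    public void setConf(Map<String, String> conf) {
        String spec = conf.getOrDefault(SORT_FIELDS_KEY, "");
        if (spec.isEmpty()) {
            sortFields = new int[0];
            return;
        }
        String[] parts = spec.split(",");
        sortFields = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            sortFields[i] = Integer.parseInt(parts[i].trim());
        }
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Read only the configured fields, in the configured order.
        for (int field : sortFields) {
            int v1 = ByteBuffer.wrap(b1, s1 + 4 * field, 4).getInt();
            int v2 = ByteBuffer.wrap(b2, s2 + 4 * field, 4).getInt();
            int c = Integer.compare(v1, v2);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    @Override
    public int compare(int[] a, int[] b) {
        byte[] ba = encode(a);
        byte[] bb = encode(b);
        return compare(ba, 0, ba.length, bb, 0, bb.length);
    }

    // Encodes a tuple in the fixed-width layout this sketch assumes.
    static byte[] encode(int[] tuple) {
        ByteBuffer buf = ByteBuffer.allocate(4 * tuple.length);
        for (int v : tuple) {
            buf.putInt(v);
        }
        return buf.array();
    }
}
```

With "tuple.sort.fields" set to "1,0", the second field is compared first and the first field breaks ties. The sketch does no bounds checking, so a field index beyond the encoded tuple would throw.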

Or are you suggesting that the best practice is to register a new TupleSerialization (one that
could subclass WritableSerialization and return my fancy TupleComparator)?

Or should I have a TupleSerialization decorator that delegates to a configurable 'base' Serialization
(Text, Thrift, Writable, JSON, etc) but overrides Serialization#getComparator()?
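
The decorator idea could look roughly like the sketch below. Everything in it is an assumption for illustration: SketchSerialization is a simplified stand-in, not Hadoop's Serialization interface (serialize() collapses the framework's serializer objects into one call), getComparator() is the hook under discussion in this issue rather than an existing method, and the class names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

// Simplified stand-in, NOT Hadoop's actual Serialization interface.
// getComparator() represents the hook proposed in this issue.
interface SketchSerialization<T> {
    boolean accept(Class<?> c);
    byte[] serialize(T value);
    Comparator<T> getComparator();
}

// Hypothetical 'base' serialization for Strings with natural ordering.
class StringSketchSerialization implements SketchSerialization<String> {
    public boolean accept(Class<?> c) {
        return String.class.isAssignableFrom(c);
    }

    public byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public Comparator<String> getComparator() {
        return Comparator.naturalOrder();
    }
}

// Decorator: delegates everything to the base serialization except the comparator.
class ComparatorOverride<T> implements SketchSerialization<T> {
    private final SketchSerialization<T> base;
    private final Comparator<T> comparator;

    ComparatorOverride(SketchSerialization<T> base, Comparator<T> comparator) {
        this.base = base;
        this.comparator = comparator;
    }

    public boolean accept(Class<?> c) {
        return base.accept(c);
    }

    public byte[] serialize(T value) {
        return base.serialize(value);
    }

    public Comparator<T> getComparator() {
        return comparator; // the only behavior that differs from the base
    }
}
```

The appeal of this shape is that the wire format stays whatever the base produces, so only the sort order changes when the decorator is registered in place of the base.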

Sorry, just trying to wrap my head around the proposed changes and their implications... I
still need to poke around and see the relationship with FileInput/OutputFormat classes...

> need comparators in serializer framework
> ----------------------------------------
>                 Key: HADOOP-3380
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3380
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Doug Cutting
> The new serialization framework permits Hadoop to incorporate different serialization
> systems, including Hadoop's Writable, Thrift, Java Serialization, etc.  It provides a generic,
> extensible means (SerializationFactory) to create serializers and deserializers for arbitrary
> Java classes.  However, it does not include a generic means to create comparators for these
> classes.  Comparators are required for MapReduce keys and many other computations.  Thus we
> should enhance the serialization framework to provide comparators too.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
