hadoop-common-dev mailing list archives

From "Chris K Wensel (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3380) need comparators in serializer framework
Date Tue, 13 May 2008 18:49:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596492#action_12596492 ]

Chris K Wensel commented on HADOOP-3380:
----------------------------------------

> BTW, those methods should both be altered to return RawComparator, not a WritableComparator, no?

I expect so.

Consider a key of type Tuple (a WritableComparable type) that holds an arbitrary list of WritableComparable
instances.

If I want the fine-grained ability to compare and sort these keys based on runtime configuration,
I think I would be happy providing a Configurable RawComparator class to the JobConf
during job setup.
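
As a rough illustration of that first option, here is a minimal sketch of a comparator whose sort order is chosen by configuration. It does not depend on Hadoop: the nested RawComparator interface only mirrors the shape of Hadoop's raw compare method, a plain Map stands in for the job configuration, and the "tuple.sort.fields" property, the TupleComparator class, and the fixed 4-byte int encoding are all assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.Map;

public class ConfigurableTupleComparatorSketch {

    /** Stand-in that mirrors the shape of Hadoop's RawComparator; not the real interface. */
    interface RawComparator<T> extends Comparator<T> {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    /** Hypothetical comparator over tuples encoded as consecutive 4-byte big-endian ints. */
    static class TupleComparator implements RawComparator<int[]> {
        // Field indices to compare, in priority order; defaults to the first field.
        private int[] sortFields = {0};

        /** Reads the hypothetical "tuple.sort.fields" property, e.g. "1,0". */
        void setConf(Map<String, String> conf) {
            String[] parts = conf.getOrDefault("tuple.sort.fields", "0").split(",");
            sortFields = new int[parts.length];
            for (int i = 0; i < parts.length; i++) {
                sortFields[i] = Integer.parseInt(parts[i].trim());
            }
        }

        /** Compares serialized tuples without deserializing them. */
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            ByteBuffer left = ByteBuffer.wrap(b1, s1, l1);
            ByteBuffer right = ByteBuffer.wrap(b2, s2, l2);
            for (int field : sortFields) {
                // Absolute reads: indices are relative to the start of the backing array.
                int cmp = Integer.compare(left.getInt(s1 + 4 * field),
                                          right.getInt(s2 + 4 * field));
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        }

        /** Object comparison delegates to the raw comparison for consistency. */
        @Override
        public int compare(int[] a, int[] b) {
            byte[] left = encode(a);
            byte[] right = encode(b);
            return compare(left, 0, left.length, right, 0, right.length);
        }
    }

    /** Encodes a tuple of ints as consecutive 4-byte big-endian values. */
    static byte[] encode(int... fields) {
        ByteBuffer buf = ByteBuffer.allocate(4 * fields.length);
        for (int f : fields) {
            buf.putInt(f);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        TupleComparator comparator = new TupleComparator();
        comparator.setConf(Map.of("tuple.sort.fields", "1,0"));
        byte[] a = encode(5, 1);
        byte[] b = encode(2, 9);
        // Field 1 is compared first (1 vs 9), so the result is negative.
        System.out.println(comparator.compare(a, 0, a.length, b, 0, b.length));
    }
}
```

In a real job the comparator would implement Hadoop's own RawComparator and Configurable interfaces and be handed to the JobConf; the sketch only shows how the ordering can be driven by configuration rather than fixed by the key type.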

Or are you suggesting that the best practice is to register a new TupleSerialization (one that could
subclass WritableSerialization and return my fancy TupleComparator)?

Or should I have a TupleSerialization decorator that delegates to a configurable 'base' Serialization
(Text, Thrift, Writable, JSON, etc) but overrides Serialization#getComparator()?
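
To make the decorator option concrete, here is a minimal sketch under heavy assumptions. The nested Serialization interface is a simplified stand-in, not Hadoop's: it collapses the serializer into a single serialize() call, and the getComparator() signature is only a guess at what the proposed method might look like. StringSerialization and ComparatorOverridingSerialization are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

public class SerializationDecoratorSketch {

    /** Simplified stand-in for a Serialization with the proposed comparator hook. */
    interface Serialization<T> {
        boolean accept(Class<?> c);
        byte[] serialize(T value);      // collapses the real Serializer into one call
        Comparator<T> getComparator();  // assumed shape of the proposed method
    }

    /** A hypothetical 'base' serialization for Strings with natural ordering. */
    static class StringSerialization implements Serialization<String> {
        public boolean accept(Class<?> c) {
            return c == String.class;
        }

        public byte[] serialize(String value) {
            return value.getBytes(StandardCharsets.UTF_8);
        }

        public Comparator<String> getComparator() {
            return Comparator.<String>naturalOrder();
        }
    }

    /** Decorator: delegates everything to the base except getComparator(). */
    static class ComparatorOverridingSerialization<T> implements Serialization<T> {
        private final Serialization<T> base;
        private final Comparator<T> comparator;

        ComparatorOverridingSerialization(Serialization<T> base, Comparator<T> comparator) {
            this.base = base;
            this.comparator = comparator;
        }

        public boolean accept(Class<?> c) {
            return base.accept(c);
        }

        public byte[] serialize(T value) {
            return base.serialize(value);
        }

        public Comparator<T> getComparator() {
            return comparator;
        }
    }

    public static void main(String[] args) {
        Serialization<String> decorated = new ComparatorOverridingSerialization<String>(
                new StringSerialization(), Comparator.<String>reverseOrder());
        // The wire format still comes from the base; only the ordering is replaced.
        System.out.println(decorated.getComparator().compare("a", "b"));
    }
}
```

The appeal of this shape is that the byte format and the sort order vary independently: the same decorator could wrap any base serialization, at the cost of one more class to register.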

Sorry, just trying to wrap my head around the proposed changes and their implications... I
still need to poke around and see the relationship with FileInput/OutputFormat classes...



> need comparators in serializer framework
> ----------------------------------------
>
>                 Key: HADOOP-3380
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3380
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Doug Cutting
>
> The new serialization framework permits Hadoop to incorporate different serialization
systems, including Hadoop's Writable, Thrift, Java Serialization, etc.  It provides a generic,
extensible means (SerializationFactory) to create serializers and deserializers for arbitrary
Java classes.  However it does not include a generic means to create comparators for these
classes.  Comparators are required for MapReduce keys and many other computations.  Thus we
should enhance the serialization framework to provide comparators too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

