hadoop-mapreduce-issues mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1126) shuffle should use serialization to get comparator
Date Tue, 08 Dec 2009 23:55:18 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787812#action_12787812 ]

Doug Cutting commented on MAPREDUCE-1126:
-----------------------------------------

I wonder whether, instead of adding more methods to JobContext, we ought to add them to the relevant
serialization implementations.  For example, we might have WritableSerialization.setMapOutputKeyClass(Class)
and AvroSerialization.setMapOutputKeySchema(Schema).  This perhaps makes the methods harder
for folks to find, but it bakes less into JobContext.  The serialization system is entirely
user code, so it seems reasonable that the kernel should not directly support it.  With this,
JobContext would only have serialization-agnostic methods like get/setMapOutputKeySerializationMetadata()
and getMapOutputKeySerializer().
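
The split proposed above can be illustrated with a minimal, self-contained sketch. Nothing here is real Hadoop or Avro code: the Sketch* classes are hypothetical stand-ins, the metadata is assumed to be a plain string map, the key names "serialization", "class" and "schema" are invented for illustration, and the Avro schema is passed as a JSON string rather than an Avro Schema object so that the example has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobContext. It knows nothing about Writable or
// Avro; it only stores opaque metadata describing the map output key.
final class SketchJobContext {
    private final Map<String, String> mapOutputKeyMetadata = new HashMap<>();

    void setMapOutputKeySerializationMetadata(Map<String, String> metadata) {
        mapOutputKeyMetadata.clear();
        mapOutputKeyMetadata.putAll(metadata);
    }

    Map<String, String> getMapOutputKeySerializationMetadata() {
        // Return a copy so callers cannot mutate the job's state directly.
        return new HashMap<>(mapOutputKeyMetadata);
    }
}

// Hypothetical stand-in for WritableSerialization: the class-based setter
// lives with the serialization, not on the job context.
final class SketchWritableSerialization {
    static void setMapOutputKeyClass(SketchJobContext job, Class<?> keyClass) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("serialization", SketchWritableSerialization.class.getName());
        metadata.put("class", keyClass.getName());
        job.setMapOutputKeySerializationMetadata(metadata);
    }
}

// Hypothetical stand-in for AvroSerialization: the schema-based setter.
// Assumption: the schema is carried as its JSON text.
final class SketchAvroSerialization {
    static void setMapOutputKeySchema(SketchJobContext job, String schemaJson) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("serialization", SketchAvroSerialization.class.getName());
        metadata.put("schema", schemaJson);
        job.setMapOutputKeySerializationMetadata(metadata);
    }
}
```

A caller would write SketchWritableSerialization.setMapOutputKeyClass(job, String.class) or SketchAvroSerialization.setMapOutputKeySchema(job, "\"string\""); in both cases the job context only ever sees the generic metadata map, which is the point of the proposal.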


> shuffle should use serialization to get comparator
> --------------------------------------------------
>
>                 Key: MAPREDUCE-1126
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1126
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>            Reporter: Doug Cutting
>
> Currently the key comparator is defined as a Java class.  Instead we should use the Serialization
> API to create key comparators.  This would permit, e.g., Avro-based comparators to be used,
> permitting efficient sorting of complex data types without having to write a RawComparator
> in Java.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

