crunch-dev mailing list archives

From "Chao Shi (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-280) Specify Comparator for total order sort
Date Thu, 17 Oct 2013 08:42:44 GMT


Chao Shi commented on CRUNCH-280:

The difficulty is that MR requires a RawComparator, which compares two buffers of serialized
records, and that is not easy for users to work with. It would be nice to support:
1) RawComparator: the most efficient option, but users must know the serialization format
2) a normal Comparator class (with extra record (de)serialization overhead)
3) a serializable Comparator object, whose in-memory state is serialized and shipped to MR
workers (also with (de)serialization overhead)
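
To make the difference between 1) and 2) concrete, here is a rough sketch in plain Java. It is not Crunch or Hadoop code: RawBytesComparator is a hypothetical stand-in that only mirrors the shape of Hadoop's RawComparator#compare over byte ranges, the record format is assumed to be a 4-byte big-endian int, and adapt() assumes that a deserializer function is already available, which is exactly the open question for 2).

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.function.Function;

public class ComparatorSketch {
    /** Hypothetical interface; it only mirrors the byte-range compare of Hadoop's RawComparator. */
    public interface RawBytesComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    // Option 1: compare the serialized bytes directly.
    // Assumption: every record is a single 4-byte big-endian int.
    public static final RawBytesComparator RAW_INT = (b1, s1, l1, b2, s2, l2) ->
        Integer.compare(ByteBuffer.wrap(b1, s1, l1).getInt(),
                        ByteBuffer.wrap(b2, s2, l2).getInt());

    // Option 2: wrap an ordinary Comparator. Both records are deserialized on
    // every comparison, which is where the extra overhead comes from.
    public static <T> RawBytesComparator adapt(Comparator<T> cmp,
                                               Function<ByteBuffer, T> deserializer) {
        return (b1, s1, l1, b2, s2, l2) -> cmp.compare(
            deserializer.apply(ByteBuffer.wrap(b1, s1, l1)),
            deserializer.apply(ByteBuffer.wrap(b2, s2, l2)));
    }

    // Helper for the example: serialize an int in the assumed format.
    public static byte[] encode(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }
}
```

With this shape, a user-supplied Comparator<Integer> could be passed as adapt(cmp, buf -> buf.getInt()), while RAW_INT never materializes the records at all.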

I found that 2) and 3) are not easy to implement, because I don't know how to deserialize
the data at runtime. Is it possible, [~jwills]?
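
For 3), the part that carries the comparator's in-memory state to the workers could be sketched with plain Java serialization, as below. DirectionalComparator and roundTrip are hypothetical names used only for illustration; how the bytes would actually reach MR workers (for example through the job configuration) is not shown and is left as an assumption. This does not answer how the records themselves get deserialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

public class SerializableComparatorSketch {
    /** Hypothetical comparator whose state (the sort direction) must survive shipping. */
    public static class DirectionalComparator implements Comparator<Integer>, Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean ascending;

        public DirectionalComparator(boolean ascending) {
            this.ascending = ascending;
        }

        @Override
        public int compare(Integer a, Integer b) {
            int c = Integer.compare(a, b); // always -1, 0 or 1, so negating is safe
            return ascending ? c : -c;
        }
    }

    // Serialize the object to bytes and read it back, as a worker would after
    // receiving the bytes. Checked exceptions are wrapped to keep callers simple.
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("comparator could not be round-tripped", e);
        }
    }
}
```

The restored comparator keeps the direction it was constructed with, so the client-side configuration and the worker-side ordering stay in agreement.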

> Specify Comparator for total order sort
> ---------------------------------------
>                 Key: CRUNCH-280
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chao Shi
>            Assignee: Chao Shi
> It seems that Sort#sort can only use the default comparator. It would be nice to let
> clients specify the comparator.

This message was sent by Atlassian JIRA
