incubator-crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-96) Add secondary sort functionality to o.a.c.lib
Date Wed, 17 Oct 2012 19:24:04 GMT


Josh Wills commented on CRUNCH-96:

I ran into it on a machine learning project I was working on, and it seems to come up fairly
often in sessionization applications (e.g., group by user ID, sort events by timestamp), viz.,

Your point on naming is well taken: this isn't a total ordering on the keys, it's just a sort
on the values going into the reducer. Something more like GroupByKeyWithSecondarySort would
be more accurate (albeit more verbose). Recommendations?

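For illustration, the semantics described above (group by key, with only the values inside each group ordered) could be sketched in plain Java roughly as follows. This is an in-memory sketch only, not the Crunch API and not the code in CRUNCH-96.patch; the names SecondarySortSketch, Event, and groupWithSecondarySort are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SecondarySortSketch {

    /** Hypothetical event: the grouping key (userId) plus the secondary sort field (timestamp). */
    static final class Event {
        final String userId;
        final long timestamp;
        final String action;

        Event(String userId, long timestamp, String action) {
            this.userId = userId;
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    /**
     * Groups events by userId and orders the events within each group by timestamp.
     * The sort uses the composite (userId, timestamp), but grouping looks at userId
     * alone, which mimics what the reducer sees in a secondary sort.
     */
    static Map<String, List<Event>> groupWithSecondarySort(List<Event> events) {
        List<Event> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparing((Event e) -> e.userId)
                .thenComparingLong(e -> e.timestamp));

        // LinkedHashMap keeps groups in the order their keys first appear in the sorted list.
        Map<String, List<Event>> grouped = new LinkedHashMap<>();
        for (Event e : sorted) {
            grouped.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("u2", 30L, "logout"),
                new Event("u1", 20L, "click"),
                new Event("u1", 10L, "login"));

        // Print each user's events in timestamp order.
        for (Map.Entry<String, List<Event>> entry : groupWithSecondarySort(events).entrySet()) {
            for (Event e : entry.getValue()) {
                System.out.println(entry.getKey() + " " + e.timestamp + " " + e.action);
            }
        }
    }
}
```

In a real MapReduce job the same effect would come from sorting on a composite key while partitioning and grouping on the user ID alone, so that no group has to be buffered and sorted in memory; the sketch above only shows the resulting ordering.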
> Add secondary sort functionality to o.a.c.lib
> ---------------------------------------------
>                 Key: CRUNCH-96
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core, MapReduce Patterns
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>             Fix For: 0.4.0
>         Attachments: CRUNCH-96.patch
>
> I've been working on a problem that required a secondary sorting pattern that was very
> similar to the example that Alex Kozlov created in CRUNCH-78, so it would be good to extract
> the pattern from the example and move it to o.a.c.lib so it can be easily available to clients.
