crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-525) The ExtractKeyFn is has an incorrect scale factor
Date Fri, 22 May 2015 07:17:17 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555729#comment-14555729
] 

Gabriel Reid commented on CRUNCH-525:
-------------------------------------

The changes to CompositeMapFn and ExtractKeyFn look good to me, but I don't fully get the
logic for the PairMapFn scale factor (max of the key and value scale factors). 

I assume it's impossible to do something that is really correct for this calculation, but
my first guess would be something like the mean of the key and value MapFn scale factors
(which is probably even less correct). I also realize that I'm totally bike-shedding by bringing
this up. :-)
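
To make the two options concrete, here is a minimal sketch comparing them. It is not the code from CRUNCH-525.patch: the class name, the method names, and the sample scale factors are all hypothetical, and it deliberately avoids depending on Crunch itself.

```java
// Illustrative only: compares two ways of combining the scale factors of a
// key MapFn and a value MapFn into one estimate for a pair-producing MapFn.
// Nothing here is taken from the actual patch.
public class PairScaleFactorSketch {

    // Option discussed for the patch: take the larger of the two factors.
    public static float maxOf(float keyScale, float valueScale) {
        return Math.max(keyScale, valueScale);
    }

    // Alternative floated in the comment: the arithmetic mean of the two.
    public static float meanOf(float keyScale, float valueScale) {
        return (keyScale + valueScale) / 2.0f;
    }

    public static void main(String[] args) {
        float keyScale = 1.0f;   // hypothetical key MapFn scale factor
        float valueScale = 3.0f; // hypothetical value MapFn scale factor
        System.out.println("max:  " + maxOf(keyScale, valueScale));
        System.out.println("mean: " + meanOf(keyScale, valueScale));
    }
}
```

The two estimates agree only when both factors are equal. Otherwise the max is the more pessimistic (larger) size estimate, which may be the safer choice if underestimating output size is considered worse than overestimating it.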

> The ExtractKeyFn is has an incorrect scale factor
> -------------------------------------------------
>
>                 Key: CRUNCH-525
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-525
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Stephen Patel
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-525.patch
>
>
> The ExtractKeyFn[0] used by the by[1] method of the PCollectionImpl is using the default
> scale factor for a MapFn (1.0).  It should be using 1.0 + the scale factor of the wrapped
> MapFn, in order to be accurate.
> [0]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/fn/ExtractKeyFn.java
> [1]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/impl/dist/collect/PCollectionImpl.java#L270
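
One reading of the report above: the function emits the extracted key alongside the original value, so its output is roughly the input itself (factor 1.0) plus the key produced by the wrapped MapFn (the wrapped factor). The sketch below only illustrates that arithmetic. The class and method names are hypothetical, and this is not the ExtractKeyFn source.

```java
// Illustrative only: the proposed estimate "1.0 + scale factor of the
// wrapped MapFn" for a function that emits (key, original value) pairs.
public class ExtractKeyScaleSketch {

    // Default MapFn scale factor, as stated in the issue description.
    public static final float DEFAULT_MAP_FN_SCALE = 1.0f;

    // Proposed estimate: original value (1.0) plus the extracted key.
    public static float proposedScale(float wrappedScale) {
        return 1.0f + wrappedScale;
    }

    public static void main(String[] args) {
        float wrappedScale = 0.25f; // hypothetical: keys much smaller than values
        System.out.println("current:  " + DEFAULT_MAP_FN_SCALE);
        System.out.println("proposed: " + proposedScale(wrappedScale));
    }
}
```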



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
