crunch-dev mailing list archives

From "Stephen Patel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-525) The ExtractKeyFn has an incorrect scale factor
Date Fri, 22 May 2015 13:27:17 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556136#comment-14556136 ]

Stephen Patel commented on CRUNCH-525:
--------------------------------------

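Yeah, I don't see how to get PairMapFn's scale to be correct. I would think it should be a
size-weighted average of the two wrapped scale factors, something like:

((keyMapFn.scale * keys.size) + (valMapFn.scale * vals.size))/(keys.size+vals.size)

but that's not possible, since the key and value sizes aren't known when the scale factor
is computed.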
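
For illustration only, the weighted average above can be sketched in plain Java. The class and method names here (WeightedScaleSketch, weightedScale) are hypothetical and not part of Crunch, and the sketch assumes the key and value sizes are somehow known, which is exactly what is not available in practice:

```java
// Hypothetical helper, not part of Crunch: computes the size-weighted
// average of two scale factors, mirroring the formula in the comment above.
final class WeightedScaleSketch {

    private WeightedScaleSketch() {
    }

    static float weightedScale(float keyScale, long keySize,
                               float valScale, long valSize) {
        long total = keySize + valSize;
        if (total == 0) {
            // Assumption: with no data, fall back to the default scale of 1.0.
            return 1.0f;
        }
        return (keyScale * keySize + valScale * valSize) / total;
    }
}
```

A call such as WeightedScaleSketch.weightedScale(2.0f, 10, 1.0f, 30) weights the value-side scale three times as heavily as the key-side scale.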

> The ExtractKeyFn has an incorrect scale factor
> ----------------------------------------------
>
>                 Key: CRUNCH-525
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-525
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Stephen Patel
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-525.patch
>
>
> The ExtractKeyFn[0] used by the by[1] method of the PCollectionImpl is using the default
> scale factor for a MapFn (1.0).  It should be using 1.0 + the scale factor of the wrapped
> MapFn, in order to be accurate.
> [0]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/fn/ExtractKeyFn.java
> [1]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/impl/dist/collect/PCollectionImpl.java#L270
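
A minimal sketch of the behavior the issue description asks for, using stand-in types rather than the real Crunch classes. SketchMapFn and SketchExtractKeyFn are hypothetical names, and the sketch assumes the function emits the extracted key alongside the unchanged input, so the output is roughly the input (1.0) plus whatever the wrapped function produces:

```java
import java.util.AbstractMap;
import java.util.Map;

// Stand-in for a map function with a size estimate; not the Crunch MapFn.
abstract class SketchMapFn<S, T> {
    abstract T map(S input);

    // Default estimate: output is about the same size as the input.
    float scaleFactor() {
        return 1.0f;
    }
}

// Stand-in for a key-extracting function: pairs each input with its key.
final class SketchExtractKeyFn<K, V> extends SketchMapFn<V, Map.Entry<K, V>> {
    private final SketchMapFn<V, K> keyFn;

    SketchExtractKeyFn(SketchMapFn<V, K> keyFn) {
        this.keyFn = keyFn;
    }

    @Override
    Map.Entry<K, V> map(V input) {
        return new AbstractMap.SimpleImmutableEntry<>(keyFn.map(input), input);
    }

    // The original value is kept (1.0) and the key is added on top of it.
    @Override
    float scaleFactor() {
        return 1.0f + keyFn.scaleFactor();
    }
}
```

With a wrapped function that reports a scale factor of 0.5, this stand-in reports 1.5 instead of the default 1.0.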



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
