crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-501) Object reuse issue in combineValues(Aggregator)
Date Tue, 24 Feb 2015 07:55:12 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334557#comment-14334557 ]

Gabriel Reid commented on CRUNCH-501:
-------------------------------------

+1 to the patch. It seems unfortunate that Aggregators.toCombineFn is public (and so necessitates
deprecation), but I guess there's no easy way around that.

About providing the PType for every DoFn, one problem I can see involves shared instances
of DoFns (IdentityFn.getInstance() for example): the same instance is used with multiple
PTypes, so there is no single PType to hand to it. I'm not sure how often that would come
up, but I believe it would break at least for IdentityFn.
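
As a rough illustration of the shared-instance concern, here is a minimal sketch that does not
use Crunch at all. IdentityLikeFn, setType and getType are hypothetical names made up for this
example, and a String stands in for a PType; nothing here is an existing Crunch API.

```java
public class SharedFnDemo {
    /** Hypothetical stand-in for a singleton DoFn that is handed one type descriptor. */
    public static final class IdentityLikeFn {
        private static final IdentityLikeFn INSTANCE = new IdentityLikeFn();

        // Stand-in for a PType (assumption): there is only one slot per instance.
        private String typeName;

        private IdentityLikeFn() { }

        public static IdentityLikeFn getInstance() { return INSTANCE; }

        public void setType(String typeName) { this.typeName = typeName; }

        public String getType() { return typeName; }
    }

    public static void main(String[] args) {
        // Two logically separate uses of the function, each with its own type.
        IdentityLikeFn forStrings = IdentityLikeFn.getInstance();
        forStrings.setType("strings");

        IdentityLikeFn forLongs = IdentityLikeFn.getInstance();
        forLongs.setType("longs");

        // Both variables point at the same object, so the second type wins.
        System.out.println(forStrings == forLongs); // prints true
        System.out.println(forStrings.getType());   // prints longs
    }
}
```

Any scheme that stores one PType per DoFn instance would presumably have to stop sharing such
instances or hold the type somewhere else.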

[~aj987], is the use case for having access to the PType related only to detaching values,
or are there other reasons to need access to a PType in a DoFn?

> Object reuse issue in combineValues(Aggregator)
> -----------------------------------------------
>
>                 Key: CRUNCH-501
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-501
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Brandon Davis
>         Attachments: CRUNCH-501.patch
>
>
> I'm trying to use combineValues on a PGroupedTable. I am using Aggregators.FIRST_N. If
> I have 20 keys in my PGroupedTable, then I only get 20 distinct values because the AggregatorCombineFn
> and FirstNAggregator don't detach the values from the incoming iterator.
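
The reported behaviour can be approximated without Crunch. The sketch below assumes the
incoming iterator refills one mutable object on each call to next(), which is what "object
reuse" refers to here. Box, reusingIterator and firstN are hypothetical names for this
example, and Box.copy() stands in for detaching a value; this is not the code from
CRUNCH-501.patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ObjectReuseDemo {
    /** Mutable holder standing in for a reusable record type (hypothetical). */
    public static final class Box {
        String value;

        /** Stand-in for detaching: returns an independent copy. */
        Box copy() {
            Box b = new Box();
            b.value = value;
            return b;
        }
    }

    /** Iterator that overwrites and returns the same Box on every next(). */
    public static Iterator<Box> reusingIterator(List<String> raw) {
        final Box shared = new Box();
        final Iterator<String> it = raw.iterator();
        return new Iterator<Box>() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public Box next() { shared.value = it.next(); return shared; }
        };
    }

    /** Keeps the first n values seen; copies each kept value only if detach is true. */
    public static List<String> firstN(Iterator<Box> values, int n, boolean detach) {
        List<Box> kept = new ArrayList<>();
        while (values.hasNext()) {
            Box b = values.next();
            if (kept.size() < n) {
                kept.add(detach ? b.copy() : b);
            }
        }
        List<String> out = new ArrayList<>();
        for (Box b : kept) {
            out.add(b.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        // Without detaching, every kept entry is the one shared object.
        System.out.println(firstN(reusingIterator(raw), 2, false)); // prints [c, c]
        // With detaching, the kept entries are independent copies.
        System.out.println(firstN(reusingIterator(raw), 2, true));  // prints [a, b]
    }
}
```

Without the copy, each key ends up holding N references to a single object, which matches the
symptom of getting only one distinct value per key.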



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
