crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-183) Reservoir sampling functions don't take object reuse into account
Date Sun, 24 Mar 2013 17:13:15 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612154#comment-13612154 ]

Josh Wills commented on CRUNCH-183:
-----------------------------------

I am the worst about that-- thanks Gabriel. +1.
                
> Reservoir sampling functions don't take object reuse into account
> -----------------------------------------------------------------
>
>                 Key: CRUNCH-183
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-183
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-183.patch
>
>
> ReservoirSampleFn and WRSCombineFn in o.a.c.lib.SampleUtils both hold onto references
> of processed values, but don't make deep copies of them. For complex objects such as Avro
> objects, this leads to incorrect results, with the same value being returned for all samples.
> This can be resolved by making use of PType#getDetachedValue before storing a reference
> to the object.
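
The failure mode described above can be reproduced without Crunch. The following is a minimal sketch, not the code from CRUNCH-183.patch: `Datum`, `sample`, and the `detach` parameter are hypothetical names introduced for illustration, and `detach` only stands in for the role that PType#getDetachedValue plays in the description. It assumes the framework hands every input to the sampler through a single reused mutable instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.UnaryOperator;

public class ReservoirReuseDemo {

    /** Hypothetical mutable value standing in for a reused Avro-style object. */
    static final class Datum {
        int id;
        Datum(int id) { this.id = id; }
    }

    /**
     * Reservoir sampling (Algorithm R) over ids 0..n-1. Every input is read
     * through one reused Datum instance; `detach` is applied before a value
     * is stored in the reservoir (assumption: it mimics a deep copy).
     */
    static List<Datum> sample(int n, int k, long seed, UnaryOperator<Datum> detach) {
        Random rng = new Random(seed);
        List<Datum> reservoir = new ArrayList<>(k);
        Datum reused = new Datum(-1);      // the single, framework-owned instance
        for (int i = 0; i < n; i++) {
            reused.id = i;                 // overwritten in place for each input
            if (i < k) {
                reservoir.add(detach.apply(reused));
            } else {
                int j = rng.nextInt(i + 1);
                if (j < k) {
                    reservoir.set(j, detach.apply(reused));
                }
            }
        }
        return reservoir;
    }

    /** Counts how many different ids the sample contains. */
    static long distinctIds(List<Datum> sampled) {
        return sampled.stream().map(d -> d.id).distinct().count();
    }

    public static void main(String[] args) {
        // Storing the reference directly: every slot points at the same object.
        List<Datum> withoutCopy = sample(100, 5, 42L, d -> d);
        // Copying before storing: each slot keeps the value it was given.
        List<Datum> withCopy = sample(100, 5, 42L, d -> new Datum(d.id));
        System.out.println("distinct ids without copy: " + distinctIds(withoutCopy));
        System.out.println("distinct ids with copy:    " + distinctIds(withCopy));
    }
}
```

Without the copy, all five reservoir slots refer to the one reused instance, so the sample contains a single distinct id (the last input seen). With the copy, the five slots hold five distinct ids.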

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
