incubator-crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-90) Object reuse is not accounted for in mapper fusion
Date Mon, 08 Oct 2012 01:00:15 GMT


Josh Wills updated CRUNCH-90:

    Attachment: CRUNCH-90-reflect.patch

Attached a patch that fixes the PageRankClassTest. In order to do it, I had to besmirch the
really elegant DeepCopier infrastructure (like, seriously elegant; I hadn't looked at it
that closely before and it was a joy to read) by passing Configuration objects all over the
place so that the Avro deepCopy calls use the correct instance of ReflectData: the Scrunch
stuff has to use a different version than the Java stuff.

> Object reuse is not accounted for in mapper fusion
> --------------------------------------------------
>                 Key: CRUNCH-90
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>             Fix For: 0.4.0
>         Attachments: CRUNCH-90.patch, CRUNCH-90-reflect.patch
> When multiple DoFns are run over the same output (i.e. in the case of mapper fusion),
> the same value object is passed to multiple underlying DoFns. If the state of that value object
> is changed by one DoFn, other DoFns are called with the updated object.
>
> This is a situation that can happen quite easily when the input of a DoFn is simply updated
> and then emitted. In general, this bug will only affect values whose type is the same as the
> underlying serialization type.
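
The failure mode described in the issue can be reproduced without Crunch at all. The sketch below is only an illustration under stated assumptions: `ObjectReuseSketch`, `Counter`, `runFused` and `fns` are hypothetical names, plain `java.util.function.Function` objects stand in for DoFns, and `Counter.copy()` stands in for a deep copy. It is not the Crunch API and not the code in either attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ObjectReuseSketch {

    /** Hypothetical mutable value, standing in for a reused record object. */
    public static final class Counter {
        int value;

        public Counter(int value) {
            this.value = value;
        }

        /** Stand-in for a deep copy of the value. */
        public Counter copy() {
            return new Counter(value);
        }
    }

    /**
     * Applies every function to the same input, the way fused functions
     * share one upstream value. With copyPerFn set, each function gets
     * its own copy instead of the shared instance.
     */
    public static List<Integer> runFused(Counter input,
                                         List<Function<Counter, Integer>> fns,
                                         boolean copyPerFn) {
        List<Integer> emitted = new ArrayList<>();
        for (Function<Counter, Integer> fn : fns) {
            Counter arg = copyPerFn ? input.copy() : input;
            emitted.add(fn.apply(arg));
        }
        return emitted;
    }

    /** Two stand-in functions: one updates its input and emits it, one only reads it. */
    public static List<Function<Counter, Integer>> fns() {
        Function<Counter, Integer> incrementAndEmit = c -> {
            c.value += 1; // mutates the input in place before emitting
            return c.value;
        };
        Function<Counter, Integer> emitAsIs = c -> c.value;
        return Arrays.asList(incrementAndEmit, emitAsIs);
    }

    public static void main(String[] args) {
        // Shared instance: the second function sees the first one's update.
        System.out.println(runFused(new Counter(10), fns(), false)); // [11, 11]
        // Copy per function: the second function sees the original value.
        System.out.println(runFused(new Counter(10), fns(), true));  // [11, 10]
    }
}
```

Copying per consumer trades some allocation for isolation. The sketch copies unconditionally for simplicity, whereas a real implementation would presumably copy only where a value actually fans out to more than one consumer.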

