crunch-user mailing list archives

From Josh Wills
Subject Re: Use of getDettachedValue and Unit Tests
Date Fri, 22 May 2015 16:03:01 GMT
Hey Ron,

It's a little tricky, but yeah, I think we could add a mode that would
simulate object re-use (at the very least during the reduce phase, where it
tends to cause the most problems). Could you file a JIRA for it?
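
To make the failure mode concrete, here is a minimal plain-Java sketch. It
uses no Crunch classes: Record, reusing, collectWithoutCopy and
collectWithCopy are hypothetical names made up for illustration. It assumes
the framework hands back the same mutable instance on every iteration, as
described above for the reduce phase, and copy() stands in for what a
getDetachedValue call would do.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ObjectReuseDemo {

    /** Hypothetical mutable value, standing in for a reusable record type. */
    static final class Record {
        int value;

        Record(int value) {
            this.value = value;
        }

        /** Stand-in for detaching: returns an independent copy. */
        Record copy() {
            return new Record(value);
        }
    }

    /** Iterable that returns the SAME Record instance from every next() call. */
    static Iterable<Record> reusing(final int... values) {
        return () -> new Iterator<Record>() {
            private final Record shared = new Record(0);
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < values.length;
            }

            @Override
            public Record next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                shared.value = values[index++]; // overwrite in place
                return shared;
            }
        };
    }

    /** Buggy pattern: keeps references to the reused instance. */
    static List<Integer> collectWithoutCopy(Iterable<Record> records) {
        List<Record> held = new ArrayList<>();
        for (Record r : records) {
            held.add(r);
        }
        return valuesOf(held);
    }

    /** Safe pattern: copies each element before holding on to it. */
    static List<Integer> collectWithCopy(Iterable<Record> records) {
        List<Record> held = new ArrayList<>();
        for (Record r : records) {
            held.add(r.copy());
        }
        return valuesOf(held);
    }

    private static List<Integer> valuesOf(List<Record> records) {
        List<Integer> out = new ArrayList<>();
        for (Record r : records) {
            out.add(r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectWithoutCopy(reusing(1, 2, 3))); // [3, 3, 3]
        System.out.println(collectWithCopy(reusing(1, 2, 3)));    // [1, 2, 3]
    }
}
```

A test that feeds a plain in-memory list to the buggy pattern would pass,
since every element is already a distinct object. Wrapping the test input in
something like reusing(...) is one way a test mode could surface the problem.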


On Fri, May 22, 2015 at 7:18 AM, Ron Hashimshony wrote:

> Hi,
> We love Crunch for its great unit-testing capabilities, which give us
> good confidence when running the pipeline on real data.
> However, we did find one place where the unit tests failed us: when we
> needed to add a *getDetachedValue* call on an *Iterable*, the unit tests
> behaved differently and did not reuse the same objects, as happens in the
> production pipeline on the full data.
> Is there any way to incorporate this validation in unit tests?
> We have tests that run a set of *DoFn*s, setting the input *MemCollection*s
> and checking the output *MemCollection*s using *MemPipeline*, and other
> tests for a single *DoFn*'s *process* using *InMemoryEmitter*.
> Thanks,
> Ron Hashimshony

Director of Data Science
Cloudera
Twitter: @josh_wills
