crunch-user mailing list archives

From: Josh Wills
Subject: Re: Joins and null values
Date: Wed, 18 Feb 2015 21:56:21 GMT

Hey Bryan,

I like the idea of throwing exceptions when there are null values in one of
the collections in a join. Not sure if there are any other implications of
that I should think through first.
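
A minimal sketch of the fail-fast idea, written in plain Java over in-memory
lists rather than against the Crunch API. The class NullCheckedJoin, its Entry
type, and the method names are all hypothetical and exist only for
illustration; this is not how Crunch implements joins.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullCheckedJoin {

    /** Hypothetical key/value pair standing in for one entry of a keyed collection. */
    public static final class Entry<K, V> {
        public final K key;
        public final V value;

        public Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Throws if any entry on one side of the join carries a null value. */
    public static <K, V> void requireNonNullValues(List<Entry<K, V>> side, String name) {
        for (Entry<K, V> e : side) {
            if (e.value == null) {
                throw new IllegalArgumentException(
                    "Null value for key " + e.key + " on the " + name + " side of the join");
            }
        }
    }

    /** Inner join of two keyed lists; each result is formatted as "key:left,right". */
    public static <K, U, V> List<String> innerJoin(
            List<Entry<K, U>> left, List<Entry<K, V>> right) {
        // Validate both inputs up front instead of silently losing records.
        requireNonNullValues(left, "left");
        requireNonNullValues(right, "right");

        // Index the right side by key.
        Map<K, List<V>> index = new HashMap<>();
        for (Entry<K, V> e : right) {
            index.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e.value);
        }

        // Emit one result per matching (left, right) pair.
        List<String> out = new ArrayList<>();
        for (Entry<K, U> e : left) {
            List<V> matches = index.getOrDefault(e.key, Collections.<V>emptyList());
            for (V v : matches) {
                out.add(e.key + ":" + e.value + "," + v);
            }
        }
        return out;
    }
}
```

The check runs before any joining happens, so a caller who passes null values
gets an immediate error naming the offending key instead of an unexpectedly
empty result.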

On the convenience methods for PCollection joins, what do you have in mind?


On Wed, Feb 18, 2015 at 12:35 PM, Bryan Baugher wrote:

> Hi everyone,
> The other day I ran into the issue mentioned here[1] about joining data
> with null values. This took a while to figure out until I broke down and
> went to look at the docs to see if I was doing something obviously wrong. I
> used null values because I basically want to join two PCollections.
> Can Crunch either throw an exception or log errors if I do something like
> this? Similarly, would it be possible to get convenience methods for doing
> joins on PCollections?
> [1] -

Director of Data Science
Cloudera
Twitter: @josh_wills