crunch-user mailing list archives

From: Micah Whitacre
Subject: Re: crunch 0.8.2+6-cdh4.4.0
Date: Wed, 29 Jan 2014 22:30:48 GMT
That actually seems likely, as the Buffered and Direct encoders differ in
how they persist fixed bytes.

Stephen, can you include the full stack trace of the NPE? That will help
verify whether the differences in the encoders are at fault.

On Wed, Jan 29, 2014 at 4:14 PM, Josh Wills <> wrote:

> Maybe the binary encoder change?
> On Jan 29, 2014 3:19 PM, "Durfey,Stephen" <>
> wrote:
>> Sorry for the initial confusion. The exceptions I am seeing look like they
>> were caused by another exception much earlier in my console log (which I
>> originally missed). I think the true exception is an Avro exception. It is
>> an Avro PType, and the original NPE comes from GenericData#getField
>> while the GenericDatumWriter is serializing. The exception is for a null
>> value in the org.apache.avro.mapred.Pair object (I believe this Pair is
>> created in PairMapFn?) where a specific Avro type is expected. I'm having a
>> difficult time figuring out what changed between versions. Without any
>> change to my code, simply changing Crunch versions causes these
>> exceptions to be thrown.
>>  - Stephen
>> From: Josh Wills
>> Date: Tuesday, January 28, 2014 at 4:08 PM
>> Subject: Re: crunch 0.8.2+6-cdh4.4.0
>> Hey Stephen,
>> I'm slightly confused here; question inline.
>> On Tue, Jan 28, 2014 at 12:59 PM, Durfey,Stephen wrote:
>>> This question is specifically about this version maintained by
>>> Cloudera.
>>> I was looking to update our Crunch version from 0.8.0-cdh4.3.0
>>> to 0.8.2+6-cdh4.4.0. In the process some of our tests started failing with
>>> NullPointerExceptions. I've discovered why these exceptions are happening,
>>> but I'm having trouble tracking down where the behavior changed.
>>> The exceptions occur when we emit a Pair<POJO, null> that uses an Avro
>>> PType. Previously this worked just fine: by the time the CrunchOutputs
>>> started writing to a sequence file, the value would be an instance of
>>> NullWritable, and the writer would successfully pull off the output type
>>> for serialization (in SequenceFile.BlockCompressWriter#append(k, v)). After
>>> the version change, the value that reaches the sequence file writer is
>>> 'null' rather than NullWritable.
>>  It's an AvroType that's getting written to a Sequence File? Is that
>> right?
>>>  Any thoughts?
>>>  - Stephen

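Stephen's original report says the sequence file writer "pulls off the output type" from the value, which works for a NullWritable instance but not for a bare null. The following is a minimal sketch of that failure mode using only the Java standard library. It does not use the Hadoop, Avro, or Crunch APIs: `NullValueSketch`, `NullMarker`, `TypedWriter`, and `tryAppend` are hypothetical names introduced for illustration, and the class check is an assumption about the general shape of such a writer, not a copy of SequenceFile.BlockCompressWriter#append.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical stand-in for a typed sequence-file writer.
class NullValueSketch {

    // Hypothetical singleton playing the role that NullWritable plays in Hadoop:
    // a real object that represents "no value".
    static final class NullMarker {
        private static final NullMarker INSTANCE = new NullMarker();
        private NullMarker() {}
        static NullMarker get() { return INSTANCE; }
    }

    // Simplified writer that inspects the runtime class of each key and value
    // before storing the record (assumed behavior, for illustration).
    static final class TypedWriter {
        private final Class<?> keyClass;
        private final Class<?> valueClass;
        private final List<Object[]> records = new ArrayList<>();

        TypedWriter(Class<?> keyClass, Class<?> valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }

        void append(Object key, Object value) {
            // getClass() on a null reference throws NullPointerException,
            // so a bare null never reaches the type comparison.
            if (key.getClass() != keyClass) {
                throw new IllegalArgumentException("wrong key class");
            }
            if (value.getClass() != valueClass) {
                throw new IllegalArgumentException("wrong value class");
            }
            records.add(new Object[] {key, value});
        }
    }

    // Attempts one append and reports whether it succeeded or hit an NPE.
    static String tryAppend(Object key, Object value) {
        TypedWriter writer = new TypedWriter(String.class, NullMarker.class);
        try {
            writer.append(key, value);
            return "ok";
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAppend("record", NullMarker.get())); // prints "ok"
        System.out.println(tryAppend("record", null));             // prints "NPE"
    }
}
```

If the upgrade changed where the null value is converted to its marker object, a check like this would explain why the same user code fails only on the newer version. The full stack trace requested above would confirm or rule that out.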