hive-user mailing list archives

From Ning Zhang
Subject Re: How are nulls represented in data?
Date Mon, 09 Aug 2010 18:46:56 GMT
How NULL is serialized/deserialized is determined by the specific SerDe. NULL is serialized
as \N by LazySimpleSerDe (the default SerDe for text). RCFile (ColumnarSerDe) uses the same
default parameters as LazySimpleSerDe.

Unless I missed something, NULL serialization/deserialization is type independent (at least
in LazySimpleSerDe).
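
To illustrate the point above, here is a minimal Python sketch of how a text row could be
read when \N is the null marker. It is not Hive's actual code: the names parse_field and
parse_row are hypothetical, only the "string" and "int" types are modeled, and the field
delimiter is assumed to be Ctrl-A (Hive's default for text tables).

```python
# Illustrative sketch only; not LazySimpleSerDe's real implementation.
NULL_MARKER = "\\N"   # two characters: a backslash followed by N
FIELD_DELIM = "\x01"  # Ctrl-A, assumed here as the field delimiter


def parse_field(raw, col_type):
    """Return None for the null marker, otherwise the typed value (hypothetical helper)."""
    # The null check happens before any type handling, so it is type independent.
    if raw == NULL_MARKER:
        return None
    if col_type == "int":
        return int(raw)
    return raw  # treat everything else as a string in this sketch


def parse_row(line, col_types):
    """Split one text line into fields and parse each one (hypothetical helper)."""
    fields = line.split(FIELD_DELIM)
    return [parse_field(raw, t) for raw, t in zip(fields, col_types)]


# The three values tried in the question, plus the actual marker.
for value in ["", "NULL", "null", "\\N"]:
    print(repr(value), "is null:", parse_field(value, "string") is None)
```

In this sketch only the last value is reported as null. The empty string and the literal
strings NULL and null come back as ordinary strings, which matches the "is null" results
described in the question below.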

On Aug 9, 2010, at 9:42 AM, Pradeep Kamath wrote:

What value does Hive expect in the data for a column to be treated as null? I tried some
permutations on a text-based table but couldn’t figure out what the correct representation
is. I tried the empty string, the string NULL and the string null for a string column, and
in all three cases the “is null” operator returned false.

A couple of related questions:
 - Does the representation of null depend on the type of the column? Is it different for
string vs. non-string columns?
 - Is the representation of null different for different storage formats (text vs. RCFile
vs. SequenceFile)? I am particularly interested in text and RCFile.

Thanks in advance,

