orc-dev mailing list archives

From "Owen O'Malley" <owen.omal...@gmail.com>
Subject Re: getting read past EOF for Double column
Date Mon, 18 Dec 2017 18:19:13 GMT
Actually, the metadata is reasonable; there is just an array above that
column that doesn't have any elements.

So the tree down to column 36 looks like:

column 0: (struct) count: 42692
  column 1: data (struct) count: 42692
    column 21: listingAssociated (array) count: 42692
      column 22: (struct) count: 0
        column 32: sla (array) count: 0
          column 33: (struct) count: 0
            column 34: shippingTier (struct) count: 0
              column 35: charge (struct) count: 0
                column 36: amount (double) count: 0

Since there are 0 instances of column 22, there aren't any instances of the
columns below it either. What should happen is that the reader doesn't
descend to read the data, because there are no values.
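
As a rough illustration of that behavior (a toy sketch in Python, not ORC's
actual reader code), the array reader below sums the element counts for a
batch and skips its child reader when the total is zero. The class names,
the skip_when_empty flag, and the assumption that the leaf reader touches
its stream on every call are all hypothetical.

```python
import struct


class DoubleLeaf:
    """Toy leaf reader over a buffer of little-endian doubles.

    Assumption for illustration: it checks its stream on every call,
    so calling it on an empty stream fails even when n == 0.
    """

    def __init__(self, data: bytes):
        self.data = data
        self.offset = 0

    def read(self, n: int) -> list:
        if self.offset >= len(self.data):
            raise EOFError("read past EOF")
        end = self.offset + 8 * n
        if end > len(self.data):
            raise EOFError("read past EOF")
        values = list(struct.unpack_from(f"<{n}d", self.data, self.offset))
        self.offset = end
        return values


class ArrayReader:
    """Toy array reader: `lengths` holds one element count per row."""

    def __init__(self, lengths, child, skip_when_empty=True):
        self.lengths = lengths
        self.child = child
        self.skip_when_empty = skip_when_empty
        self.row = 0

    def read(self, n_rows: int) -> list:
        lengths = self.lengths[self.row:self.row + n_rows]
        self.row += n_rows
        total = sum(lengths)
        # No child values in this batch: do not call down to the child.
        if total == 0 and self.skip_when_empty:
            return [[] for _ in lengths]
        flat = self.child.read(total)
        rows, start = [], 0
        for length in lengths:
            rows.append(flat[start:start + length])
            start += length
        return rows
```

With every array empty and an empty child stream, the guarded reader
returns empty rows, while the same reader built with skip_when_empty=False
raises EOFError from the leaf.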

Which version of ORC are you using to read with?

Thanks,
   Owen


On Mon, Dec 18, 2017 at 5:38 AM, Piyush Mukati <piyush.mukati@gmail.com>
wrote:

> Hi,
> I have written an ORC file with a map-reduce job, but while reading the
> file I am getting "read past EOF for a double column".
> After debugging, I found that we are trying to read an empty stream. I
> suspect the file metadata is corrupt.
>
> The column metadata says:
> *Column 36: count: 0 hasNull: false sum: 0.0*
> I don't understand how hasNull can be false while count is zero, when
> other columns have non-zero counts.
>
> I am out of ideas for debugging. Please point me in the direction I
> should debug further.
> Please find attached the metadata and the stack trace.
> Thanks.
>
