drill-dev mailing list archives

From Jacques Nadeau <jacq...@apache.org>
Subject Re: [DISCUSS] Change default json read behavior for numbers
Date Mon, 26 Jan 2015 22:17:54 GMT
Writing an integer zero to a float column should be allowed.  Basically, if we
found a float previously and then run across a zero, that should be
accepted.  This doesn't fix the situation where the first value is zero,
but it definitely fixes many situations.  I'm up for a second option that
treats all numbers as doubles, but I don't support making it the default:
once we finish embedded types, this would be our desired behavior.
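
To make the rule concrete, here is a minimal Python sketch of a column whose type is fixed by the first value written to it. It is a toy model, not Drill's reader: `NumericColumn` and `ColumnTypeError` are hypothetical names, and the sketch assumes the rule widens any integer (not only zero) that arrives at a float column.

```python
class ColumnTypeError(Exception):
    """Raised when a value does not fit the column's established type."""


class NumericColumn:
    """Toy column (hypothetical) whose type is set by the first value written."""

    def __init__(self):
        self.kind = None   # None until the first write, then int or float
        self.values = []

    def write(self, value):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ColumnTypeError(f"not a number: {value!r}")
        if self.kind is None:
            self.kind = type(value)
        if self.kind is float:
            # The rule discussed above: an integer reaching a float column
            # is widened to a float instead of failing.
            self.values.append(float(value))
        elif isinstance(value, int):
            self.values.append(value)
        else:
            # The column started as int and a float arrived later. The rule
            # does not cover this case, so it is still an error.
            raise ColumnTypeError(f"float {value!r} written to an int column")


floats_first = NumericColumn()
for v in (1.5, 0, 2):
    floats_first.write(v)      # the integers are stored as 0.0 and 2.0

zero_first = NumericColumn()
zero_first.write(0)            # a later zero_first.write(1.5) would raise
```

The second column shows the remaining gap: when the first value is an integer, a later float still fails.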

On Mon, Jan 26, 2015 at 1:36 PM, Jason Altekruse <altekrusejason@gmail.com> wrote:

> Hello Drillers,
> I am currently working on improving the error reporting in the JSON reader
> to help users with files that Drill cannot read using the default
> configuration today.
> As a part of this change I think it may be useful to change the default
> behavior for reading numbers in JSON documents. Currently we fail on a
> simple case: reading numbers with decimal points and then hitting a value
> of 0 (or any number without a decimal point) in a later record. The reason
> for the current behavior is to allow better precision in the case of files
> with only integers. The issue, however, is that we currently fail on the
> basic case of a mix of integers and decimal numbers. See [1] for more
> discussion on this.
> I propose that we switch the JSON reader to read all numbers as doubles by
> default. The reader already contains a workaround, all_text_mode, that
> allows lossless casting to integers and decimal types at the cost of some
> extra computational overhead; see [2] for more info.
> Please share your thoughts on this change.
> [1] https://issues.apache.org/jira/browse/DRILL-1460
> [2] https://issues.apache.org/jira/browse/DRILL-2071
> -Jason
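
The difference between the reading modes discussed in the quoted message can be illustrated with Python's standard `json` module. This is only an analogy, not Drill's JSON reader: the sample records are made up, and `parse_int` / `parse_float` are hooks of Python's `json.loads`, standing in for the "all numbers as doubles" default and for an all-text mode.

```python
import json

# Two hypothetical records: a decimal value first, then a bare zero.
records = ['{"a": 1.5}', '{"a": 0}']

# Default parsing keeps the int/float distinction, so the column is mixed.
default_types = [type(json.loads(r)["a"]) for r in records]

# "All numbers as doubles": route integer literals through float().
as_doubles = [json.loads(r, parse_int=float)["a"] for r in records]

# Analogue of an all-text mode: keep each numeric literal as its source
# text, to be cast later without loss.
as_text = [json.loads(r, parse_int=str, parse_float=str)["a"] for r in records]

print(default_types)
print(as_doubles)
print(as_text)
```

Reading everything as doubles gives a uniform column at the cost of integer precision for very large values, while the text variant preserves the literals and defers the cast.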
