impala-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aleksei Maželis <olm...@gmail.com>
Subject Re: Querying column of double data type from (Kudu-based) table: artifact digits
Date Tue, 12 Sep 2017 12:06:03 GMT
Ok, in the absence of Decimal support in Kudu ATM, I guess casting is the
only way. Thanks for elaborating!

Regards,
Aleksei

On Mon, Sep 11, 2017 at 7:21 PM, Alexander Behm <alex.behm@cloudera.com>
wrote:

> Like Jim said, this is purely a "display" issue. If you want a precise
> representation you can cast to a decimal:
>
> select cast(col as decimal(4, 2)) from test
>
> On Mon, Sep 11, 2017 at 8:53 AM, Jim Apple <jbapple@cloudera.com> wrote:
>
>> 68.01 is not exactly representable in IEEE double-precision floating
>> point. In Python:
>>
>> >>> 68.01 * 100 - 6801
>> 9.094947017729282e-13
>>
>> One workaround is to use integers and store 6801 instead of 68.01. In
>> Impala, in non-Kudu tables, you can use DECIMAL.
>>
>>
>> On Mon, Sep 11, 2017 at 12:37 AM, Aleksei Maželis <olmazh@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> While querying, via Impala, a double-type data column from Kudu table, I
>>> have noticed that the original value is sometimes miss-represented with an
>>> artifact digit. Say, whereas "68.01" is stored in a column, querying it
>>> returns "68.01000000000001". What would be the proper way of handling (or
>>> going around) that?
>>>
>>> This can be reproduced e.g. as:
>>>
>>> > create TABLE test
>>> (
>>>   col1 BIGINT PRIMARY KEY,
>>>   col2 STRING,
>>>   col3 double
>>> ) PARTITION BY HASH(col1) PARTITIONS 2
>>> STORED AS KUDU TBLPROPERTIES ('kudu.master_addresses'='kudu-docker:7051',
>>> 'kudu.num_tablet_replicas' = '1', 'kudu.table_name'='test')
>>>
>>> > describe test;
>>> Query: describe test
>>> +------+--------+---------+-------------+----------+--------
>>> -------+---------------+---------------------+------------+
>>> | name | type   | comment | primary_key | nullable | default_value |
>>> encoding      | compression         | block_size |
>>> +------+--------+---------+-------------+----------+--------
>>> -------+---------------+---------------------+------------+
>>> | col1 | bigint |         | true        | false    |               |
>>> AUTO_ENCODING | DEFAULT_COMPRESSION | 0          |
>>> | col2 | string |         | false       | true     |               |
>>> AUTO_ENCODING | DEFAULT_COMPRESSION | 0          |
>>> | col3 | double |         | false       | true     |               |
>>> AUTO_ENCODING | DEFAULT_COMPRESSION | 0          |
>>> +------+--------+---------+-------------+----------+--------
>>> -------+---------------+---------------------+------------+
>>>
>>> > insert into test values (1, "test", 68.01);
>>>
>>> > select * from test;
>>> Query: select * from test
>>> +------+------+-------------------+
>>> | col1 | col2 | col3              |
>>> +------+------+-------------------+
>>> | 1    | test | 68.01000000000001 |
>>> +------+------+-------------------+
>>>
>>> Finally, checking that the digit at the end is an artifact:
>>>
>>> > select (col3-68.01)*10e20 from test;
>>> Query: select (col3-68.01)*10e20 from test
>>> +--------------------------+
>>> | (col3 - 68.01) * 1.0e+21 |
>>> +--------------------------+
>>> | 0                        |
>>> +--------------------------+
>>>
>>> Regards,
>>> Aleksei Maželis
>>>
>>
>>
>

Mime
View raw message