drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5329) External sort does not support "obscure" data types
Date Thu, 09 Mar 2017 05:06:37 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-5329:
-------------------------------
    Summary: External sort does not support "obscure" data types  (was: External sort does not support "obscure" numeric types)

> External sort does not support "obscure" data types
> ---------------------------------------------------
>
>                 Key: DRILL-5329
>                 URL: https://issues.apache.org/jira/browse/DRILL-5329
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the External Sort, which is used to sort each incoming batch. The sorter was tested with each Drill data type.
> The following types fail:
> * TINYINT
> * UINT1
> * SMALLINT
> * UINT2
> * UINT4
> * UINT8
> * VAR16CHAR
> * DECIMAL28SPARSE
> * DECIMAL38SPARSE
> The types that work include:
> * INT
> * BIGINT
> * FLOAT4
> * FLOAT8
> * DECIMAL9
> * DECIMAL18
> * VARCHAR
> * VARBINARY
> * DATE
> * TIME
> * TIMESTAMP
> * INTERVALYEAR
> Could not find a way to test the following:
> * DECIMAL28DENSE
> * DECIMAL38DENSE
> * LIST
> * GENERIC_OBJECT
> * UNION
> * INTERVAL
> * INTERVALDAY
> Not yet supported in Drill:
> * MONEY
> * FIXEDCHAR
> * FIXED16CHAR
> * FIXEDBINARY
> * NULL
> * TIMETZ
> * TIMESTAMPTZ
> * LATE
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the comparison step, so the sort is not actually performed.
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> In each row, the first number is the row index and the number in parentheses is the row pointed to by the sv2 (selection vector), which the sort should rewrite to put the rows in sort order. The sort was done ASC, NULLS_HIGH, by the key field.
> A strong concern here is that there is no error or other warning to the user that Drill cannot sort this type; Drill just silently declines to perform the operation.
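
The sv2 indirection described above can be sketched roughly as follows. This is a minimal illustration in plain Java, not Drill's generated sort code: the names `Sv2SortSketch`, `sortIndirect` and `isSorted` are hypothetical, the sv2 is modeled as a plain `int[]`, and nulls (and therefore NULLS_HIGH) are not handled. The four keys are the ones shown in the failing output.

```java
import java.util.Arrays;
import java.util.Comparator;

class Sv2SortSketch {

    // Builds an indirection vector: entry i is the index of the row
    // that belongs at sorted position i (ascending by key).
    static int[] sortIndirect(int[] keys) {
        Integer[] order = new Integer[keys.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i; // start from the identity mapping
        }
        // Sort the indexes by the key each one points to; the rows stay put.
        Arrays.sort(order, Comparator.comparingInt(i -> keys[i]));
        int[] sv2 = new int[order.length];
        for (int i = 0; i < sv2.length; i++) {
            sv2[i] = order[i];
        }
        return sv2;
    }

    // Checks that reading the keys through sv2 yields ascending order.
    static boolean isSorted(int[] keys, int[] sv2) {
        for (int i = 1; i < sv2.length; i++) {
            if (keys[sv2[i - 1]] > keys[sv2[i]]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] keys = {11, 14, 17, 0};
        int[] sv2 = sortIndirect(keys);
        // Same layout as the listings above: index(sv2 entry): key
        for (int i = 0; i < sv2.length; i++) {
            System.out.println(i + "(" + sv2[i] + "): " + keys[sv2[i]]);
        }
        // An sv2 left as the identity mapping models the silent failure.
        int[] untouched = {0, 1, 2, 3};
        System.out.println("untouched sv2 sorted? " + isSorted(keys, untouched));
    }
}
```

An sv2 that still holds the identity mapping, as in the failing listing, does not pass `isSorted` for these keys. A check of this kind is one way a test could catch the silent failure.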



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
