drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5329) External sort does not support "obscure" numeric types
Date Wed, 08 Mar 2017 01:52:38 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-5329:
-------------------------------
    Description: 
A unit test was created to exercise the "Sorter" mechanism within the External Sort, which
is used to sort each incoming batch. The sorter was tested with each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4
* UInt8
* Var16Char

The types that work include:

* Int
* BigInt
* Float4
* Float8
* VarChar

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, so
the sort is not actually performed.

Sorting a set of 20 pseudo-random rows with one of the failing types produces the following
output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

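In each line, the first number is the output position and the number in parentheses is the
row that the sv2 (selection vector) entry at that position points to; the sorter is expected
to rewrite the sv2 to establish sort order. The sort was done ASC, NULLS_HIGH, on the key field.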
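
The sv2 indirection described above can be illustrated with a small standalone sketch. It is
not Drill's generated code: the class name Sv2SortSketch, the method sortIndirect, and the
plain int array standing in for the sv2 are all assumptions made for illustration. It uses
the four key values shown in the failing output and compares a real comparison against one
that always reports "equal", which is one way to mimic a skipped comparison step.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; this is not Drill's generated sorter code.
class Sv2SortSketch {

    // Sorts an indirection vector (standing in for Drill's sv2) so that
    // entry i names the row that belongs at output position i.
    // The key array itself is never reordered.
    static int[] sortIndirect(int[] keys, Comparator<Integer> keyOrder) {
        Integer[] sv2 = new Integer[keys.length];
        for (int i = 0; i < sv2.length; i++) {
            sv2[i] = i; // identity mapping: position i points at row i
        }
        // Arrays.sort on objects is stable, so "all equal" leaves the order unchanged.
        Arrays.sort(sv2, (a, b) -> keyOrder.compare(keys[a], keys[b]));

        int[] result = new int[sv2.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = sv2[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] keys = {11, 14, 17, 0}; // the four key values shown in the failing output

        // A working comparison reorders the indirection vector.
        System.out.println(Arrays.toString(sortIndirect(keys, Integer::compare))); // [3, 0, 1, 2]

        // A comparison that always says "equal" mimics the skipped comparison step.
        System.out.println(Arrays.toString(sortIndirect(keys, (a, b) -> 0)));      // [0, 1, 2, 3]
    }
}
```

With a working comparison the indirection vector starts with row 3 (key 0), matching the
first line of the Int output; with the no-op comparison it stays in identity order, matching
the failing output.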

A strong concern here is that there is no error or other warning to the user that Drill cannot
sort this type; Drill just silently declines to perform the operation.
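
One way to address the concern above would be to reject unsupported key types before the sort
runs. The following is a minimal sketch under stated assumptions, not Drill's actual fix: the
KeyType enum, the SORTABLE set, and checkSortable are hypothetical names, and the type lists
are copied from this report.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Drill.
class SortTypeGuardSketch {

    // Type names taken from the failing and working lists in this report.
    enum KeyType {
        TINYINT, UINT1, SMALLINT, UINT2, UINT4, UINT8, VAR16CHAR,
        INT, BIGINT, FLOAT4, FLOAT8, VARCHAR
    }

    // Types the report lists as sorting correctly.
    static final Set<KeyType> SORTABLE = EnumSet.of(
        KeyType.INT, KeyType.BIGINT, KeyType.FLOAT4, KeyType.FLOAT8, KeyType.VARCHAR);

    // Fails loudly instead of letting the sort silently do nothing.
    static void checkSortable(KeyType type) {
        if (!SORTABLE.contains(type)) {
            throw new UnsupportedOperationException(
                "Sort is not supported for key type " + type);
        }
    }

    public static void main(String[] args) {
        checkSortable(KeyType.INT); // passes without error
        try {
            checkSortable(KeyType.UINT2);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage()); // Sort is not supported for key type UINT2
        }
    }
}
```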

  was:
A unit test was created to exercise the "Sorter" mechanism within the External Sort, which
is used to sort each incoming batch. The sorter was tested with each Drill data type.

The following types fail:

* TinyInt
* UInt1
* SmallInt
* UInt2
* UInt4

The types that work include:

* Int

The failure manifests in one of two ways:

* If dynamic UDFs are enabled, the query crashes with an NPE. (See DRILL-5331.)
* If dynamic UDFs are disabled, the generated code silently skips the comparison step, resulting
in the sort not actually being done:

Sorting a set of 20 pseudo-random rows produces the following output:

{code}
#, row #, key, value
0(0): 11, "0"
1(1): 14, "1"
2(2): 17, "2"
3(3): 0, "3"
{code}

By contrast, the (working) Int type produces the correct results:

{code}
#, row #, key, value
0(3): 0, "3"
1(10): 1, "10"
2(17): 2, "17"
3(4): 3, "4"
{code}

The first number is the row index, the second is the row pointed to by the sv2 (which should
be written to create sort order). Sort was done ASC, NULLS_HIGH, by the key field.

A strong concern here is that there is no error or other warning to the user that Drill cannot
sort this type; Drill just silently declines to perform the operation.


> External sort does not support "obscure" numeric types
> ------------------------------------------------------
>
>                 Key: DRILL-5329
>                 URL: https://issues.apache.org/jira/browse/DRILL-5329
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
