drill-issues mailing list archives

From F Méthot (JIRA) <j...@apache.org>
Subject [jira] [Updated] (DRILL-4609) Select true,true,true from ... does not always output true,true,true
Date Fri, 15 Apr 2016 14:44:25 GMT

     [ https://issues.apache.org/jira/browse/DRILL-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

F Méthot updated DRILL-4609:
----------------------------
    Description: 
A simple "select true, true, true from table" does not output true,true,true for every
generated row.

Steps to reproduce:
Generate a simple CSV file:
{noformat}
      for i in {1..1000000}; do echo "Allo"; done > /users/fmethot/test.csv
{noformat}
Open a fresh Drill CLI session.

To make validation easier, switch the output format to CSV:
{code:sql}
      alter session set `store.format`='csv';
{code}
Generate a table like this:

{code:sql}
       create table TEST_OUT as (select true,true,true,true from dfs.`/users/fmethot/test.csv`);
{code}
Check the content of the CSV output written for TEST_OUT.
You will find false values in there!
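
This check can be scripted. A minimal sketch, assuming the CTAS wrote its CSV files into a single directory (where that directory is depends on the workspace in use); the helper name count_false_rows and the path in the usage note are hypothetical and not part of Drill:

```shell
# Hypothetical helper: count the lines that contain the literal "false"
# across all CSV files in a CTAS output directory.
# $1: directory holding the CSV files written for TEST_OUT (location assumed)
count_false_rows() {
  cat "$1"/*.csv | awk '/false/ { n++ } END { print n + 0 }'
}
```

For example, count_false_rows /path/to/TEST_OUT should print 0 when every row is true,true,true,true; any larger number reproduces the problem.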


If you generate another table the same way in the same session, the values will most likely
be fine (all true). We can only reproduce this on the first CTAS run of a session.

We came to test this select pattern after we realized that our custom boolean UDFs (as well as
those provided in Drill, like "ilike") were not producing consistent, deterministic results (the
same input was generating seemingly random boolean output). We hope that fixing this ticket
will also fix our issue with boolean UDFs.



> Select true,true,true from ... does not always output true,true,true
> --------------------------------------------------------------------
>
>                 Key: DRILL-4609
>                 URL: https://issues.apache.org/jira/browse/DRILL-4609
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - CLI, Query Planning & Optimization, Storage - Writer
>    Affects Versions: 1.5.0, 1.6.0
>         Environment: Linux Redhat
> tested in cluster (hdfs) and embedded mode
>            Reporter: F Méthot



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
