drill-issues mailing list archives

From "Nitin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5109) CTAS queries for window functions creating files without column names
Date Tue, 06 Dec 2016 06:06:58 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nitin updated DRILL-5109:
-------------------------
    Description: 
The following CTAS query creates a table whose columns have no proper names:
0: jdbc:drill:zk=local> create table dfs.tmp.`/tmp/t` as
select cast(employee_id as double) employee_id,
       cast(department_id as double) department_id,
       cast(salary as double) salary,
       DENSE_RANK() over (partition by cast(department_id as double)
                          order by cast(salary as double) asc nulls first) dummy_DENSE_RANK
from cp.`employee.json`;

0: jdbc:drill:zk=local> select * from dfs.tmp.`/tmp/t` limit 5;
+------+------+----------+-----+
|  $0  |  $1  |    $2    | $3  |
+------+------+----------+-----+
| 1.0  | 1.0  | 80000.0  | 4   |
| 2.0  | 1.0  | 40000.0  | 3   |
| 4.0  | 1.0  | 40000.0  | 3   |
| 5.0  | 1.0  | 35000.0  | 2   |
| 6.0  | 2.0  | 25000.0  | 4   |
+------+------+----------+-----+

The table should have had the proper column names. Even the Parquet schema shows the placeholder names:
bash-3.2$ java -jar parquet-tools-1.6.0rc4.jar schema /tmp/tmp/t/0_0_0.parquet 
message root {
  optional double $0;
  optional double $1;
  optional double $2;
  required int64 $3;
}
However, when an ORDER BY clause is added to the query, the column names are written
correctly, so this looks like an issue with the storage writer. The problem occurs with
every file format chosen as the CTAS output.
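
To check whether a CTAS output file is affected, the schema printed by parquet-tools can be scanned for positional placeholder names such as $0. The following is only a minimal sketch: the function name placeholder_columns is hypothetical, and it assumes the schema output has the one-field-per-line layout shown above.

```shell
# Hypothetical helper (not part of Drill or parquet-tools).
# Reads `parquet-tools schema` output on stdin and prints every field
# whose name is a positional placeholder ($0, $1, ...).
# Exit status is 0 if at least one such field is found, non-zero otherwise.
placeholder_columns() {
  grep -E '^[[:space:]]*(optional|required|repeated)[[:space:]]+[A-Za-z0-9_]+[[:space:]]+\$[0-9]+;'
}
```

It could be used with the command from this report, for example:
java -jar parquet-tools-1.6.0rc4.jar schema /tmp/tmp/t/0_0_0.parquet | placeholder_columns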



> CTAS queries for window functions creating files without column names
> ---------------------------------------------------------------------
>
>                 Key: DRILL-5109
>                 URL: https://issues.apache.org/jira/browse/DRILL-5109
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill, Storage - Writer
>    Affects Versions: 1.8.0, 1.9.0
>            Reporter: Nitin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
