drill-issues mailing list archives

From "Ramana Inukonda Nagaraj (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-1738) Parquet complex reader case sensitive
Date Mon, 17 Nov 2014 22:35:35 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramana Inukonda Nagaraj updated DRILL-1738:
-------------------------------------------
    Description: 
On TPCDS parquet data while using the default reader drill is case sensitive for column names.

0: jdbc:drill:> select c_customer_sk from `0_0_0.parquet` limit 1;
+---------------+
| c_customer_sk |
+---------------+
| 1             |
+---------------+
1 row selected (0.15 seconds)
0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
+---------------+
| c_customer_sk |
+---------------+
| 1             |
+---------------+

On using the new reader though
0: jdbc:drill:> alter session set `store.parquet.use_new_reader`=true;


0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
| c_customer_sk | c_customer_id | c_current_cdemo_sk | c_current_hdemo_sk | c_current_addr_sk | c_first_shipto_date_sk | c_first_sales_date_sk | c_salutation | c_first_name | c_last_name | c_preferred_cust_flag | c_birth_day | c_birth_month | c_birth_year | c_birth_country | c_login    | c_email_address | c_last_review_date |
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
| 1             | AAAAAAAABAAAAAAA | 980124             | 7135               | 32946             | 2452238                | 2452208               | Mr.          | Javier       | Lewis       | Y                     | 9           | 12            | 1936         | CHILE           | null       | Javier.Lewis@VFAxlnZEvOx.org | 2452508            |
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
1 row selected (0.368 seconds)

Will file a separate bug for the issue that when the new parquet reader cannot find a column
it does a * query instead and returns all columns. 

  was:
On TPCDS parquet data while using the default reader drill is case insensitive for column
names.

0: jdbc:drill:> select c_customer_sk from `0_0_0.parquet` limit 1;
+---------------+
| c_customer_sk |
+---------------+
| 1             |
+---------------+
1 row selected (0.15 seconds)
0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
+---------------+
| c_customer_sk |
+---------------+
| 1             |
+---------------+

On using the new reader though
0: jdbc:drill:> alter session set `store.parquet.use_new_reader`=true;


0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
| c_customer_sk | c_customer_id | c_current_cdemo_sk | c_current_hdemo_sk | c_current_addr_sk | c_first_shipto_date_sk | c_first_sales_date_sk | c_salutation | c_first_name | c_last_name | c_preferred_cust_flag | c_birth_day | c_birth_month | c_birth_year | c_birth_country | c_login    | c_email_address | c_last_review_date |
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
| 1             | AAAAAAAABAAAAAAA | 980124             | 7135               | 32946             | 2452238                | 2452208               | Mr.          | Javier       | Lewis       | Y                     | 9           | 12            | 1936         | CHILE           | null       | Javier.Lewis@VFAxlnZEvOx.org | 2452508            |
+---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
1 row selected (0.368 seconds)

Will file a separate bug for the issue that when the new parquet reader cannot find a column
it does a * query instead and returns all columns. 


> Parquet complex reader case sensitive
> -------------------------------------
>
>                 Key: DRILL-1738
>                 URL: https://issues.apache.org/jira/browse/DRILL-1738
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>            Reporter: Ramana Inukonda Nagaraj
>            Priority: Blocker
>
> On TPCDS parquet data while using the default reader drill is case sensitive for column names.
> 0: jdbc:drill:> select c_customer_sk from `0_0_0.parquet` limit 1;
> +---------------+
> | c_customer_sk |
> +---------------+
> | 1             |
> +---------------+
> 1 row selected (0.15 seconds)
> 0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
> +---------------+
> | c_customer_sk |
> +---------------+
> | 1             |
> +---------------+
> On using the new reader though
> 0: jdbc:drill:> alter session set `store.parquet.use_new_reader`=true;
> 0: jdbc:drill:> select c_customer_SK from `0_0_0.parquet` limit 1;
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> | c_customer_sk | c_customer_id | c_current_cdemo_sk | c_current_hdemo_sk | c_current_addr_sk | c_first_shipto_date_sk | c_first_sales_date_sk | c_salutation | c_first_name | c_last_name | c_preferred_cust_flag | c_birth_day | c_birth_month | c_birth_year | c_birth_country | c_login    | c_email_address | c_last_review_date |
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> | 1             | AAAAAAAABAAAAAAA | 980124             | 7135               | 32946             | 2452238                | 2452208               | Mr.          | Javier       | Lewis       | Y                     | 9           | 12            | 1936         | CHILE           | null       | Javier.Lewis@VFAxlnZEvOx.org | 2452508            |
> +---------------+---------------+--------------------+--------------------+-------------------+------------------------+-----------------------+--------------+--------------+-------------+-----------------------+-------------+---------------+--------------+-----------------+------------+-----------------+--------------------+
> 1 row selected (0.368 seconds)
> Will file a separate bug for the issue that when the new parquet reader cannot find a column it does a * query instead and returns all columns.
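
For illustration only, the two behaviors in the report (matching column names regardless of case, and not silently falling back to all columns when a name is missing) can be sketched as below. This is not Drill's code: the function `resolve_column` and the plain list standing in for the Parquet file schema are hypothetical names introduced for this sketch.

```python
def resolve_column(requested, schema_columns):
    """Return the schema column that matches `requested`, ignoring case.

    Hypothetical helper, not part of Drill. `schema_columns` stands in for
    the column names stored in the Parquet file footer.
    """
    # Index the schema by case-folded name; keep the first spelling seen.
    by_folded = {}
    for name in schema_columns:
        by_folded.setdefault(name.casefold(), name)

    key = requested.casefold()
    if key not in by_folded:
        # Fail loudly instead of falling back to selecting every column.
        raise KeyError(f"column not found: {requested}")
    return by_folded[key]


schema = ["c_customer_sk", "c_customer_id", "c_first_name"]
print(resolve_column("c_customer_SK", schema))  # prints c_customer_sk
```

With a lookup of this shape, `c_customer_SK` resolves to the stored `c_customer_sk` column, and an unknown name raises an error rather than producing the all-columns result shown in the transcript.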



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
