drill-issues mailing list archives

From "Hari Sekhon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-1712) Quoted CSV parsing
Date Fri, 14 Nov 2014 11:40:34 GMT
Hari Sekhon created DRILL-1712:
----------------------------------

             Summary: Quoted CSV parsing
                 Key: DRILL-1712
                 URL: https://issues.apache.org/jira/browse/DRILL-1712
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 0.6.0
         Environment: MapR 4.0.1 M5
            Reporter: Hari Sekhon


When querying CSV files, Drill does not handle quoted fields properly and includes the quote
characters in the data. The directory /tmp/hari in MapR-FS contains two simple CSV files, one
quoted and one unquoted, so you can see the difference.
{code}
0: jdbc:drill:> select * from dfs.`/tmp/hari` limit 10;
+------------+
|  columns   |
+------------+
| ["1","2","3"] |
| ["4","5","6"] |
| ["7","8","9"] |
| ["\"1\"","\"2\"","\"3\""] |
| ["\"4\"","\"5\"","\"6\""] |
| ["\"7\"","\"8\"","\"9\""] |
+------------+
6 rows selected (0.238 seconds)

cat hari/hari.csv
1,2,3
4,5,6
7,8,9
cat hari/hari2.csv
"1","2","3"
"4","5","6"
"7","8","9"
{code}
Drill shouldn't include the quotes as data; they are just containers for the data.
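
To illustrate the expected behaviour, here is a minimal sketch in Python (not Drill code, and not a proposed fix). It contrasts a plain comma split, which reproduces the output shown above, with a quote-aware parse using the standard csv module. The names UNQUOTED, QUOTED, naive_split and quote_aware are hypothetical and exist only for this example; the sample data is copied from the two files in the report.

```python
import csv
import io

# Contents of the two sample files from the report (hari.csv and hari2.csv).
UNQUOTED = "1,2,3\n4,5,6\n7,8,9\n"
QUOTED = '"1","2","3"\n"4","5","6"\n"7","8","9"\n'


def naive_split(text):
    """Split each line on commas only, so enclosing quotes stay in the values."""
    return [line.split(",") for line in text.splitlines()]


def quote_aware(text):
    """Parse with csv.reader, which strips enclosing double quotes from fields."""
    return list(csv.reader(io.StringIO(text)))


if __name__ == "__main__":
    # The plain split keeps the quote characters, as in the query result above.
    print(naive_split(QUOTED)[0])
    # The quote-aware parse yields the same values for both files.
    print(quote_aware(QUOTED)[0])
    print(quote_aware(UNQUOTED)[0])
```

With a quote-aware reader, both files would produce identical rows, which is the result the query above should return.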



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
