Date: Mon, 13 Apr 2015 21:05:12 +0000 (UTC)
From: "Steven Phillips (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Mailing-List: contact issues-help@drill.apache.org; run by ezmlm
Subject: [jira] [Updated] (DRILL-2760) Quoted strings from CSV file appear in query output in different forms

     [ https://issues.apache.org/jira/browse/DRILL-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Phillips updated DRILL-2760:
-----------------------------------
    Fix Version/s: 1.0.0

> Quoted strings from CSV file appear in query output in different forms
> ----------------------------------------------------------------------
>
>                 Key: DRILL-2760
>                 URL: https://issues.apache.org/jira/browse/DRILL-2760
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 0.9.0
>         Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT
> 4 node cluster on CentOS
>            Reporter: Khurram Faraaz
>            Assignee: Steven Phillips
>             Fix For: 1.0.0
>
>
> Quoted strings appear in query output in different forms, as shown in the section below.
> Quotes should NOT appear in query output. Strings must be stripped of their leading and trailing quotes. (I am referring to this character - " )
> {code}
> Snippet of data from the airports.csv file, first three lines; the first line has header information.
> [root@centos-01 airport_CSV_data]# head -3 airports.csv
> "id","ident","type","name","latitude_deg","longitude_deg","elevation_ft","continent","iso_country","iso_region","municipality","scheduled_service","gps_code","iata_code","local_code","home_link","wikipedia_link","keywords"
> 6523,"00A","heliport","Total Rf Heliport",40.07080078125,-74.9336013793945,11,"NA","US","US-PA","Bensalem","no","00A",,"00A",,,
> 6524,"00AK","small_airport","Lowell Field",59.94919968,-151.695999146,450,"NA","US","US-AK","Anchor Point","no","00AK",,"00AK",,,
> case 1) In this case the quotes are not escaped; they appear in the output as is.
> 0: jdbc:drill:> select columns[0] id,columns[1] ident,columns[2] type,columns[3] name,columns[4] latitude_deg,columns[5] longitude_deg,columns[6] elevation_ft,columns[7] continent,columns[8] iso_country,columns[9] iso_region,columns[10] municipality,columns[11] scheduled_service,columns[12] gps_code,columns[13] iata_code, columns[14] local_code,columns[15] home_link,columns[16] wikipedia_link,columns[17] keywords from `airports.csv` limit 3;
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> | id | ident | type | name | latitude_deg | longitude_deg | elevation_ft | continent | iso_country | iso_region | municipality | scheduled_service | gps_code | iata_code | local_code | home_link | wikipedia_link | keywords |
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> | "id" | "ident" | "type" | "name" | "latitude_deg" | "longitude_deg" | "elevation_ft" | "continent" | "iso_country" | "iso_region" | "municipality" | "scheduled_service" | "gps_code" | "iata_code" | "local_code" | "home_link" | "wikipedia_link" | "keywords" |
> | 6523 | "00A" | "heliport" | "Total Rf Heliport" | 40.07080078125 | -74.9336013793945 | 11 | "NA" | "US" | "US-PA" | "Bensalem" | "no" | "00A" | | "00A" | | | null |
> | 6524 | "00AK" | "small_airport" | "Lowell Field" | 59.94919968 | -151.695999146 | 450 | "NA" | "US" | "US-AK" | "Anchor Point" | "no" | "00AK" | | "00AK" | | | null |
> +------------+------------+------------+------------+--------------+---------------+--------------+------------+-------------+------------+--------------+-------------------+------------+------------+------------+------------+----------------+------------+
> 3 rows selected (0.155 seconds)
> case 2) In this case the quotes appear in the query output, but they are escaped with a backslash character.
> 0: jdbc:drill:> select * from `airports.csv` limit 3;
> +------------+
> | columns |
> +------------+
> | ["\"id\"","\"ident\"","\"type\"","\"name\"","\"latitude_deg\"","\"longitude_deg\"","\"elevation_ft\"","\"continent\"","\"iso_country\"","\"iso_region\"","\"municipality\"","\"scheduled_service\"","\"gps_code\"","\"iata_code\"","\"local_code\"","\"home_link\"","\"wikipedia_link\"","\"keywords\""] |
> | ["6523","\"00A\"","\"heliport\"","\"Total Rf Heliport\"","40.07080078125","-74.9336013793945","11","\"NA\"","\"US\"","\"US-PA\"","\"Bensalem\"","\"no\"","\"00A\"","","\"00A\"","",""] |
> | ["6524","\"00AK\"","\"small_airport\"","\"Lowell Field\"","59.94919968","-151.695999146","450","\"NA\"","\"US\"","\"US-AK\"","\"Anchor Point\"","\"no\"","\"00AK\"","","\"00AK\"","",""] |
> +------------+
> 3 rows selected (0.097 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
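
For comparison, the behaviour the report asks for (enclosing quotes removed from each field) can be illustrated outside Drill. The following is only a minimal sketch using Python's standard csv module, not Drill's text reader; the helper name parse_line is hypothetical, and the sample row is the first data line quoted in the issue.

```python
import csv
import io

# First data row from the airports.csv snippet quoted in the issue.
SAMPLE_LINE = (
    '6523,"00A","heliport","Total Rf Heliport",40.07080078125,'
    '-74.9336013793945,11,"NA","US","US-PA","Bensalem","no","00A",,"00A",,,'
)


def parse_line(line):
    """Parse one CSV line (hypothetical helper, for illustration only).

    csv.reader treats the double quote as the field-enclosing character,
    so the enclosing quotes are dropped from the parsed values.
    """
    return next(csv.reader(io.StringIO(line)))


fields = parse_line(SAMPLE_LINE)
print(fields[:4])
```

With this parser the quoted fields come back without their enclosing quotes, which is the form the reporter expects to see in both query outputs above.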