hive-issues mailing list archives

From "Lefty Leverenz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14632) beeline outputformat needs better documentation
Date Sun, 09 Oct 2016 06:17:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15559375#comment-15559375 ]

Lefty Leverenz commented on HIVE-14632:
---------------------------------------

Good documentation, thanks [~kuczoram]!  I made some minor edits.  Very cool expandable examples
-- I hadn't realized we can do that.

One question:  is the misalignment of the 'comment' column in the tsv example accurate?  I
assume it's due to the tab stops because the 'value' column has values longer than the column
name, but just wanted to check.
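
To illustrate the tab-stop effect asked about above, here is a minimal Python sketch. It assumes a terminal with tab stops every 8 columns, and the header and row are hypothetical sample data, not the rows from the wiki example. It only shows that a 'value' entry longer than its column name can push the next field to a later tab stop.

```python
# Hypothetical sample data; these are not the rows from the wiki example.
header = "key\tvalue\tcomment"
row = "1\tval_1_is_long_text\tfirst"


def render(line, tabsize=8):
    """Expand tabs the way a display with fixed tab stops would (assumed 8 columns)."""
    return line.expandtabs(tabsize)


# Column at which the third field starts in the header and in the data row.
header_col = render(header).index("comment")
row_col = render(row).index("first")

print(render(header))
print(render(row))
print(header_col, row_col)  # the two offsets differ, so the column looks misaligned
```

If this matches what the tsv example shows, the misalignment is a display effect only: each line still contains exactly one tab between fields.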

+1 but a technical review by [~michaelthoward] or [~thejas] would also be good.

> beeline outputformat needs better documentation
> -----------------------------------------------
>
>                 Key: HIVE-14632
>                 URL: https://issues.apache.org/jira/browse/HIVE-14632
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 0.14.0
>         Environment: Hive HiveServer2 wiki
>            Reporter: Michael Howard
>            Assignee: Marta Kuczora
>
> SUMMARY
> * need better wiki page doc for beeline outputformat option
> * should explicitly say that "double quote characters" are used to enclose fields which need enclosing.
> * Should describe the treatment of embedded double quote chars as "doubled"
> DETAIL
> The page at:
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Separated-ValueOutputFormats
> describes separated value outputformats csv/tsv/csv2/tsv2, etc. 
> I found the doc to be inadequate and the terminology to be confusing.
> > These conform better to standard CSV convention, which adds quotes around a cell value
> What kind of quotes? The only reference to quotes in this section refers to single quotes for the deprecated csv/tsv format.
> The JIRA at 
> https://issues.apache.org/jira/browse/HIVE-8615
> clarifies a bit:
> - Old format quoted every field. New format quotes only fields that contain a delimiter or the quoting char.
> - Old format quoted using single quotes, new format quotes using double quotes 
> - Old format didn't escape quotes in a field (a bug). New format does escape the quotes
> However, neither this JIRA page nor the wiki page doc defines what is meant by "escaping the quotes".
> Q: In this context, does escaping mean "backslash escaping" or "doubling embedded double quotes" or something else?
> Investigation of source code reveals that this is using SuperCSV. 
> SuperCSV does not support backslash-escape of embedded quotes. See last line of:
> https://super-csv.github.io/super-csv/csv_specification.html
> THE END
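
The convention the issue description asks the wiki to spell out (enclose a field in double quotes only when it contains the delimiter or a quote character, and double any embedded double quote) can be sketched as follows. This uses Python's standard csv module purely as an illustration; it is not SuperCSV or Beeline, and the helper name to_csv_line is hypothetical. It is not confirmed here that Beeline's csv2/tsv2 output matches it exactly.

```python
import csv
import io


def to_csv_line(fields, delimiter=","):
    """Format one record using minimal quoting and doubled embedded quotes.

    Hypothetical helper for illustration; not part of Hive or SuperCSV.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=delimiter,
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        doublequote=True,           # embedded " is written as ""
        lineterminator="\n",
    )
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


# A plain field, a field containing the delimiter, and a field containing quotes.
print(to_csv_line(["plain", "a,b", 'say "hi"']))

# The same convention with a tab delimiter.
print(to_csv_line(["a\tb", "c"], delimiter="\t"))
```

Under this convention no backslash appears in the output; a reader recovers the original field by stripping the enclosing quotes and collapsing each doubled quote to one.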



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
