hive-issues mailing list archives

From "Sameer Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5999) Allow other characters for LINES TERMINATED BY
Date Wed, 30 Sep 2015 12:20:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936769#comment-14936769
] 

Sameer Gupta commented on HIVE-5999:
------------------------------------

row_format
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: Available in Hive 0.13 and later)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]

The [LINES TERMINATED BY char] option in the Apache Hive CREATE TABLE statement is
effectively meaningless, because nothing other than '\n' is supported. Very often we cannot
change the data coming from the source; the source is the source. And when importing a table
into Hive where a column may contain free-form text, a '\n' character is fairly common in
the data, so we need to specify a different row delimiter. This is basic functionality for
any database. Imagine my surprise: after looking at the CREATE statement I proposed a
business solution, only to find out later that the option is a gimmick.
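
To illustrate the problem described above, here is a minimal Python sketch, independent of
Hive itself. The sample records, the '|' field delimiter, and the choice of the ASCII record
separator (0x1E) as an alternative row delimiter are all assumptions made for illustration;
none of them come from the issue.

```python
# Hypothetical sample data: two records of three fields each.
# The third field of the first record is free-form text containing '\n'.
FIELD_DELIM = "|"
ROW_DELIM = "\x1e"  # ASCII record separator, chosen only for illustration

records = [
    ["1", "alice", "first line\nsecond line"],
    ["2", "bob", "no newline here"],
]


def serialize(rows, row_delim):
    """Join fields with FIELD_DELIM and rows with the given row delimiter."""
    return row_delim.join(FIELD_DELIM.join(fields) for fields in rows)


def parse(text, row_delim):
    """Split text into rows on row_delim, then each row into fields."""
    return [row.split(FIELD_DELIM) for row in text.split(row_delim)]


# With '\n' as the row delimiter, the embedded newline splits one record in two.
newline_rows = parse(serialize(records, "\n"), "\n")

# With a delimiter that does not occur in the data, the records survive intact.
custom_rows = parse(serialize(records, ROW_DELIM), ROW_DELIM)

print(len(newline_rows))       # 3 rows instead of 2
print(custom_rows == records)  # True
```

The sketch only shows why a row delimiter other than '\n' matters for free-form text; it
says nothing about how Hive would implement such support.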

> Allow other characters for LINES TERMINATED BY 
> -----------------------------------------------
>
>                 Key: HIVE-5999
>                 URL: https://issues.apache.org/jira/browse/HIVE-5999
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline, Database/Schema, Hive
>    Affects Versions: 0.12.0
>            Reporter: Mariano Dominguez
>            Assignee: Ashutosh Chauhan
>            Priority: Critical
>              Labels: Delimiter, Hive, Row, SerDe
>
> LINES TERMINATED BY only supports newline '\n' right now.
> It would be nice to loosen this constraint and allow other characters.
> This limitation seems to be hardcoded here:
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java#L171



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
