hive-dev mailing list archives

From "Brock Noland (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3065) New lines in columns can cause problems even when using sequence files
Date Sun, 30 Mar 2014 16:10:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954733#comment-13954733 ]

Brock Noland commented on HIVE-3065:
------------------------------------

I believe the issue is that Hive encodes the data as delimited text inside the file format.
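
To illustrate the symptom from the report, here is a minimal Python sketch. It is not Hive code: the helpers `encode_row`, `decode_output`, and `escape_newlines` are hypothetical names, and the sketch assumes a row is joined with Hive's default Ctrl-A field delimiter and that whatever consumes the output splits rows on newlines.

```python
# Hypothetical simulation of a delimited-text row encoding; not Hive's implementation.
FIELD_DELIM = "\x01"  # Hive's default field delimiter (Ctrl-A)
ROW_DELIM = "\n"      # assumed row terminator on output


def encode_row(columns):
    """Join column values into one delimited-text row."""
    return FIELD_DELIM.join(columns)


def decode_output(text, num_columns):
    """Split output into rows on newlines, padding short rows with None (NULL)."""
    rows = []
    for line in text.split(ROW_DELIM):
        fields = line.split(FIELD_DELIM)
        fields += [None] * (num_columns - len(fields))
        rows.append(fields[:num_columns])
    return rows


def escape_newlines(value):
    """Escape backslashes first, then newlines, so the value stays on one line."""
    return value.replace("\\", "\\\\").replace("\n", "\\n")


row = ["1", "line one\nline two", "x"]

# Unescaped: the embedded newline splits one logical row into two.
broken = decode_output(encode_row(row), 3)

# Escaped before encoding: the row survives as a single row.
fixed = decode_output(encode_row([escape_newlines(c) for c in row]), 3)

print(broken)
print(fixed)
```

In the unescaped case the single input row comes back as two rows, each padded with a null, which is the shape of the output described in the issue. Escaping (or stripping) the newline before the row is encoded keeps it as one row.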

> New lines in columns can cause problems even when using sequence files
> ----------------------------------------------------------------------
>
>                 Key: HIVE-3065
>                 URL: https://issues.apache.org/jira/browse/HIVE-3065
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.7.1, 0.8.1
>            Reporter: Joey Echeverria
>
> When using sequence files as the container format, I'd expect to be able to embed
> newlines in a column. However, this causes problems when the data is output if the
> newlines aren't manually stripped or escaped. This tends to show up as each row of
> output generating two (or more) rows, with nulls after the column containing the
> newline and nulls for the "empty" columns on the second row.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
