commons-issues mailing list archives

From "Tillmann Gaida (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CSV-35) Escaped line separators are not supported
Date Mon, 30 Jun 2014 08:36:24 GMT

    [ https://issues.apache.org/jira/browse/CSV-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047460#comment-14047460 ]

Tillmann Gaida edited comment on CSV-35 at 6/30/14 8:34 AM:
------------------------------------------------------------

I added a patch, "commons-csv CSV-35 escapeCRLFOnce[ test].patch", which introduces a CSVFormat
setting "escapeCRLFOnce" that enables the desired behaviour in Lexer. It is false by default,
and I did not change CSVFormat.MYSQL, which might be appropriate. I am not exactly happy with
the naming of the setting. Consider renaming it if you happen to build upon the patch.

EDIT: clarity

EDIT: This is a very specific setting. A cleaner solution would probably be to allow a record
separator to be escaped by a single escape char. However, it appears that the MYSQL format uses
LF as its record separator, so we would need to support multiple record separators, some of
which would not be actual record separators in this case.

I'd argue that CRLF is special enough to have an individual setting, but I would also agree
with having a cleaner CSVFormat. The only real alternative would be a way to specify individual
character sequences, each with a replacement that applies when the sequence is preceded by the
escape char.
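
For illustration, here is a minimal Java sketch of that alternative: a set of character
sequences, each with a replacement that applies only when the sequence directly follows the
escape char. The class name EscapedSequenceSketch and the helper unescape are hypothetical.
They are not part of Commons CSV or of the attached patch, and the sketch only handles a
single field value, not the record splitting done in Lexer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Commons CSV API and not the attached patch.
class EscapedSequenceSketch {

    /**
     * Replaces each configured sequence that directly follows the escape char
     * with its replacement. Sequences are tried in map iteration order, so
     * longer sequences (CRLF) should be registered before shorter ones (LF).
     */
    static String unescape(String input, char escape, Map<String, String> replacements) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == escape) {
                boolean matched = false;
                for (Map.Entry<String, String> e : replacements.entrySet()) {
                    // startsWith returns false if i + 1 is past the end of input
                    if (input.startsWith(e.getKey(), i + 1)) {
                        out.append(e.getValue());
                        i += 1 + e.getKey().length();
                        matched = true;
                        break;
                    }
                }
                if (matched) {
                    continue;
                }
            }
            // Anything else, including an unmatched escape char, is copied as-is.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("\r\n", "\r\n"); // one escape char covers the whole CRLF
        replacements.put("\n", "\n");
        String field = unescape("value3a\\\r\nvalue3b", '\\', replacements);
        System.out.println(field.equals("value3a\r\nvalue3b"));
    }
}
```

With such a table, "escapeCRLFOnce" would simply be the entry that maps an escaped CRLF to
CRLF, at the cost of a more general and more complicated CSVFormat.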


was (Author: tillmann gaida):
I added a patch "commons-csv CSV-35 escapeCRLFOnce[ test].patch", which introduces a CSVFormat
setting "escapeCRLFOnce", which enables the desired behaviour in Lexer. It is false by default
and I did not change CSVFormat.MYSQL, which might be appropriate. I am not exactly happy with
the naming of the setting. Consider renaming it if you happen to build upon the patch.

EDIT: clarity

> Escaped line separators are not supported
> -----------------------------------------
>
>                 Key: CSV-35
>                 URL: https://issues.apache.org/jira/browse/CSV-35
>             Project: Commons CSV
>          Issue Type: Bug
>            Reporter: Emmanuel Bourg
>             Fix For: 1.0
>
>         Attachments: CSV-35.patch, commons-csv CSV-35 escapeCRLFOnce test.patch, commons-csv
> CSV-35 escapeCRLFOnce.patch, mysql-export-line-terminated-by-crlf.csv, mysql-export-line-terminated-by-lf.csv
>
>
> Commons CSV doesn't handle escaped line separators, for example:
> {code}
> value1;value2;value3a\
> value3b
> {code}
> In this case the expected result is:
> {code}["value1", "value2", "value3a\nvalue3b"]{code}
> This kind of escaping is produced by MySQL, whether the field enclosing is enabled or
> not. It's possible to see enclosing quotes and escaped line separators like this:
> {code}
> "value1";"value2";"value3a\
> value3b"
> {code}
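
The expected result in the first example of the quoted description can be reproduced with a
small standalone Java sketch. It is only an illustration under stated assumptions: the class
EscapedLineSeparatorSketch and the method parseRecord are hypothetical names, the code does
not use Commons CSV or the attached patches, it ignores enclosing quotes, and it keeps any
escaped character literally instead of translating sequences such as \t.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Commons CSV API and not the attached patch.
class EscapedLineSeparatorSketch {

    /** Reads the first record of input; an escaped LF or CRLF stays inside the field. */
    static List<String> parseRecord(String input, char delimiter, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == escape && i + 1 < input.length()) {
                char next = input.charAt(i + 1);
                field.append(next); // the escaped character is taken literally
                i += 2;
                // Assumption: a single escape char also covers the LF of a CRLF pair.
                if (next == '\r' && i < input.length() && input.charAt(i) == '\n') {
                    field.append('\n');
                    i++;
                }
            } else if (c == delimiter) {
                fields.add(field.toString());
                field.setLength(0);
                i++;
            } else if (c == '\n' || c == '\r') {
                break; // an unescaped line separator ends the record
            } else {
                field.append(c);
                i++;
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Unquoted example from the issue description, with an escaped LF.
        List<String> record = parseRecord("value1;value2;value3a\\\nvalue3b", ';', '\\');
        System.out.println(record.size());
    }
}
```

Treating the escaped CRLF as one unit is the behaviour the "escapeCRLFOnce" discussion is
about; without that branch, the LF after an escaped CR would end the record.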



--
This message was sent by Atlassian JIRA
(v6.2#6252)
