commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CSV-58) Unescape handling needs rethinking
Date Tue, 17 Jun 2014 22:15:19 GMT

    [ https://issues.apache.org/jira/browse/CSV-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034470#comment-14034470 ]

Sebb commented on CSV-58:
-------------------------

I now think that only meta-characters (and the record-separator) should be unescaped on input,
because only meta-characters need to be escaped on output. All other escapes should be left as-is,
and should be handled separately (probably by the application).
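
A minimal sketch of that idea, assuming single-character meta-characters and a single-character
record separator. The class and method names are hypothetical and are not part of the Commons CSV
API. The escape character is removed only when it precedes a meta-character; any other escape
sequence is copied through unchanged for the application to handle.

```java
import java.util.Set;

// Hypothetical helper for illustration only; not Commons CSV code.
final class SelectiveUnescape {

    /**
     * Removes the escape character only when it precedes a meta-character.
     * Every other escape sequence is copied through unchanged.
     */
    static String unescape(String field, char esc, Set<Character> meta) {
        StringBuilder out = new StringBuilder(field.length());
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == esc && i + 1 < field.length() && meta.contains(field.charAt(i + 1))) {
                out.append(field.charAt(++i)); // drop the escape, keep the meta-character
            } else {
                out.append(c); // ordinary character, or an escape left for the application
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed meta-characters: delimiter, quote, the escape itself, one-char record separator.
        Set<Character> meta = Set.of(',', '"', '\\', '\n');
        // The escaped comma is unescaped; the backslash before 'u' is left in place.
        System.out.println(unescape("a\\,b \\u0041", '\\', meta));
    }
}
```

With this approach an escaped backslash still collapses to a single backslash, so the application
can tell a literal backslash followed by 'u' apart from a Unicode escape only if it processes
Unicode escapes before or instead of this step.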

However, this may cause problems with multi-character record separators (RS); that needs further
investigation. More complications may occur if the RS can be specified as a list of strings.
It may be necessary to restrict the RS to a single string.

> Unescape handling needs rethinking
> ----------------------------------
>
>                 Key: CSV-58
>                 URL: https://issues.apache.org/jira/browse/CSV-58
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>            Reporter: Sebb
>             Fix For: Patch Needed, 1.0
>
>         Attachments: commons-csv.diff
>
>
> The current escape parsing converts <esc><char> to plain <char> if the <char> is not one
> of the special characters to be escaped.
> This can affect Unicode escapes if the <esc> character is backslash.
> One way round this is to specifically check for <char> == 'u', but it seems wrong to only
> do this for 'u'.
> Another solution would be to leave <esc><char> as is unless the <char> is one of the
> special characters.
> There are several possible ways to treat unrecognised escapes:
> - treat it as if the escape char had not been present (current behaviour)
> - leave the escape char as is
> - throw an exception
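
The three treatments of unrecognised escapes listed in the issue description can be compared
side by side with a small sketch. All names here (the class, the Policy enum and its constants)
are hypothetical and chosen for illustration; they do not come from Commons CSV. The meta
parameter is an assumed string of single-character meta-characters.

```java
// Hypothetical names throughout; none of this is Commons CSV API.
final class UnrecognisedEscapes {

    /** The three options from the issue description. */
    enum Policy { DROP_ESCAPE, KEEP_ESCAPE, THROW }

    static String unescape(String field, char esc, String meta, Policy policy) {
        StringBuilder out = new StringBuilder(field.length());
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c != esc || i + 1 == field.length()) {
                out.append(c); // ordinary character, or a trailing escape with nothing after it
                continue;
            }
            char next = field.charAt(++i);
            if (meta.indexOf(next) >= 0) {
                out.append(next); // recognised escape: always unescaped
                continue;
            }
            switch (policy) {
                case DROP_ESCAPE: // behave as if the escape char had not been present
                    out.append(next);
                    break;
                case KEEP_ESCAPE: // leave the escape char as is
                    out.append(esc).append(next);
                    break;
                case THROW:
                    throw new IllegalArgumentException(
                            "Unrecognised escape at index " + (i - 1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String field = "x\\u0041"; // the characters x, backslash, u, 0, 0, 4, 1
        String meta = ",\"\\";     // assumed meta-characters: comma, quote, backslash
        System.out.println(unescape(field, '\\', meta, Policy.DROP_ESCAPE));
        System.out.println(unescape(field, '\\', meta, Policy.KEEP_ESCAPE));
    }
}
```

Under DROP_ESCAPE the backslash before 'u' is lost, which is how a Unicode escape gets damaged;
KEEP_ESCAPE preserves it for later processing, and THROW rejects the input outright.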



--
This message was sent by Atlassian JIRA
(v6.2#6252)
