commons-issues mailing list archives

From "Sebb (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CSV-67) UnicodeUnescapeReader should not be applied before parsing
Date Fri, 16 Mar 2012 00:28:40 GMT
UnicodeUnescapeReader should not be applied before parsing
----------------------------------------------------------

                 Key: CSV-67
                 URL: https://issues.apache.org/jira/browse/CSV-67
             Project: Commons CSV
          Issue Type: Bug
            Reporter: Sebb


The UnicodeUnescapeReader is currently applied before the input file is parsed.

This means that unicode escapes are treated differently from other escapes.

For example, the sequence <esc>r<esc>n is not treated as a new-line for the purpose
of recognising the end of a record, yet \u000D\u000A is converted to CRLF and would terminate
the record (unless embedded in a quoted string).

The unicode escape processing (if selected) should occur as part of the parsing, just as for
ordinary escape processing.
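
The difference between the two orderings can be illustrated with a minimal, self-contained sketch. It does not use the Commons CSV classes: UnescapeOrderDemo, unescapeUnicode and splitRecords are hypothetical stand-ins for the unescaping pass and for a parser that ends records at an unquoted CRLF.

```java
import java.util.ArrayList;
import java.util.List;

class UnescapeOrderDemo {

    // Hypothetical helper: replaces each backslash-u-XXXX sequence with the
    // character it encodes. This is not the library's implementation.
    public static String unescapeUnicode(String in) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < in.length()) {
            if (in.startsWith("\\u", i) && i + 6 <= in.length() && isHex(in, i + 2)) {
                out.append((char) Integer.parseInt(in.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(in.charAt(i++));
            }
        }
        return out.toString();
    }

    private static boolean isHex(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) return false;
        }
        return true;
    }

    // Hypothetical stand-in for a parser: a record ends at a CRLF that is
    // not inside a quoted string.
    public static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') inQuotes = !inQuotes;
            boolean crlf = c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n';
            if (!inQuotes && crlf) {
                records.add(cur.toString());
                cur.setLength(0);
                i++; // skip the LF of the CRLF pair
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) records.add(cur.toString());
        return records;
    }

    public static void main(String[] args) {
        // One line of input containing an escaped CRLF in the middle.
        String input = "a,b\\u000D\\u000Ac,d";

        // Unescape before parsing: the escape becomes a real CRLF and splits the record.
        System.out.println(splitRecords(unescapeUnicode(input)).size()); // prints 2

        // Unescape as part of parsing: the record is delimited first, so the
        // decoded CRLF stays inside it as data.
        System.out.println(splitRecords(input).size()); // prints 1
    }
}
```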

The class can be made public so the user can wrap the input if required; this preserves the
existing functionality should it be required, so there is no need to introduce another setting.
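
As a rough sketch of what wrapping the input could look like from the user's side, the Reader below decodes unicode escapes before anything downstream sees them. SketchUnescapeReader and unescapeAll are hypothetical names written for this illustration, built only on java.io; this is not the actual UnicodeUnescapeReader source, and its behaviour may differ from it.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical wrapper: decodes backslash-u-XXXX sequences while reading.
class SketchUnescapeReader extends Reader {
    private final PushbackReader in;

    SketchUnescapeReader(Reader source) {
        // Room to push back the 'u' plus four hex digits when a match fails.
        this.in = new PushbackReader(source, 5);
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        if (c != '\\') return c;
        char[] buf = new char[5];
        int n = 0;
        while (n < 5) {
            int r = in.read();
            if (r < 0) break;
            buf[n++] = (char) r;
        }
        if (n == 5 && buf[0] == 'u' && isHex(buf)) {
            return Integer.parseInt(new String(buf, 1, 4), 16);
        }
        in.unread(buf, 0, n); // not an escape: hand the lookahead back
        return c;
    }

    private static boolean isHex(char[] buf) {
        for (int k = 1; k < 5; k++) {
            if (Character.digit(buf[k], 16) < 0) return false;
        }
        return true;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int count = 0;
        while (count < len) {
            int c = read();
            if (c < 0) break;
            cbuf[off + count++] = (char) c;
        }
        return (count == 0 && len > 0) ? -1 : count;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Convenience for the demo: run a whole string through the wrapper.
    public static String unescapeAll(String text) {
        try (Reader r = new SketchUnescapeReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            for (int c = r.read(); c >= 0; c = r.read()) sb.append((char) c);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The escaped CRLF is decoded before any parser would see the input.
        String decoded = unescapeAll("a,b\\u000D\\u000Ac,d");
        System.out.println(decoded.contains("\r\n")); // prints true
    }
}
```

A user who wants the old behaviour would pass such a wrapped Reader to the parser instead of the raw input, which is why no extra setting is needed.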


        
