commons-issues mailing list archives

From Patrick Gäckle (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CSV-222) invalid char between encapsulated token and delimiter
Date Tue, 03 Apr 2018 08:26:00 GMT

    [ https://issues.apache.org/jira/browse/CSV-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419448#comment-16419448 ]

Patrick Gäckle edited comment on CSV-222 at 4/3/18 8:25 AM:
------------------------------------------------------------

This is the option I'd like to use, but how can I set them to these non-printable characters?
It might also help to include the position in the log statement as another hint about where
to search.

I'd really like to see an option to simply skip characters that are not identified as being
part of a column.
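
Until such an option exists, one possible stopgap is to filter the input before the parser sees it. The following is a minimal sketch using only the JDK, not anything provided by Commons CSV: the class name ControlCharStrippingReader and its helper methods are made up for illustration, and it assumes that dropping every ISO control character except tab, CR and LF is acceptable for the data at hand.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper: drops ISO control characters (except tab, CR, LF) from the wrapped reader. */
public class ControlCharStrippingReader extends FilterReader {

    public ControlCharStrippingReader(Reader in) {
        super(in);
    }

    /** True for characters such as SOH (0x01) and STX (0x02) that this sketch discards. */
    static boolean isUnwanted(int c) {
        return Character.isISOControl(c) && c != '\t' && c != '\r' && c != '\n';
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && isUnwanted(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the chunk in place, keeping only the wanted characters.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char c = buf[off + i];
                if (!isUnwanted(c)) {
                    buf[off + kept++] = c;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk consisted of unwanted characters; read the next one.
        }
    }

    /** Demonstration helper: runs a whole string through the filter. */
    static String strip(String text) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new ControlCharStrippingReader(new StringReader(text))) {
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With this in place, the reader from the issue description could be wrapped before parsing, e.g. csvFormat.parse(new ControlCharStrippingReader(reader)). Note that this silently discards data, so it is only suitable when the control characters carry no meaning.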


was (Author: lostkatana):
This is the current workaround I use.
It might also help to include the position in the log statement as another hint about where
to search.

I'd really like to see an option to simply skip characters that are not identified as being
part of a column.

> invalid char between encapsulated token and delimiter
> -----------------------------------------------------
>
>                 Key: CSV-222
>                 URL: https://issues.apache.org/jira/browse/CSV-222
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.4
>            Reporter: Patrick Gäckle
>            Priority: Major
>         Attachments: faulty.csv
>
>
> When trying to read the file [^faulty.csv] and parse it I get the following error:
> {code}
> java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
> 	at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
> 	at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
> 	at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
> 	at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
> 	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
> 	at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
> 	at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
> {code}
> The failing code is the part that parses the input and returns its iterator:
> {code:java}
> csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
> iterator = csvFormat.parse(reader).iterator();
> {code}
> The invalid chars are the non-printable SOH and STX characters contained at the end of the line.
> I debugged through the source and ran into the exception in the Lexer, which does not handle these special characters.
> Unfortunately I'm not able to provide any hints on fixing this, as I'm not familiar with these types of characters and what behaviour they should have.
> Sincerely
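
Since the exception only names the line, it may help to scan the input for the SOH and STX characters described above and report where they sit. This is a rough JDK-only sketch, not part of Commons CSV: the class ControlCharLocator and its locate methods are hypothetical names, and the line count here is based on '\n' only, so it may not match the parser's own line numbering for records that span several physical lines.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: reports line and column of control characters in the input. */
public class ControlCharLocator {

    /** Returns one "line L, column C: U+XXXX" entry per control character (tab, CR, LF excluded). */
    static List<String> locate(Reader reader) throws IOException {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int column = 0;
        int c;
        while ((c = reader.read()) != -1) {
            if (c == '\n') {
                // Start of the next physical line.
                line++;
                column = 0;
                continue;
            }
            column++;
            if (Character.isISOControl(c) && c != '\t' && c != '\r') {
                hits.add(String.format(Locale.ROOT, "line %d, column %d: U+%04X", line, column, c));
            }
        }
        return hits;
    }

    /** Convenience overload for in-memory text. */
    static List<String> locate(String text) {
        try {
            return locate(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a file, a reader opened with the file's actual charset can be passed to locate(Reader); each returned entry then points at one suspicious character.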



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
