commons-issues mailing list archives

From "Gary Gregory (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CSV-141) Handle malformed CSV files
Date Mon, 10 Nov 2014 03:17:33 GMT

    [ https://issues.apache.org/jira/browse/CSV-141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204292#comment-14204292 ]

Gary Gregory commented on CSV-141:
----------------------------------

So this is really being lenient in one special case, when there is a missing end-of-field
marker at the end of a line.
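
To make that special case concrete, here is a minimal sketch in plain Java of what such leniency could look like. It is not Commons CSV code and not a proposed patch: the class name LenientCsvLine and its parse method are hypothetical, and the sketch assumes a comma delimiter, double-quote encapsulation, and "" as the escaped quote. A quoted field that is still open when the line ends is simply closed there instead of raising an error.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Commons CSV. */
final class LenientCsvLine {

    private LenientCsvLine() {}

    /**
     * Splits one physical line into fields. A field may be wrapped in double
     * quotes, and "" inside a quoted field stands for a literal quote.
     * If the line ends while a quoted field is still open, the field is
     * closed at end of line rather than reported as an error.
     */
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        field.append('"'); // escaped quote inside a quoted field
                        i += 2;
                        continue;
                    }
                    inQuotes = false; // regular closing quote
                } else {
                    field.append(c); // includes '\', which is kept literally
                }
            } else if (c == '"' && field.length() == 0) {
                inQuotes = true; // opening quote at the start of a field
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
            i++;
        }
        // Lenient step: emit the last field even if its closing quote is missing.
        fields.add(field.toString());
        return fields;
    }
}
```

Called on line 3 of the sample in the issue description, LenientCsvLine.parse would return three fields, the last being "pass sem1 " with its trailing space. The trade-off is that every physical line is treated as one record, so quoted fields that legitimately contain line breaks are not supported by this sketch.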

Well, yes, to echo Emmanuel above, patches welcome ;-)

In the meantime, I plan on creating a 1.1 release candidate this week. We can continue this
review for a 1.1.1 or 1.2 or anytime a patch comes in.

Gary

> Handle malformed CSV files
> --------------------------
>
>                 Key: CSV-141
>                 URL: https://issues.apache.org/jira/browse/CSV-141
>             Project: Commons CSV
>          Issue Type: Wish
>          Components: Parser
>    Affects Versions: 1.0
>            Reporter: Nguyen Minh
>            Priority: Minor
>             Fix For: 1.x
>
>
> My Java application has to handle thousands of CSV files uploaded by client phones
> every day. Some of these CSV files are malformed, and I'm not sure why.
> Here is my sample CSV. Microsoft Excel parses it correctly, but neither Commons CSV nor OpenCSV
> can parse it. OpenCSV can't parse line 2 (due to the '\' character) and Commons CSV throws
> an exception on lines 3 and 4:
> "1414770317901","android.widget.EditText","pass sem1 _84*|*","0","pass sem1 _8"
> "1414770318470","android.widget.EditText","pass sem1 _84:*|*","0","pass sem1 _84:\"
> "1414770318327","android.widget.EditText","pass sem1 
> "1414770318628","android.widget.EditText","pass sem1 _84*|*","0","pass sem1
> Line 3: java.io.IOException: (line 5) invalid char between encapsulated token and delimiter
> 	at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
> 	at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
> Line 4: java.io.IOException: (startline 5) EOF reached before encapsulated token finished
> 	at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
> 	at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
