commons-user mailing list archives

From Андрей <andreykuliko...@gmail.com>
Subject Re[2]: [csv] Two quote characters in a row cause error
Date Wed, 11 Nov 2015 15:45:17 GMT
Thank you for the reference to the CSV standard, Jan.

There is no problem anymore :)

------ Original Message ------
From: "Jan Høydahl" <jan.asf@cominvent.com>
To: "Commons Users List" <user@commons.apache.org>; "Андрей" <andreykulikov95@gmail.com>
Sent: 11.11.2015 11:58:14
Subject: Re: [csv] Two quote characters in a row cause error

>Which program generated your CSV file?
>Why do you expect it to be parsed with no quotes at all?
>
>According to https://tools.ietf.org/html/rfc4180#section-2 (items 5-7):
>
>>     5.  Each field may or may not be enclosed in double quotes (however
>>         some programs, such as Microsoft Excel, do not use double quotes
>>         at all).  If fields are not enclosed with double quotes, then
>>         double quotes may not appear inside the fields.  For example:
>>
>>         "aaa","bbb","ccc" CRLF
>>         zzz,yyy,xxx
>>
>>     6.  Fields containing line breaks (CRLF), double quotes, and commas
>>         should be enclosed in double-quotes.  For example:
>>
>>         "aaa","b CRLF
>>         bb","ccc" CRLF
>>         zzz,yyy,xxx
>>
>>     7.  If double-quotes are used to enclose fields, then a double-quote
>>         appearing inside a field must be escaped by preceding it with
>>         another double quote.  For example:
>>
>>         "aaa","b""bb","ccc"
>
>According to this, both of these are valid:
>
>2120000,1596,4240000,9600,true
>2120000,"1596",4240000,9600,true
>
>Your line, however, suggests that you want to escape a literal quote by
>doubling it. In that case the field itself must also be enclosed in
>double quotes (I have not tested this, though):
>
>2120000,"""1596""",4240000,9600,true
>
>--
>Jan Høydahl, search solution architect
>Cominvent AS - www.cominvent.com
>
>>  On 11 Nov 2015, at 08:44, Андрей <andreykulikov95@gmail.com> wrote:
>>
>>  Hello everyone,
>>
>>  I've recently noticed what I suppose is a bug. I have the following CSV file (2 lines):
>>
>>  DATA_SERVICE_ID,""DATA_SPEED_APID"",COMPONENT_ID,SPEED,SYNCHRONOUS
>>  2120000,""1596"",4240000,9600,true
>>
>>  (The first line is a header)
>>
>>  I'm trying to parse it via this configuration:
>>  CSVFormat format = CSVFormat.newFormat(',').withQuote('"').withHeader();
>>
>>  Expected result: all headers and values without quotes
>>  Actual result: Exception in thread "main" java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
>>  at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
>>  at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
>>  at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:498)
>>  at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:386)
>>  at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:283)
>>  at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:251)
>>  at Runner.main(Runner.java:20)
>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  at java.lang.reflect.Method.invoke(Method.java:497)
>>  at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
>>
>>  Thanks in advance for any answers.
>>
>>  Kulikov Andrey
>
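
For anyone finding this thread later, the quoting rules Jan cites (RFC 4180, section 2, items 5-7) can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the Commons CSV Lexer: the class name Rfc4180Sketch and the method parseLine are hypothetical, the delimiter is assumed to be a comma, and line breaks inside quoted fields are not handled. It shows why a field written as ""1596"" is rejected (the first two quotes open and immediately close an empty field, so the following 1 is an invalid character), while """1596""" yields the value "1596" with literal quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Apache Commons CSV.
class Rfc4180Sketch {

    /** Splits a single CSV line into fields following RFC 4180 items 5-7. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        int i = 0;
        int n = line.length();
        while (true) {
            StringBuilder field = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                // Quoted field: read up to the closing quote; "" is a literal quote.
                i++;
                boolean closed = false;
                while (i < n) {
                    char c = line.charAt(i);
                    if (c == '"') {
                        if (i + 1 < n && line.charAt(i + 1) == '"') {
                            field.append('"');
                            i += 2;
                        } else {
                            i++;
                            closed = true;
                            break;
                        }
                    } else {
                        field.append(c);
                        i++;
                    }
                }
                if (!closed) {
                    throw new IllegalArgumentException("unterminated quoted field");
                }
                // After the closing quote only a delimiter or end of line may follow.
                if (i < n && line.charAt(i) != ',') {
                    throw new IllegalArgumentException(
                            "invalid char after closing quote at index " + i);
                }
            } else {
                // Unquoted field: quotes may not appear inside it (item 5).
                while (i < n && line.charAt(i) != ',') {
                    char c = line.charAt(i);
                    if (c == '"') {
                        throw new IllegalArgumentException(
                                "quote inside unquoted field at index " + i);
                    }
                    field.append(c);
                    i++;
                }
            }
            fields.add(field.toString());
            if (i >= n) {
                break;
            }
            i++; // skip the comma
        }
        return fields;
    }

    public static void main(String[] args) {
        // Valid: plain quoted field, and quoted field with escaped quotes.
        System.out.println(parseLine("2120000,\"1596\",4240000,9600,true"));
        System.out.println(parseLine("2120000,\"\"\"1596\"\"\",4240000,9600,true"));
        try {
            // The line from the original report: two quotes, then 1596.
            parseLine("2120000,\"\"1596\"\",4240000,9600,true");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the file really has to be read as it is, the data would need to be fixed at the source or pre-processed before parsing; which of those is appropriate depends on the program that generated it.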



