flink-user mailing list archives

From Malte Schwarzer ...@mieo.de>
Subject Re: Quotes in fields of CsvInputFormat
Date Fri, 05 Dec 2014 15:44:30 GMT
Hi Stephan,

The result should be >"hhh" xx< as the field value. Enclosures should be
disabled, but there seems to be no method to do that.


Malte

From:  Stephan Ewen <sewen@apache.org>
Reply-To:  <user@flink.incubator.apache.org>
Date:  Friday, 5 December 2014 16:28
To:  <user@flink.incubator.apache.org>
Subject:  Re: Quotes in fields of CsvInputFormat

Hi!

The parser interprets the quotes as quotes for the field. That means the
second field (the string) stops after the "hhh" and the xx is considered
invalid trailing data.
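
To make the behaviour described above concrete, here is a minimal standalone
sketch. It is not Flink's parser code: the class QuotedFieldSketch and both
method names are hypothetical and only model the two interpretations being
discussed, quote-aware parsing versus treating quotes as ordinary characters.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a tiny model of quote-aware field parsing.
// NOT Flink's implementation; all names here are hypothetical.
class QuotedFieldSketch {

    // Splits a line on the delimiter. A field that starts with '"' must end
    // at its closing quote; any text between that quote and the next
    // delimiter is rejected as trailing data.
    static List<String> parseQuoteAware(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            if (pos < line.length() && line.charAt(pos) == '"') {
                int close = line.indexOf('"', pos + 1);
                if (close < 0) {
                    throw new IllegalArgumentException("Unterminated quote: '" + line + "'");
                }
                int next = close + 1;
                if (next < line.length() && line.charAt(next) != delimiter) {
                    throw new IllegalArgumentException("Line could not be parsed: '" + line + "'");
                }
                fields.add(line.substring(pos + 1, close));
                pos = next + 1;
            } else {
                int end = line.indexOf(delimiter, pos);
                if (end < 0) {
                    end = line.length();
                }
                fields.add(line.substring(pos, end));
                pos = end + 1;
            }
        }
        return fields;
    }

    // Splits a line on the delimiter and treats quotes as ordinary characters,
    // which is the behaviour you get when enclosures are disabled.
    static List<String> parseWithoutQuotes(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= line.length(); i++) {
            if (i == line.length() || line.charAt(i) == delimiter) {
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "B|\"hhh\" xx";
        System.out.println(parseWithoutQuotes(line, '|'));
        try {
            System.out.println(parseQuoteAware(line, '|'));
        } catch (IllegalArgumentException e) {
            System.out.println("quote-aware parse failed: " + e.getMessage());
        }
    }
}
```

In a Flink job, one possible workaround along the lines of parseWithoutQuotes
would be to read the file with env.readTextFile(...) and split each line
yourself in a map function. Whether that is acceptable depends on the job.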

What do you expect as the result of parsing that line?

Stephan


On Fri, Dec 5, 2014 at 4:16 PM, Malte Schwarzer <ms@mieo.de> wrote:
> Hi,
> 
> I'm trying to import a CSV file, but the parser seems to have problems with
> quotes at the beginning of a field. Is there a way to set or disable
> enclosures for the CSV input?
> 
> This is my code:
> 
> DataSet<Tuple2<String, String>> res = env.readCsvFile(inputCsvFilename)
>                 .fieldDelimiter('|')
>                 .types(String.class, String.class);
> 
> CSV:
> 
> A|ggg
> B|"hhh" xx
> C|xxx
> 
> As a result, I'm receiving a ParseException for line B:
> 
> org.apache.flink.api.common.io.ParseException: Line could not be parsed:
> 'B|"hhh" xx'
> 
> 
> Thanks,
> Malte



