flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Dataset read csv file problem
Date Fri, 24 Nov 2017 12:35:15 GMT
Hi Ebru,

this case is not supported by Flink's CsvInputFormat. The problem is that
such a file cannot be read in parallel: a reader that starts in the middle
of the file cannot tell whether a newline ends a record or sits inside a
quoted field, so it cannot identify record boundaries.
We have a new CsvInputFormat under development that follows the RFC 4180
standard and will have a parameter to support row delimiters that are
enclosed in a quoted String field.
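
To illustrate why a parallel reader gets stuck, here is a minimal sketch in plain Java (no Flink classes). The class and method names are made up for this example, and it assumes quotes only appear as field enclosures or as doubled escapes, as in RFC 4180. Whether a newline is a record boundary depends on how many quote characters precede it, and that count is only known when reading from the start of the file:

```java
// Hypothetical helper for illustration only; not part of Flink.
final class SplitBoundaryDemo {

    /**
     * Returns true if the '\n' at newlineOffset ends a record, i.e. it is
     * not inside a quoted field. An even number of quote characters before
     * the newline means every quoted field opened so far has been closed.
     * Assumption: quotes are only used as enclosures or doubled escapes.
     */
    static boolean isRecordBoundary(String text, int newlineOffset, char quote) {
        int quotes = 0;
        // The scan has to start at offset 0; a reader dropped into the
        // middle of the file has no way to know this count.
        for (int i = 0; i < newlineOffset; i++) {
            if (text.charAt(i) == quote) {
                quotes++;
            }
        }
        return quotes % 2 == 0;
    }
}
```

For the text `1,"a\nb",x\n` the newline at offset 4 is inside the quoted field and the one at offset 9 ends the record, yet both look identical to a reader that begins at an arbitrary split offset.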

Until that is available, the only solution is to implement a custom
InputFormat.
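
The parsing core of such a custom InputFormat could look roughly like the sketch below. It is plain Java and deliberately leaves out the Flink wiring (the InputFormat would also have to be marked as unsplittable so that each file is read by a single task from its beginning). The class name QuotedCsvRecords and its parse method are hypothetical, and the handling of doubled quotes and of '\r' is an assumption modeled on RFC 4180, not Flink's behavior:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration only; not part of Flink.
final class QuotedCsvRecords {

    /**
     * Splits text into records and fields. A '\n' inside a quoted field is
     * kept as field data; a '\n' outside quotes ends the record.
     */
    static List<List<String>> parse(String text, char delim, char quote) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (inQuotes) {
                if (ch != quote) {
                    field.append(ch);              // includes '\n' and delimiters
                } else if (i + 1 < text.length() && text.charAt(i + 1) == quote) {
                    field.append(quote);           // doubled quote is a literal quote
                    i++;
                } else {
                    inQuotes = false;              // closing quote
                }
            } else if (ch == quote) {
                inQuotes = true;                   // opening quote
            } else if (ch == delim) {
                record.add(field.toString());      // end of field
                field.setLength(0);
            } else if (ch == '\n') {
                record.add(field.toString());      // end of field and record
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
            } else if (ch != '\r') {
                field.append(ch);                  // simplification: drop '\r' outside quotes
            }
        }
        // Emit a last record that is not terminated by '\n'.
        if (field.length() > 0 || !record.isEmpty()) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }
}
```

Calling `QuotedCsvRecords.parse("1,\"line one\nline two\",x\n2,plain,y\n", ',', '"')` yields two records of three fields each, and the second field of the first record still contains the embedded newline.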

Best, Fabian

2017-11-24 11:40 GMT+01:00 ebru <b20926247@cs.hacettepe.edu.tr>:

> Hello all,
>
> We are trying to read CSV files in which some fields contain the \n
> character, and \n is also the line delimiter. We used the
> parseQuotedStrings('\"')
> method, but it only ignores field delimiters inside quotes, so we couldn't
> parse the fields that contain the \n character. How can we solve this problem?
>
> -Ebru
>
