flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: CsvInputFormat delimiter fields
Date Wed, 15 Oct 2014 13:47:35 GMT
Hi!

The reason is the way the CSV parsers currently work. They are pushed down
into the byte-stream parsing and are restricted to recognizing
single-character delimiters. It is possible to change that, but it would be
a bit of work.

Stephan

On Wed, Oct 15, 2014 at 3:36 PM, Martin Neumann <mneumann@spotify.com>
wrote:

> Hej,
>
> A lot of my inputs are CSV files, so I use the CsvInputFormat a lot. What I
> find kind of odd is that the line delimiter is a String but the field
> delimiter is a Character.
>
> *see:* new CsvInputFormat<Tuple2<String,String>>(new
> Path(pVecPath),"\n",'\t',String.class,String.class)
>
> Is there a reason for this? I'm currently working with a file that has a
> more complex field delimiter so I had to write a mapper to read from
> StringInputFormat.
>
> cheers Martin
>
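
The workaround Martin describes (read whole lines as strings, then split
them in a mapper) can be sketched in plain Java. This is only an
illustration under stated assumptions, not code from the thread: the class
name, the splitPair helper, and the "||#||" delimiter are all hypothetical.
The Flink wiring is left out; the helper stands in for the body of the map
function.

```java
import java.util.regex.Pattern;

// Hypothetical helper: splits one text line into two fields on a
// multi-character delimiter, as the body of a mapper might do.
final class MultiCharDelimiterSplit {

    static String[] splitPair(String line, String delimiter) {
        // Pattern.quote makes the delimiter literal, so characters such as
        // '|' are not interpreted as regex syntax. The limit of -1 keeps
        // trailing empty fields instead of silently dropping them.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                "Expected 2 fields but found " + fields.length + ": " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        // "||#||" is an assumed example of a "more complex" field delimiter.
        String[] fields = splitPair("user42||#||0.1 0.7 0.2", "||#||");
        System.out.println(fields[0] + " -> " + fields[1]);
    }
}
```

Splitting after the line has been materialized as a String is simpler than
teaching the byte-level parser about longer delimiters, but it pays for an
extra string per record, which is the trade-off Stephan's reply points at.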
