flink-user mailing list archives

From Chiwan Park <chiwanp...@apache.org>
Subject Re: Discarding header from CSV file
Date Wed, 27 Apr 2016 02:22:37 GMT
Hi Nirmalya,

I recommend using the readCsvFile() method rather than readTextFile() to read a CSV file.
readCsvFile() provides CSV-specific options such as ignoreFirstLine() (which is what you are
looking for) and ignoreComments().

If you have to use the readTextFile() method, I think you can drop the column headers by
calling the zipWithIndex method and filtering out the header record by its index.
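
To make the index-and-filter idea concrete, here is a minimal plain-Java sketch with no Flink dependency. The HeaderFilter class, its Indexed pair type, and its methods are hypothetical names used only for illustration; in a Flink job the numbering step would be the zipWithIndex utility mentioned above. The sketch assumes the header line is the record that receives index 0, which is worth verifying when the file is read in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: numbers each line and then
// drops the record whose index is 0 (assumed to be the header).
final class HeaderFilter {

    // Simple (index, line) pair, standing in for the tuple that an
    // index-assigning step would emit.
    static final class Indexed {
        final long index;
        final String line;

        Indexed(long index, String line) {
            this.index = index;
            this.line = line;
        }
    }

    // Step 1: attach a running index to every line, in input order.
    static List<Indexed> zipWithIndex(List<String> lines) {
        List<Indexed> out = new ArrayList<>();
        long i = 0;
        for (String line : lines) {
            out.add(new Indexed(i++, line));
        }
        return out;
    }

    // Step 2: keep only records with index > 0, then strip the index again.
    static List<String> dropHeader(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (Indexed rec : zipWithIndex(lines)) {
            if (rec.index > 0) {
                out.add(rec.line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("id,name", "1,alice", "2,bob");
        System.out.println(dropHeader(lines));
    }
}
```

The filter looks only at the index and never at the line content, so a data row that happens to look like the header is not dropped by mistake.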

Regards,
Chiwan Park

> On Apr 27, 2016, at 10:32 AM, nsengupta <sengupta.nirmalya@gmail.com> wrote:
> 
> What is the recommended way of discarding the Column Header(s) from a CSV
> file, if I am using
> 
>     environment.readTextFile(....)
> 
> facility? Obviously, we don't know beforehand which of the nodes will read
> the header(s), so we cannot use the usual tricks like drop(1).
> 
> I don't recall well: has this been discussed and closed earlier in this
> forum? If so, can someone point that out to me please?
> 
> -- Nirmalya
> 
> 
> 
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Discarding-header-from-CSV-file-tp6474.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.