Subject: Re: Read CSV file and and create customized field
From: Chesnay Schepler
To: Soheil Pourbafrani, user@flink.apache.org
Date: Thu, 16 Jan 2020 17:17:42 +0100

You should add an extra map function.
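
A minimal sketch of what that map step might do, assuming the first column holds three values separated by "-". The original post does not name the separator, and the class and method names here are made up for illustration. The per-record logic is plain Java, so it runs without Flink on the classpath:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper: the per-record logic an extra map step could apply.
final class FirstColumnSplitter {

    // Assumed separator inside the first CSV column (not stated in the original post).
    static final String INNER_DELIMITER = "-";

    /**
     * Splits the first column into three parts and keeps the other two
     * columns unchanged, giving five fields in total.
     */
    static String[] expand(String first, String second, String third) {
        // A limit of 3 yields at most three parts; any further delimiters stay in the last part.
        String[] parts = first.split(Pattern.quote(INNER_DELIMITER), 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 3 parts in first column, got: " + first);
        }
        return new String[] {parts[0], parts[1], parts[2], second, third};
    }

    public static void main(String[] args) {
        // One record as it would arrive from the three-String CSV read.
        System.out.println(Arrays.toString(expand("2020-01-16", "b", "c")));
        // prints "[2020, 01, 16, b, c]"
    }
}
```

In a Flink job, a helper like this would be called from a MapFunction applied to the DataSet returned by the CSV read, emitting for example a Tuple5 of Strings instead of an array. If the map is written as a lambda, Flink may need an explicit type hint via returns(...), because the tuple's generic types are erased.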

On 16/01/2020 17:10, Soheil Pourbafrani wrote:

Hi friends,

I'm going to read a CSV file that has 3 columns. I want the final loaded data type to have additional columns derived from those 3 columns. For example, I would split the first column of the CSV file to create 3 new columns.

The problem is that I have not found a straightforward approach for this. Here is what I have so far:
env.readCsvFile("pathToCsv")
        .fieldDelimiter(",")
        .ignoreFirstLine()
        .ignoreInvalidLines()
        .types(String.class, String.class, String.class)
        .print();
So, is there any way to tell readCsvFile how to split CSV records, or should I add an extra map function after loading the CSV to create my desired schema?

Thanks
