flink-user mailing list archives

From Chesnay Schepler <ches...@apache.org>
Subject Re: Parallel read text
Date Sat, 28 May 2016 09:52:42 GMT
ExecutionEnvironment.readTextFile will read the file in parallel.
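
For intuition on how one text file can be read by several parallel tasks instead of one thread going from beginning to end, here is a minimal, conceptual sketch in plain Java. It is not Flink's code and it does not use any Flink API. It assumes the common split-based scheme: the file is cut into byte ranges, and each worker reads exactly the lines that start inside its range. The class name SplitReadSketch and the methods readSplit and readInParallel are hypothetical names chosen for illustration, and the sketch assumes '\n' line endings and UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitReadSketch {

    // Returns every line that STARTS inside the byte range [start, end).
    // A line that begins before 'start' belongs to the previous split;
    // a line that begins before 'end' is read to completion, even past 'end'.
    static List<String> readSplit(Path file, long start, long end) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            if (start > 0) {
                // Step back one byte and skip through the next '\n', so the
                // file pointer lands on the first line start at or after 'start'.
                in.seek(start - 1);
                int b;
                while ((b = in.read()) != -1 && b != '\n') {
                    // discard bytes of the line owned by the previous split
                }
            }
            while (in.getFilePointer() < end) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                int b;
                while ((b = in.read()) != -1 && b != '\n') {
                    buf.write(b);
                }
                if (b == -1 && buf.size() == 0) {
                    break; // defensive: reached end of file with nothing left
                }
                lines.add(new String(buf.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return lines;
    }

    // Cuts the file into roughly equal byte ranges, reads them concurrently,
    // and concatenates the per-split results in split order.
    static List<String> readInParallel(Path file, int parallelism) throws Exception {
        long size = Files.size(file);
        long splitSize = Math.max(1, (size + parallelism - 1) / parallelism);
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (long start = 0; start < size; start += splitSize) {
                final long s = start;
                final long e = Math.min(size, start + splitSize);
                futures.add(pool.submit(() -> readSplit(file, s, e)));
            }
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get());
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("split-read", ".txt");
        Files.write(tmp, "alpha\nbeta\n\ngamma\ndelta".getBytes(StandardCharsets.UTF_8));
        for (String line : readInParallel(tmp, 3)) {
            System.out.println(line);
        }
        Files.delete(tmp);
    }
}
```

The rule "a split owns the lines that start in it" is what lets the workers run independently without dropping or duplicating a line at a split boundary. The sketch reads one byte at a time for clarity, so it is far slower than a buffered reader would be.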

On 28.05.2016 09:59, David Olsen wrote:
> After searching the internet (with keywords like 'apache flink 
> parallel read text') I still have not found the answer I am looking 
> for, so I am asking here before jumping into writing code ...
> My problem is that I want to read a text file, or a set of split text 
> files, from the local file system. I want to read those files in 
> parallel and process them accordingly.
> From what I have discovered so far:
> - Use ExecutionEnvironment.readTextFile, but this seems to use only 1 
> thread(?) (meaning it reads the file(s) from beginning to end)
> - Use the streaming env with addSource[1], but it seems I would need to 
> implement my own source with RichParallelSourceFunction.
> Are there any classes or implementations that can already read text 
> in parallel?
> Thanks
> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Reading-separate-files-in-parallel-tasks-as-input-td1623.html
