flink-user mailing list archives

From Pritam Sadhukhan <sadhukhan.pri...@gmail.com>
Subject Data processing with HDFS local or remote
Date Fri, 18 Oct 2019 02:59:40 GMT

I am trying to process data stored on HDFS using Flink batch jobs.
Our data is split across 16 data nodes.

I am curious to know how the data will be pulled from the data nodes when
the job parallelism is set to 16, the same as the number of HDFS data nodes.

Are the Flink tasks executed locally on the data node servers, or do they
run on the Flink nodes, with the data pulled remotely?

Any help will be appreciated.

