flink-issues mailing list archives

From "Robert Metzger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1170) Localization of InputSplits is not working properly
Date Fri, 17 Oct 2014 17:38:34 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175285#comment-14175285 ]

Robert Metzger commented on FLINK-1170:
---------------------------------------

I found the issue while running a very simple "distributed grep" job that just reads
a lot of data and filters it for a certain string.
I had 1 TB of input data on a 24-node cluster.
With the issue, the runtime was very bad (~1 hour); after the fix, I got it down to less
than 4 minutes.
Flink and HDFS seem to use different hostname representations. While HDFS was just using "worker1",
Flink was using the full hostname ("worker1.hdcluster.company.com"). Since these names did not
match, the input splits were assigned randomly rather than to the nodes holding the data.

After the fix, the data was read locally most of the time (without costly network I/O).
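
To illustrate the mismatch described above, here is a minimal sketch of one way to compare the two hostname forms. It is not the actual patch for FLINK-1170; the class and method names (HostnameNormalizer, normalize, isLocal) are hypothetical, and it assumes that reducing a fully qualified name to its lowercased first label is acceptable on the cluster in question.

```java
import java.util.Locale;

// Hypothetical helper, not part of Flink: makes "worker1" and
// "worker1.hdcluster.company.com" compare as the same host.
final class HostnameNormalizer {

    // Reduce a hostname to its lowercased first label; leave IP addresses untouched.
    static String normalize(String host) {
        if (host == null) {
            return null;
        }
        String h = host.trim().toLowerCase(Locale.ROOT);
        boolean isIpv4 = h.matches("\\d{1,3}(\\.\\d{1,3}){3}");
        boolean isIpv6 = h.indexOf(':') >= 0;
        if (h.isEmpty() || isIpv4 || isIpv6) {
            return h;
        }
        int dot = h.indexOf('.');
        return dot > 0 ? h.substring(0, dot) : h;
    }

    // True if the task's host matches any host reported for the split,
    // after both sides have been normalized.
    static boolean isLocal(String taskHost, String... splitHosts) {
        String local = normalize(taskHost);
        if (local == null) {
            return false;
        }
        for (String candidate : splitHosts) {
            if (local.equals(normalize(candidate))) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, isLocal("worker1.hdcluster.company.com", "worker1", "worker7") is true, whereas a plain string comparison of the two forms would never match. Note that dropping the domain would conflate hosts that share a short name across different domains, so this only fits clusters where short names are unique.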

> Localization of InputSplits is not working properly
> ---------------------------------------------------
>
>                 Key: FLINK-1170
>                 URL: https://issues.apache.org/jira/browse/FLINK-1170
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>
> While running some benchmarks, I found that Flink is not properly assigning the InputSplits.
> On my testing cluster, ALL splits were assigned to remote HDFS DataNodes, which causes
> a lot of network I/O.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
