flink-dev mailing list archives

From "Robert Waury (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-1141) Selfjoin fails after DataSet exceeds certain size
Date Wed, 08 Oct 2014 11:01:33 GMT
Robert Waury created FLINK-1141:
-----------------------------------

             Summary: Selfjoin fails after DataSet exceeds certain size
                 Key: FLINK-1141
                 URL: https://issues.apache.org/jira/browse/FLINK-1141
             Project: Flink
          Issue Type: Bug
          Components: Local Runtime
    Affects Versions: 0.6.1-incubating
         Environment: LocalExecutionEnvironment (dop=4)
            Reporter: Robert Waury
            Priority: Minor


As soon as a DataSet exceeds a certain size (1,000,000 tuples in my example), a self-join
with a FlatJoinFunction no longer works. After about a second, the Join, DataSource, and
DataSink threads are all in a waiting state and perform no work (no output files are
created), and the job never finishes.

If I cut the input size in half, it works fine.

My current workaround is to create the DataSet twice and join the two identical DataSets.
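
A minimal sketch of that workaround, assuming the Flink Java DataSet API. The
class name, the createInput helper, the Tuple2<Long, Long> element type, the
generated input, and the output path are all hypothetical and are not taken from
the original job; exact method names may differ between Flink versions.

```java
import org.apache.flink.api.common.functions.FlatJoinFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class SelfJoinWorkaround {

    // Input size from the report; the data itself is made up for illustration.
    private static final long NUM_TUPLES = 1000000L;

    // Hypothetical helper: builds the input from scratch on every call, so the
    // two join inputs come from separate sources instead of one shared DataSet.
    private static DataSet<Tuple2<Long, Long>> createInput(ExecutionEnvironment env) {
        return env.generateSequence(1, NUM_TUPLES)
                .map(new MapFunction<Long, Tuple2<Long, Long>>() {
                    @Override
                    public Tuple2<Long, Long> map(Long value) {
                        return new Tuple2<Long, Long>(value, value);
                    }
                });
    }

    public static void main(String[] args) throws Exception {
        // Local environment with dop=4, as in the reported setup.
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(4);

        // Workaround: create the same DataSet twice rather than writing
        // input.join(input) on a single DataSet.
        DataSet<Tuple2<Long, Long>> left = createInput(env);
        DataSet<Tuple2<Long, Long>> right = createInput(env);

        left.join(right)
                .where(0)
                .equalTo(0)
                .with(new FlatJoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>,
                        Tuple2<Long, Long>>() {
                    @Override
                    public void join(Tuple2<Long, Long> first,
                                     Tuple2<Long, Long> second,
                                     Collector<Tuple2<Long, Long>> out) {
                        // Emit one record per matching pair.
                        out.collect(new Tuple2<Long, Long>(first.f0, second.f1));
                    }
                })
                .writeAsCsv("/tmp/selfjoin-out"); // hypothetical output path

        env.execute("Self-join workaround");
    }
}
```

Replacing left.join(right) with left.join(left) in this sketch gives the
self-join shape described above, which may help when trying to reproduce the
hang.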



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
