flink-dev mailing list archives

From "Chengxiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-2396) Review the datasets of dynamic path and static path in iteration.
Date Thu, 23 Jul 2015 10:21:05 GMT
Chengxiang Li created FLINK-2396:
------------------------------------

             Summary: Review the datasets of dynamic path and static path in iteration.
                 Key: FLINK-2396
                 URL: https://issues.apache.org/jira/browse/FLINK-2396
             Project: Flink
          Issue Type: Improvement
          Components: Core
            Reporter: Chengxiang Li
            Priority: Minor


Currently, Flink caches a dataset on the static path because it assumes that the dataset stays the same
during the iteration, but this assumption does not always hold. Take sampling as an example:
the iteration dataset is something like the weight vector of a model, and there is a separate training
dataset from which a small sample is taken to update the weight vector in each iteration (e.g.
Stochastic Gradient Descent). We expect the sampled dataset to be different in each iteration, but
Flink caches the sampled dataset because it is on the static path.
We should review how Flink identifies the dynamic path and the static path, and support adding the sampled
dataset in the example above to the dynamic path.
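
The difference between the two behaviours can be illustrated with a minimal sketch in plain Python. This is not Flink code and does not use any Flink API; the function name run_iterations and the cache_sample flag are hypothetical, introduced only to mimic a sample that is computed once and reused (static path) versus one that is re-drawn in every iteration (dynamic path).

```python
import random


def run_iterations(training_data, num_iterations, sample_size, cache_sample, seed=0):
    """Return the sample seen in each iteration.

    cache_sample=True mimics a dataset on the static path: sampled once, then reused.
    cache_sample=False mimics a dataset on the dynamic path: re-sampled every iteration.
    (Hypothetical helper for illustration; not part of Flink.)
    """
    rng = random.Random(seed)
    cached = None
    seen = []
    for _ in range(num_iterations):
        if cache_sample:
            # Static-path behaviour: compute the sample only in the first iteration.
            if cached is None:
                cached = rng.sample(training_data, sample_size)
            sample = cached
        else:
            # Dynamic-path behaviour: draw a fresh sample in each iteration.
            sample = rng.sample(training_data, sample_size)
        seen.append(tuple(sample))
    return seen


training_data = list(range(1000))
static_samples = run_iterations(training_data, 5, 10, cache_sample=True)
dynamic_samples = run_iterations(training_data, 5, 10, cache_sample=False)

# Count how many distinct samples each variant saw over the iterations.
print("distinct samples (static path): ", len(set(static_samples)))
print("distinct samples (dynamic path):", len(set(dynamic_samples)))
```

With caching, every iteration sees the same sample, so an SGD-style update would repeatedly use the same few training points. Without caching, the sample varies across iterations, which is the behaviour the example above expects.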



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
