flink-dev mailing list archives

From Gábor Gévay <gga...@gmail.com>
Subject Re: Iteration Intermediate Output
Date Mon, 30 May 2016 12:31:03 GMT
Hello,

> Would the best way be to extend the iteration operators to support
> intermediate outputs or revisit the idea of caching intermediate results
> and thus allow efficient for-loop iterations?

Caching intermediate results would also help projects that target
Flink as a backend, such as Emma [1] and SystemML [2]. The issue is
that these languages allow more general iterations than Flink's
iteration constructs can express: general control flow (nested loops,
ifs in the loop body), multiple "solution sets", other uses of the
intermediate results, and so on. So these systems currently don't have
a much better option than writing intermediate results to files, which
is not so nice.
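
To make the file-based workaround concrete, here is a minimal sketch in
plain Java. It does not use any Flink API; it only mimics the pattern of
a driver-side loop that writes each intermediate result to a file and
reads it back in the next step. The class name FileBackedLoop, the step
logic, and the file names are all hypothetical and chosen purely for
illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FileBackedLoop {

    // Hypothetical loop body: halve even values, map odd values to 3x + 1.
    static List<Long> step(List<Long> in) {
        List<Long> out = new ArrayList<>();
        for (long v : in) {
            out.add(v % 2 == 0 ? v / 2 : 3 * v + 1);
        }
        return out;
    }

    // Materialize one intermediate result as a text file, one value per line.
    static void write(Path p, List<Long> data) {
        List<String> lines = new ArrayList<>();
        for (long v : data) {
            lines.add(Long.toString(v));
        }
        try {
            Files.write(p, lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read a materialized intermediate result back into memory.
    static List<Long> read(Path p) {
        try {
            List<Long> out = new ArrayList<>();
            for (String line : Files.readAllLines(p)) {
                out.add(Long.parseLong(line));
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Driver-side for-loop: every step goes through a file, and the loop
    // body contains an ordinary "if" (early exit once all values are 1).
    // Returns the paths of all intermediate results, oldest first.
    static List<Path> run(List<Long> initial, int maxSteps) {
        Path dir;
        try {
            dir = Files.createTempDirectory("iter-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<Path> outputs = new ArrayList<>();
        Path current = dir.resolve("step-0.txt");
        write(current, initial);
        outputs.add(current);
        for (int i = 1; i <= maxSteps; i++) {
            List<Long> data = read(current);
            boolean converged = true;
            for (long v : data) {
                if (v != 1) {
                    converged = false;
                }
            }
            if (converged) {
                break;
            }
            current = dir.resolve("step-" + i + ".txt");
            write(current, step(data));
            outputs.add(current);
        }
        return outputs;
    }

    public static void main(String[] args) {
        for (Path p : run(Arrays.asList(6L, 7L), 5)) {
            System.out.println(p.getFileName() + " " + read(p));
        }
    }
}
```

Every intermediate result stays available for other uses, but each step
pays for a full write and a full read, which is the overhead that
caching intermediate results would avoid.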

Best,
Gabor

[1] http://www.user.tu-berlin.de/asteriosk/assets/publications/emma-sigmod2015.pdf
[2] https://systemml.apache.org/



2016-05-28 13:48 GMT+02:00 Vasiliki Kalavri <vasilikikalavri@gmail.com>:
> Hey,
>
> it would be great to add this feature indeed! Thanks for bringing it up
> Greg :)
> Would the best way be to extend the iteration operators to support
> intermediate outputs or revisit the idea of caching intermediate results
> and thus allow efficient for-loop iterations?
>
> -Vasia.
>
> On 26 May 2016 at 22:41, Greg Hogan <code@greghogan.com> wrote:
>
>> Hi y'all,
>>
>> I think this is an oft-requested feature [0] and there are many graph
>> algorithms for which intermediate output is the desired result. I'd like to
>> take Stephan up on his offer [1] for pointers.
>>
>> I have yet to get in deep, but I see that iteration tasks are treated
>> specially as IterationIntermediateTask for synchronization between
>> supersteps. Also, when OperatorTranslation and GraphCreatingVisitor are
>> walking the program DAG an iteration must be first reached through the
>> tail.
>>
>> Greg
>>
>> [0] http://stackoverflow.com/questions/37224140/possibility-of-saving-partial-outputs-from-bulk-iteration-in-flink-dataset
>> [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Intermediate-output-during-delta-iterations-td436.html
>>
