flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Non blocking operation in Apache flink
Date Wed, 25 May 2016 15:02:33 GMT
I see what you mean now. The Akka Streams API is very interesting, in how
they allow async calls.

For Flink, I think you could implement it as a custom source that listens
for the change stream, starts futures to get data from the database and
emits elements when the future completes. I quickly sketched such an
approach:


public static class MyDBSource implements ParallelSourceFunction<String> {
    private static final long serialVersionUID = 1L;

    private volatile boolean running = true;

    @Override
    public void run(final SourceContext<String> ctx) throws Exception {
        ChangelogConnection log = new ChangelogConnection();
        DB db = new DB();

        final Object checkpointLock = ctx.getCheckpointLock();

        while (running) {
            // try and fetch next changelog item
            Change change = log.getNextChange();

            DB.fetch(change, new Future() {
                public void complete(String data) {
                    synchronized (checkpointLock) {
                        ctx.collect(data);
                    }
                }
            });
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

I hope that helps.

-Aljoscha

On Wed, 25 May 2016 at 12:21 Maatary Okouya <maatariokouya@gmail.com> wrote:

> Maybe the following can illustrate better what i mean
> http://doc.akka.io/docs/akka/2.4.6/scala/stream/stream-integrations.html#Integrating_with_External_Services
>
> On Wed, May 25, 2016 at 5:16 AM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
>> Hi,
>> there is no functionality to have asynchronous calls in user functions in
>> Flink.
>>
>> The asynchronous action feature in Spark is also not meant for such
>> things, it is targeted at programs that need to pull all data to the
>> application master. In Flink this is not necessary because you can specify
>> a whole plan of operations before executing them.
>>
>> Cheers,
>> Aljoscha
>>
>> On Tue, 24 May 2016 at 20:43 Maatary Okouya <maatariokouya@gmail.com>
>> wrote:
>>
>>> I'm looking for a way to avoid thread starvation in my tasks, by
>>> returning future but i don't see how is that possible.
>>>
>>> Hence i would like to know, how flink handle the case where in your job
>>> you have to perform network calls (I use akka http or spray) or any IO
>>> operation and use the result of it.
>>>
>>> In sparks i see asynchronous action and so on. I don't see any
>>> equivalent in apache flink. How does it works ? is it supported, or the
>>> network call and any io operation have to be synchronous ?
>>>
>>> any help, indication, reads and so on would be appreciated
>>>
>>
>

Mime
View raw message