apex-dev mailing list archives

From "Pramod Immaneni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (APEXCORE-570) Prevent upstream operators from getting too far ahead when downstream operators are slow
Date Sun, 15 Jan 2017 22:09:26 GMT

    [ https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823292#comment-15823292 ]

Pramod Immaneni commented on APEXCORE-570:

I will go into the topic-specific comments first and then address your concern about my approach
to the process.

"Now back to the topic: I think this is a good approach. It will avoid the fast producing
operator running ahead at the speed of writing to disk indefinitely. With this we will not
need to limit the spooling at all?"

Correct, a spooling limit is not needed. A spooling size limit will not work as a back-pressure
mechanism because we do not know how much data will be generated between two commits (committed
windows). Also, since the checkpoint length is configurable, we cannot simply set the limit to
some "reasonable" high value. Hence, a spooling size limit would not be practical.

"What about the case where you have two subscribers (and those could be different operators)
where one can keep up with the rate at which data is published and the other one may be slow,
albeit maybe temporarily? This will slow down the fast subscriber and introduce latency."

Yes, you are correct: the slowest subscriber will slow down the publisher (unless it is a parallel
partition all the way through). But this is expected with back pressure, isn't it?
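
To illustrate the point about the slowest subscriber, here is a minimal sketch of a publisher
that is allowed to advance only while its lead over the slowest subscriber stays under a limit.
It is not the Apex buffer server code: the class name, the method names, the `maxLag` parameter
and the idea of a single numeric "position" (which could stand for a window or a block count)
are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of lag-based back pressure; not the actual Apex implementation. */
public class BackPressureSketch {
    private final long maxLag;           // assumed limit on how far the publisher may run ahead
    private long publisherPosition = 0;  // assumed counter, e.g. windows or blocks published
    private final Map<String, Long> subscriberPositions = new HashMap<>();

    public BackPressureSketch(long maxLag) {
        this.maxLag = maxLag;
    }

    public void addSubscriber(String id) {
        subscriberPositions.put(id, 0L);
    }

    /** Records how far a subscriber has read. */
    public void advanceSubscriber(String id, long position) {
        subscriberPositions.put(id, position);
    }

    /** The slowest subscriber determines the lag; with no subscribers the lag is zero. */
    private long slowestSubscriberPosition() {
        return subscriberPositions.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(publisherPosition);
    }

    public boolean canPublish() {
        return publisherPosition - slowestSubscriberPosition() < maxLag;
    }

    /** Advances the publisher by one position if allowed; returns false when it must wait. */
    public boolean tryPublish() {
        if (!canPublish()) {
            return false;
        }
        publisherPosition++;
        return true;
    }

    public static void main(String[] args) {
        BackPressureSketch sketch = new BackPressureSketch(2);
        sketch.addSubscriber("fast");
        sketch.addSubscriber("slow");
        sketch.tryPublish();
        sketch.tryPublish();
        // The fast subscriber has caught up, but the slow one still holds the publisher back.
        sketch.advanceSubscriber("fast", 2);
        System.out.println("can publish: " + sketch.canPublish());
    }
}
```

In this sketch a fast subscriber that has fully caught up does not unblock the publisher; only
progress by the slowest subscriber does, which is the latency trade-off raised in the question.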

"Let’s first address the process issue (it may warrant a separate discussion and additions
to the contributor guidelines also). If you think there was a conclusion then this may indicate
that there was offline discussion that isn’t captured here or anywhere else. Just by looking
at this ticket it is everything but clear what lead to your PR. This is not how the community
can work, discussion has to be in the open."

I think you have misunderstood my approach. When I created the JIRA that started the discussion,
I was facing a problem with a production application and had proposed using window difference
as the way to create the back pressure and block the publisher. Limiting spooling was suggested
as an approach by both you and David, and in my comment on 02nd Nov 16th at 22:53 I mentioned
that it won't work because it will cause a deadlock. David had another comment on this approach
about suspending the publisher on the spool limit till committed, which is effectively the same
deadlock problem, as the commit will not happen till the publisher moves forward. There were no
other approaches suggested, so I proceeded with attempting to solve the problem via the proposed
window difference.

As I got into the weeds of the implementation and figured out the details of how the current
implementation works, I found that using the block difference instead of a window difference
was a better way to accomplish this. To me, the fundamental approach I originally suggested,
blocking the publisher till the subscriber caught up, hadn't changed; only an implementation
detail had. Also, the majority of the time during the implementation was spent on how to
accomplish the task under the original assumption of window difference and on coming to the
conclusion to use blocks instead of windows; the actual coding took a day or two, so what you
see in the PR is a relatively new discovery.

In your comments, you have made a couple of statements. The first is that there may have been
offline discussions on the implementation. This has not happened, I assure you. You are all
seeing the implementation at the same time, including the reviewers. The second statement is
stronger: that this is detrimental to the community and that discussions have to be open. I take
personal offense to this statement. I know you want the best for the community, but I suggest
you ascertain the truth before making such strong statements.

> Prevent upstream operators from getting too far ahead when downstream operators are slow
> ----------------------------------------------------------------------------------------
>                 Key: APEXCORE-570
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-570
>             Project: Apache Apex Core
>          Issue Type: Improvement
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
> If the downstream operators are slower than upstream operators then the upstream operators
> will get ahead and the gap can continue to increase. Provide an option to slow down or temporarily
> pause the upstream operators when they get too far ahead.

