apex-dev mailing list archives

From "Chetan Narsude (cnarsude)" <cnars...@cisco.com>
Subject Re: [GitHub] incubator-apex-core pull request: APEX-254 & APEX-269
Date Thu, 19 Nov 2015 06:01:05 GMT


On 11/18/15, 9:21 PM, "Vlad Rozov" <v.rozov@datatorrent.com> wrote:

>Connecting fast producing operator to a slow consuming operator using
>CONTAINER_LOCAL port will be a bad application design decision anyway,
>as it will slowdown producer.

Hmmm. I do not agree with the bad design part. Some systems are inherently
designed to run as fast as the slowest operator, given the resource
constraints in place. Besides, it may not always be by design; this could
be a temporary surge. Thirdly, the work we did just a few days ago to make
sure that the buffer server stalls the publisher would not be needed if we
decide that this is categorically a bad design decision.
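To make the back-pressure point concrete, here is a minimal, hypothetical Java sketch (plain java.util.concurrent, not Apex's buffer server or reservoir classes; class and constant names are made up for illustration): a producer feeding a small bounded queue blocks when the queue fills, so it is throttled to the consumer's pace instead of exhausting memory.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only: a fast producer and a slower consumer sharing
// a small bounded queue. put() blocks while the queue is full, which is
// the "stall the publisher" behavior described above.
public class BackpressureSketch {
    static final int CAPACITY = 8;    // deliberately small, like a low queue capacity
    static final int TUPLES = 1000;

    public static long run() throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(CAPACITY);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < TUPLES; i++) {
                    queue.put(i);     // blocks (back pressure) whenever the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long consumed = 0;
        for (int i = 0; i < TUPLES; i++) {
            queue.take();             // consumer drains at its own pace
            consumed++;
        }
        producer.join();
        return consumed;              // every tuple arrives; none are dropped or buffered unboundedly
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("consumed " + run() + " tuples through a capacity-" + CAPACITY + " queue");
    }
}
```

The point of the sketch: throughput degrades gracefully to the consumer's rate, which is arguably correct behavior rather than bad design.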

>In this case it will be better to
>partition slow downstream operator. Do we currently support partitioning
>of an operator and deploying all partitions into the same container
>(CONTAINER_LOCAL)?
> What stream will be used for multiplexing?
>InlineStream only supports single not partitioned stream.

MuxStream. Even though Apex supports this, I added that functionality for
a different reason: to simplify the logic via partitioning. In reality,
most distributed system developers will give more resources to the same
operator instead of adding the extra overhead of partitioning and
unifying on the same machine.

The above just answers the questions you asked; the overarching point I
want to make is that the queue has a specific size because at some point
all the slots in the queue may be filled. In an asynchronous system such
as Apex, even a very big queue will fill up quite often, so we have to be
conservative about how big we allow it to grow.
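A quick back-of-the-envelope calculation (illustrative numbers, not Apex measurements) shows why size alone cannot fix a rate mismatch: with a sustained surplus, any finite queue fills in capacity / (producer rate - consumer rate), so a bigger queue only delays the stall.

```java
// Illustrative arithmetic only: time until a bounded queue fills under a
// sustained producer/consumer rate mismatch.
public class QueueFillTime {
    /** Seconds until a queue of the given capacity fills, or infinity if it never does. */
    static double secondsToFill(long capacity, double producerPerSec, double consumerPerSec) {
        double surplus = producerPerSec - consumerPerSec;
        return surplus <= 0 ? Double.POSITIVE_INFINITY : capacity / surplus;
    }

    public static void main(String[] args) {
        // With a 1M tuples/s surplus, even a 1<<19 (524,288) slot queue
        // buys only about half a second before back pressure kicks in.
        System.out.printf("1024 slots:  %.6f s%n", secondsToFill(1024, 2_000_000, 1_000_000));
        System.out.printf("1<<19 slots: %.6f s%n", secondsToFill(1 << 19, 2_000_000, 1_000_000));
    }
}
```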

At the same time, the benchmark I devised represents near-ideal conditions
for the operators: both take almost zero time to create and process the
events. In almost all useful applications the operators will take non-zero
time, and the SPSC optimization will yield an insignificant, if not zero,
improvement in most applications' performance. On the contrary, the
additional memory required to get that performance will cause instability
through OOM errors, as the slowest operator in the application puts back
pressure on faster upstream operators by filling up the queues.
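The "near-ideal conditions" argument is essentially Amdahl's law applied to per-tuple cost. A rough estimate with made-up (not measured) nanosecond figures shows why a faster queue helps a no-op benchmark enormously but a realistic operator barely at all:

```java
// Illustrative Amdahl-style estimate, with assumed (not measured) costs:
// each tuple pays `workNs` of real operator processing plus `handoffNs`
// of queue overhead; speeding up only the handoff helps in proportion to
// the handoff's share of the total.
public class HandoffShare {
    /** Throughput multiplier from replacing slowHandoffNs with fastHandoffNs per tuple. */
    static double speedup(double workNs, double slowHandoffNs, double fastHandoffNs) {
        return (workNs + slowHandoffNs) / (workNs + fastHandoffNs);
    }

    public static void main(String[] args) {
        // Near-zero-work operators (the benchmark regime): handoff dominates,
        // so swapping the queue looks like a huge win.
        System.out.printf("no-op operator: %.2fx%n", speedup(10, 600, 40));
        // An operator doing ~1 microsecond of real work per tuple: the same
        // queue swap barely moves end-to-end throughput.
        System.out.printf("1us operator:   %.2fx%n", speedup(1_000, 600, 40));
    }
}
```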

TL;DR: increasing the queue size to increase speed is not an option, as it
comes at the cost of a disproportionately large amount of RAM and the
problems associated with needing that much RAM.
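To put a rough number on the RAM cost (assumed per-tuple sizes, not Apex measurements): a full queue pins capacity times (reference plus payload) bytes per stream, so going from 1024 to 1<<19 slots multiplies the per-stream worst case by 512.

```java
// Illustrative memory estimate with assumed sizes: 8 bytes per reference
// slot and an assumed 256-byte tuple payload pinned per slot.
public class QueueMemoryEstimate {
    /** Approximate bytes pinned by a completely full queue. */
    static long bytesWhenFull(long capacity, long refBytes, long payloadBytes) {
        return capacity * (refBytes + payloadBytes);
    }

    public static void main(String[] args) {
        long small = bytesWhenFull(1024, 8, 256);    // low default-style capacity
        long big = bytesWhenFull(1 << 19, 8, 256);   // proposed 1<<19 capacity
        System.out.printf("1024 slots full:  %.1f MB per stream%n", small / 1e6);
        System.out.printf("1<<19 slots full: %.1f MB per stream%n", big / 1e6);
        // An application with many streams multiplies this, which is where
        // the OOM pressure under back pressure comes from.
    }
}
```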

>
>Thank you,
>
>Vlad
>
>On 11/18/15 16:01, Chetan Narsude (cnarsude) wrote:
>> On 11/18/15, 2:43 PM, "Vlad Rozov" <v.rozov@datatorrent.com> wrote:
>>
>>> Based on the current performance testing I plan to change default value
>>> of PortContext.QUEUE_CAPACITY from 1024 to 1<<19 (still need to do my
>>> homework and see where PortContext.QUEUE_CAPACITY is used).
>> There are 2 different aspects that need to be considered here. One is the
>> speed of the messaging bus (the one you are focusing on) and the second is
>> the speed of the operator. What the operator cannot process fast enough
>> sits in the queue and stresses the RAM. That's the reason the
>> queue_capacity default is kept low. I also have a suspicion that this may
>> cause regression failures and hence is not binary compatible.
>>
>> --
>> Chetan
>>
>>
>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>> On 11/18/15 11:23, Chetan Narsude (cnarsude) wrote:
>>>> What are we doing for size() being inaccurate with spsc?
>>>>
>>>> On 11/18/15, 8:55 AM, "vrozov" <git@git.apache.org> wrote:
>>>>
>>>>> Github user vrozov commented on the pull request:
>>>>>
>>>>>
>>>>> https://github.com/apache/incubator-apex-core/pull/173#issuecomment-157777188
>>>>>
>>>>>      There are 3 applications running on the dev cluster:
>>>>>      
>>>>>      SpscArrayQueueReservoir - around 25 million tuples/s
>>>>>      CircularBufferReservoir - around 10 million tuples/s
>>>>>      ArrayBlockingQueueReservoir - around 1.5 million tuples/s
>>>>>
>>>>>
>>>>>
>>>>> ---
>>>>> If your project is set up for it, you can reply to this email and
>>>>>have
>>>>> your
>>>>> reply appear on GitHub as well. If your project does not have this
>>>>> feature
>>>>> enabled and wishes so, or if the feature is enabled but not working,
>>>>> please
>>>>> contact infrastructure at infrastructure@apache.org or file a JIRA
>>>>> ticket
>>>>> with INFRA.
>>>>> ---
>
