apex-dev mailing list archives

From Timothy Farkas <...@datatorrent.com>
Subject Re: Question about tweaking memory controls
Date Sun, 27 Sep 2015 05:52:23 GMT
Hi Vlad,

I just took a look at the CircularBuffer. Why are threads polling the state
of the buffer before doing operations? Couldn't polling be avoided entirely
by using something like Condition variables to signal when the buffer is
ready for an operation to be performed?
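
To make the suggestion above concrete, here is a minimal sketch of a bounded
buffer that blocks on Condition variables instead of sleeping and re-checking.
It is not the netlet CircularBuffer and does not mirror its API; the class
name ConditionBuffer and its put/take/size methods are hypothetical and exist
only for illustration. It assumes a plain lock-based design is acceptable,
which may not hold if the existing buffer is deliberately lock-free.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the netlet CircularBuffer.
public class ConditionBuffer<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    // Producers wait on notFull, consumers wait on notEmpty.
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();

    public ConditionBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Blocks while the buffer is full; woken by a consumer, not by a timer.
    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();
            }
            items.addLast(item);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    // Blocks while the buffer is empty; woken by a producer.
    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.removeFirst();
            notFull.signal();
            return item;
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        lock.lock();
        try {
            return items.size();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ConditionBuffer<Integer> buffer = new ConditionBuffer<>(4);
        final int count = 1000;
        final long[] sum = new long[1];
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    sum[0] += buffer.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (int i = 1; i <= count; i++) {
            buffer.put(i);
        }
        consumer.join();
        System.out.println("sum = " + sum[0]);
    }
}
```

The JDK's ArrayBlockingQueue is built the same way (one ReentrantLock with two
Conditions), so it could serve as a baseline for comparison. The trade-off is
that every put and take now acquires a lock, so whether this beats a short
sleep would have to be measured in the benchmark.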

Tim

On Sat, Sep 26, 2015 at 10:42 PM, Vlad Rozov <v.rozov@datatorrent.com>
wrote:

> After looking at a few stack traces, I think that in the benchmark
> application the operators compete for the circular buffer that passes slices
> from the emitter output to the consumer input, and that the sleeps used to
> avoid busy waiting are too long for the benchmark operators. I don't see
> stacks like the ones below in every thread dump I take, but I see them often
> enough to suspect that the sleep is the root cause. I'll recompile with a
> smaller sleep time and see how this affects performance.
>
> ----
> "1/wordGenerator:RandomWordInputModule" prio=10 tid=0x00007f78c8b8c000 nid=0x780f waiting on condition [0x00007f78abb17000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>     at java.lang.Thread.sleep(Native Method)
>     at com.datatorrent.netlet.util.CircularBuffer.put(CircularBuffer.java:182)
>     at com.datatorrent.stram.stream.InlineStream.put(InlineStream.java:79)
>     at com.datatorrent.stram.stream.MuxStream.put(MuxStream.java:117)
>     at com.datatorrent.api.DefaultOutputPort.emit(DefaultOutputPort.java:48)
>     at com.datatorrent.benchmark.RandomWordInputModule.emitTuples(RandomWordInputModule.java:108)
>     at com.datatorrent.stram.engine.InputNode.run(InputNode.java:115)
>     at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1377)
>
> "2/counter:WordCountOperator" prio=10 tid=0x00007f78c8c98800 nid=0x780d waiting on condition [0x00007f78abc18000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>     at java.lang.Thread.sleep(Native Method)
>     at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:519)
>     at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1377)
>
> ----
>
>
> On 9/26/15 20:59, Amol Kekre wrote:
>
>> A good read -
>> http://preshing.com/20111118/locks-arent-slow-lock-contention-is/
>>
>> Though it does not explain order of magnitude difference.
>>
>> Amol
>>
>>
>> On Sat, Sep 26, 2015 at 4:25 PM, Vlad Rozov <v.rozov@datatorrent.com>
>> wrote:
>>
>>> In the benchmark test THREAD_LOCAL outperforms CONTAINER_LOCAL by an order
>>> of magnitude and both operators compete for CPU. I'll take a closer look
>>> why.
>>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>>
>>> On 9/26/15 14:52, Thomas Weise wrote:
>>>
>>>> THREAD_LOCAL - operators share thread
>>>> CONTAINER_LOCAL - each operator has its own thread
>>>>
>>>> So as long as operators utilize the CPU sufficiently (compete), the latter
>>>> will perform better.
>>>>
>>>> There will be cases where a single thread can accommodate multiple
>>>> operators. For example, a socket reader (mostly waiting for IO) and a
>>>> decompress (CPU hungry) can share a thread.
>>>>
>>>> But to get back to the original question, stream locality does generally
>>>> not reduce the total memory requirement. If you add multiple operators into
>>>> one container, that container will also require more memory and that's how
>>>> the container size is calculated in the physical plan. You may get some
>>>> extra mileage when multiple operators share the same heap but the need to
>>>> identify the memory requirement per operator does not go away.
>>>>
>>>> Thomas
>>>>
>>>>
>>>> On Sat, Sep 26, 2015 at 12:41 PM, Munagala Ramanath <
>>>> ram@datatorrent.com>
>>>> wrote:
>>>>
>>>>> Would CONTAINER_LOCAL achieve the same thing and perform a little better
>>>>> on a multi-core box?
>>>>>
>>>>> Ram
>>>>>
>>>>> On Sat, Sep 26, 2015 at 12:18 PM, Sandeep Deshmukh <
>>>>> sandeep@datatorrent.com>
>>>>> wrote:
>>>>>
>>>>>> Yes, with this approach only two containers are required: one for stram
>>>>>> and another for all operators. You can easily fit around 10 operators in
>>>>>> less than 1GB.
>>>>>> On 27 Sep 2015 00:32, "Timothy Farkas" <tim@datatorrent.com> wrote:
>>>>>>
>>>>>>> Hi Ram,
>>>>>>>
>>>>>>> You could make all the operators thread local. This cuts down on the
>>>>>>> overhead of separate containers and maximizes the memory available to
>>>>>>> each operator.
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>> On Sat, Sep 26, 2015 at 10:07 AM, Munagala Ramanath <ram@datatorrent.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I was running into memory issues when deploying my app on the sandbox
>>>>>>>> where all the operators were stuck forever in the PENDING state because
>>>>>>>> they were being continually aborted and restarted because of the
>>>>>>>> limited memory on the sandbox. After some experimentation, I found that
>>>>>>>> the following config values seem to work:
>>>>>>>> ------------------------------------------
>>>>>>>> <https://datatorrent.slack.com/archives/engineering/p1443263607000010>
>>>>>>>>
>>>>>>>> <property>
>>>>>>>>   <name>dt.attr.MASTER_MEMORY_MB</name>
>>>>>>>>   <value>500</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>>   <name>dt.application..operator.*.attr.MEMORY_MB</name>
>>>>>>>>   <value>200</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>>   <name>dt.application.TopNWordsWithQueries.operator.fileWordCount.attr.MEMORY_MB</name>
>>>>>>>>   <value>512</value>
>>>>>>>> </property>
>>>>>>>> ------------------------------------------------
>>>>>>>> Are these reasonable values? Is there a more systematic way of coming
>>>>>>>> up with these values than trial-and-error? Most of my operators -- with
>>>>>>>> the exception of fileWordCount -- need very little memory; is there a
>>>>>>>> way to cut all values down to the bare minimum and maximize available
>>>>>>>> memory for this one operator?
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Ram
>>>>>>>>
>>>>>>>>
>>>>>>>>
>
