samza-dev mailing list archives

From Shadi Noghabi <>
Subject Re: what does the utilization metric in SamzaContainerMetrics show?
Date Fri, 26 Jun 2015 16:40:58 GMT

Thanks for your prompt reply. I just don’t see when there would be idle
time that is counted in totalMs but not added to activeMs. As far as I
can tell, if any of process, window, or commit takes more time, that
will be reflected in the activeMs value as well.
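
To make the question concrete, here is a minimal sketch of a loop in which
the ratio does drop below 1. It is not the actual Samza run loop: the manual
clock, timedStep, untimedWait, and all durations are hypothetical, and the
assumption that some step (for example, blocking until a message arrives) is
left out of activeMs is only an illustration.

```scala
object UtilizationSketch {
  // Hypothetical manual clock (milliseconds) so the example is deterministic.
  var now = 0L
  def clock(): Long = now

  var activeMs = 0L

  // A step whose duration IS added to activeMs.
  def timedStep(costMs: Long): Unit = {
    val start = clock()
    now += costMs // stand-in for real work
    activeMs += clock() - start
  }

  // A wait that advances the clock but is NOT added to activeMs (assumption).
  def untimedWait(waitMs: Long): Unit = now += waitMs

  // One loop iteration; returns activeMs / totalMs for that iteration.
  def iteration(waitMs: Long): Float = {
    val loopStartTime = clock()
    untimedWait(waitMs) // e.g. blocking until a message arrives
    timedStep(6)        // process
    timedStep(3)        // window
    timedStep(1)        // commit
    val totalMs = clock() - loopStartTime
    val utilization = activeMs.toFloat / totalMs
    activeMs = 0L
    utilization
  }

  def main(args: Array[String]): Unit = {
    println(iteration(0))  // 10 active ms out of 10 total: prints 1.0
    println(iteration(30)) // 10 active ms out of 40 total: prints 0.25
  }
}
```

If every millisecond between loopStartTime and the end of commit is also
added to activeMs, only the first case can occur, which is the behaviour
described in the original question.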

On 6/25/15, 7:47 PM, "Luis Fernando De Pombo" <> wrote:

>Hi Shadi,
>Thanks for asking. This metric tracks the utilization of the event loop
>within a Samza container. The loop runs on a single thread that is in
>charge of reading and writing messages, flushing metrics, checkpointing,
>and windowing. It is important to track the utilization (aka "duty cycle")
>of any event loop, which is the sum of all the timings (activeMs) divided
>by the window length (totalMs).
>You are right that most of the time this value will be close to 1, which
>represents complete utilization. However, when the event loop starts to
>idle, this metric will give you an idea of how much headroom you have
>before the job starts to seriously fall behind.
>I hope that answers your question!
>On Thu, Jun 25, 2015 at 6:39 PM, Shadi Noghabi <> wrote:
>> Hi,
>> I was wondering what the utilization metric in SamzaContainerMetrics
>> shows. I am asking because in the code it is calculated as below:
>> while (!shutdownNow) {
>>   val loopStartTime = clock();
>>   process
>>   window
>>   commit
>>   val totalMs = clock() - loopStartTime
>>   metrics.utilization.set(activeMs.toFloat/totalMs)
>>   activeMs = 0L
>> }
>> Here totalMs is the time from the start of the call to process until
>> commit is done, which is equal to the time it takes to run process,
>> window, and commit. activeMs is calculated by summing up the time it
>> takes to call process, window, and commit, which means these two values
>> are going to be almost the same and the utilization is always going to
>> be almost 1.
>> I was just wondering what the point of doing this is.
