reef-dev mailing list archives

From "Dhruv Mahajan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1456) Develop distributed counters framework in REEF
Date Thu, 23 Jun 2016 19:01:16 GMT

    [ https://issues.apache.org/jira/browse/REEF-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346988#comment-15346988
] 

Dhruv Mahajan commented on REEF-1456:
-------------------------------------

So, regarding 1, I am not sure why we need synchronization with other machines. The {{Update}}
function is supposed to be the local update to the counter. For example, suppose there is
a counter keeping track of how many data bytes are read. Then each task will locally maintain
an {{IDistributedCounter}} for that. It is possible that, before the increment is sent to the driver,
the task updates/increments the counter many times. The {{Update}} and {{Reset}} methods are provided
for local use only. On the task side, for an {{IDistributedCounter}} {{c}}, the sequence might
be {{c.Update(value1)}}, {{c.Update(value2)}}, then serialization happens and the value is sent
to the driver, and {{c.Reset()}} is called. All these are local calls per task.
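
A minimal sketch of that task-side sequence, assuming a simple additive "bytes read" counter. The names {{LocalSumCounter}}, {{TaskSideExample}} and {{Flush}} are hypothetical and not part of REEF, and copying the value stands in for the real serialization step.

```csharp
// Hypothetical local counter for a "bytes read" metric (illustrative only, not REEF API).
public sealed class LocalSumCounter
{
    public string Name { get; }
    public long Counter { get; private set; }

    public LocalSumCounter(string name) { Name = name; }

    // Local update: accumulate the increment. Nothing is sent anywhere.
    public void Update(long value) { Counter += value; }

    // True when there is nothing new to report to the driver.
    public bool EqualsDefault() { return Counter == 0; }

    // Clears the accumulated increment once it has been handed off.
    public void Reset() { Counter = 0; }
}

public static class TaskSideExample
{
    // Returns the increment to send (null if there is nothing to send), then resets.
    // Copying the value is a stand-in for serialization (an assumption of this sketch).
    public static long? Flush(LocalSumCounter c)
    {
        if (c.EqualsDefault())
        {
            return null;
        }
        long increment = c.Counter;
        c.Reset();
        return increment;
    }
}
```

With this sketch, calling {{Update(100)}} and {{Update(250)}} followed by {{Flush}} hands off a single increment of 350 and leaves the local counter at its default, so a second {{Flush}} has nothing to send.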

Then on the driver side, it will have the corresponding distributed counter, say {{cuniv}}.
It will keep calling {{cuniv.Update(value)}} whenever it receives an incremental value from the
tasks.
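
A rough sketch of the driver side under the same additive assumption. {{DriverSumCounter}}, {{DriverCounterRegistry}} and {{OnIncrement}} are hypothetical names chosen for illustration, and how increments actually arrive from tasks is left out.

```csharp
using System.Collections.Generic;

// Hypothetical driver-side counter: it only accumulates, it never resets.
public sealed class DriverSumCounter
{
    public string Name { get; }
    public long Counter { get; private set; }

    public DriverSumCounter(string name) { Name = name; }

    public void Update(long increment) { Counter += increment; }
}

// Hypothetical registry holding one aggregate counter ("cuniv") per counter name.
public sealed class DriverCounterRegistry
{
    private readonly Dictionary<string, DriverSumCounter> _counters =
        new Dictionary<string, DriverSumCounter>();

    // Called whenever an incremental value arrives from any task.
    public void OnIncrement(string counterName, long increment)
    {
        DriverSumCounter cuniv;
        if (!_counters.TryGetValue(counterName, out cuniv))
        {
            cuniv = new DriverSumCounter(counterName);
            _counters[counterName] = cuniv;
        }
        cuniv.Update(increment);
    }

    // Current aggregate across all tasks for the given counter name.
    public long ValueOf(string counterName)
    {
        return _counters[counterName].Counter;
    }
}
```

Because tasks send only increments and reset locally afterwards, the driver can add every received value without double counting.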

Maybe we should divide the interface into two parts.

{code}
    /// <typeparam name="T">Type of counter.</typeparam>
    public interface IDistributedDriverCounter<T>
    {
        /// <summary>
        /// Unique name of the counter. No two counters should have the same name.
        /// </summary>
        string Name { get; }

        /// <summary>
        /// Value of the counter.
        /// </summary>
        T Counter { get; }

        /// <summary>
        /// Updates the counter by value. For += updates these are simple 
        /// increments, while for the most recently updated (MRU) case this is the latest 
        /// value. For the MRU case, T itself is a structure with the value and 
        /// timestamp.
        /// </summary>
        /// <param name="value">Value to update the counter.</param>
        void Update(ref T value);

        /// <summary>
        /// Whether the value of the counter is default or not. Useful to 
        /// check during serialization to determine whether to send it or not.
        /// </summary>
        /// <returns>True if the value is default, False otherwise.</returns>
        bool EqualsDefault();
    }
{code}
and 

{code}
 public interface IDistributedTaskCounter<T> : IDistributedDriverCounter<T>
 {
        /// <summary>
        /// Resets the counter to default value.
        /// </summary>
        void Reset();
 }
{code}

This way, driver-side counters do not have access to the {{Reset}} function, since they do not need
it.
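
To illustrate the split for the MRU case, here is a sketch of one class implementing the task interface while the driver holds it through the narrower driver interface. The interfaces are restated with getter-only properties so the block compiles on its own. {{TimestampedValue}}, {{MruCounter}} and {{SplitInterfaceExample}} are hypothetical names, and the sketch assumes timestamps are positive tick counts so that a zero timestamp means "default".

```csharp
public interface IDistributedDriverCounter<T>
{
    string Name { get; }
    T Counter { get; }
    void Update(ref T value);
    bool EqualsDefault();
}

public interface IDistributedTaskCounter<T> : IDistributedDriverCounter<T>
{
    void Reset();
}

// Hypothetical MRU payload: the value plus the time it was recorded.
public struct TimestampedValue
{
    public double Value;
    public long Ticks;

    public TimestampedValue(double value, long ticks)
    {
        Value = value;
        Ticks = ticks;
    }
}

// Hypothetical MRU counter: keeps whichever value carries the latest timestamp.
public sealed class MruCounter : IDistributedTaskCounter<TimestampedValue>
{
    public string Name { get; }
    public TimestampedValue Counter { get; private set; }

    public MruCounter(string name) { Name = name; }

    public void Update(ref TimestampedValue value)
    {
        // Ignore stale values; ties go to the newly supplied value.
        if (value.Ticks >= Counter.Ticks)
        {
            Counter = value;
        }
    }

    // Assumes real timestamps are positive, so Ticks == 0 means "never updated".
    public bool EqualsDefault() { return Counter.Ticks == 0; }

    public void Reset() { Counter = default(TimestampedValue); }
}

public static class SplitInterfaceExample
{
    public static double Demo()
    {
        var task = new MruCounter("loss");
        // The driver sees only the narrower interface, so Reset is not callable here.
        IDistributedDriverCounter<TimestampedValue> driver = new MruCounter("loss");

        var older = new TimestampedValue(0.9, 10);
        var newer = new TimestampedValue(0.4, 20);
        task.Update(ref older);
        task.Update(ref newer);

        var payload = task.Counter; // stand-in for serialization
        task.Reset();

        driver.Update(ref payload);
        driver.Update(ref older);   // stale value, ignored
        return driver.Counter.Value;
    }
}
```

The compile-time restriction is the point of the split: code holding an {{IDistributedDriverCounter<T>}} reference cannot call {{Reset}} without an explicit cast.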

Yes, we should have separate codecs. I already realized that and am making the changes. Yes, only the increment
will be sent. That is why I have a {{Reset}} function.
 

> Develop distributed counters framework in REEF
> ----------------------------------------------
>
>                 Key: REEF-1456
>                 URL: https://issues.apache.org/jira/browse/REEF-1456
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF.NET
>         Environment: C#
>            Reporter: Dhruv Mahajan
>
> The aim of this JIRA is to develop distributed counters framework in REEF. Each task
> can emit pairs of {counter name, incremental value} which are aggregated and sent to driver,
> which can then aggregate them from all the tasks/evaluators. The aggregation strategy can
> be simple addition, most recently used, etc.  Via these counters, we can implement some common
> metrics for ML - like amount (bytes) of data read, current loss function value etc.
> We will develop interfaces for Counters for evaluators and driver along with default
> implementations and aggregation strategies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
