reef-dev mailing list archives

From "Markus Weimer (JIRA)" <>
Subject [jira] [Commented] (REEF-1456) Develop distributed counters framework in REEF
Date Thu, 23 Jun 2016 16:55:16 GMT


Markus Weimer commented on REEF-1456:

A couple of comments on {{IDistributedCounter}}:

  1. The methods {{Update}} and {{Reset}} require coordination among machines. To really
support these, we'd need something like ZooKeeper to maintain consensus between the machines.
Are we sure we want to tackle that first? Also, if we do, I suggest renaming the class from
{{*Counter}} to something else, as the semantics are no longer those of counting.
  2. Do we want the serialization code inside the {{Counter}} class or do we want a {{CounterCodec}}
instead? Especially if we make these actual counters, the serialized form would contain only
the increment, not the value, right?

How about replacing {{Update}} with {{Increment}}? That way, the class is a counter, and the
distributed implementation is rather straightforward.
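
To make the suggestion concrete, here is a minimal sketch of an increment-only counter with a separate codec, written in Java for illustration only (the REEF.NET code under discussion would be C#). All names here ({{Counter}}, {{LocalCounter}}, {{drainIncrement}}, the static {{CounterCodec}} methods) are hypothetical and not taken from the REEF code base. It assumes the serialized form carries only the increment accumulated since the last report, so a receiver can simply add what it gets and no consensus between machines is needed.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface: a counter that can only be incremented.
interface Counter {
    String name();

    void increment(long delta);

    // Returns the increment accumulated since the last drain and resets it to zero.
    long drainIncrement();
}

// Hypothetical task-side implementation; it holds only the not-yet-reported increment.
final class LocalCounter implements Counter {
    private final String name;
    private final AtomicLong pending = new AtomicLong();

    LocalCounter(String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public void increment(long delta) {
        pending.addAndGet(delta);
    }

    @Override
    public long drainIncrement() {
        return pending.getAndSet(0L);
    }
}

// Hypothetical codec kept outside the counter class; it encodes the increment, not the value.
final class CounterCodec {
    private CounterCodec() {
    }

    static byte[] encode(long increment) {
        return ByteBuffer.allocate(Long.BYTES).putLong(increment).array();
    }

    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

Because addition is commutative and associative, increments from different tasks can be applied in any order on the receiving side, which is what keeps the distributed case simple.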

> Develop distributed counters framework in REEF
> ----------------------------------------------
>                 Key: REEF-1456
>                 URL:
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF.NET
>         Environment: C#
>            Reporter: Dhruv Mahajan
> The aim of this JIRA is to develop a distributed counters framework in REEF. Each task
> can emit pairs of {counter name, incremental value}, which are aggregated and sent to the driver,
> which can then aggregate them across all the tasks/evaluators. The aggregation strategy can
> be simple addition, most recently used, etc. Via these counters, we can implement some common
> metrics for ML, such as the amount (bytes) of data read, the current loss function value, etc.
> We will develop interfaces for Counters for evaluators and driver along with default
> implementations and aggregation strategies.
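
The driver-side aggregation described in the issue could be sketched as follows, again in Java for illustration only. The class {{DriverCounterAggregator}}, its methods, and the two strategy constants are hypothetical names, not part of REEF. The sketch assumes that "most recently used" means keeping the latest reported value, and it uses integer values for simplicity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongBinaryOperator;

// Hypothetical driver-side aggregator for {counter name, value} pairs reported by tasks.
final class DriverCounterAggregator {
    // Strategy: add the incoming value to the current one.
    static final LongBinaryOperator SUM = Long::sum;

    // Strategy: keep only the latest reported value (assumed reading of "most recently used").
    static final LongBinaryOperator MOST_RECENT = (current, incoming) -> incoming;

    private final Map<String, Long> values = new HashMap<>();
    private final LongBinaryOperator strategy;

    DriverCounterAggregator(LongBinaryOperator strategy) {
        this.strategy = strategy;
    }

    // Called once for each pair received from a task or evaluator.
    void onUpdate(String counterName, long value) {
        values.merge(counterName, value, strategy::applyAsLong);
    }

    // Returns the aggregated value, or 0 if the counter was never reported.
    long get(String counterName) {
        return values.getOrDefault(counterName, 0L);
    }
}
```

With {{SUM}}, two reports of 100 and 50 for the same counter aggregate to 150. With {{MOST_RECENT}}, only the second report is kept.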

This message was sent by Atlassian JIRA