hama-dev mailing list archives

From Tommaso Teofili <tommaso.teof...@gmail.com>
Subject Re: Dynamic vertices and hama counters
Date Mon, 15 Jul 2013 07:51:59 GMT
What about introducing a proper API for counting vertices: something like
an interface VertexCounter with two or three implementations, such as
InMemoryVertexCounter (basically the current one), a
DistributedVertexCounter to implement the scenario where we use a separate
BSP superstep to count them, and a ZKVertexCounter which handles vertex
counts as per Chia-Hung's suggestion?
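
A minimal sketch of what such an interface could look like, in Java. None of this is existing Hama API: the method names (vertexAdded, vertexRemoved, sync, getTotal) are assumptions for illustration, and the in-memory class is a simplified single-process stand-in that buffers changes until sync() is called, roughly mirroring the "visible after the next superstep" behaviour described in the quoted message.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical interface; the name comes from the proposal, the methods are assumed. */
interface VertexCounter {
    /** Record that one vertex was added on this peer. */
    void vertexAdded();

    /** Record that one vertex was removed on this peer. */
    void vertexRemoved();

    /** Fold pending changes into the visible total; intended to run at the barrier sync. */
    void sync();

    /** Total as of the last sync. */
    long getTotal();
}

/** Simplified in-memory variant: changes stay pending until sync() is called. */
final class InMemoryVertexCounter implements VertexCounter {
    private long committed;
    private final AtomicLong pendingDelta = new AtomicLong();

    InMemoryVertexCounter(long initialTotal) {
        this.committed = initialTotal;
    }

    @Override
    public void vertexAdded() {
        pendingDelta.incrementAndGet();
    }

    @Override
    public void vertexRemoved() {
        pendingDelta.decrementAndGet();
    }

    @Override
    public void sync() {
        // Apply the accumulated delta and reset it in one atomic step.
        committed += pendingDelta.getAndSet(0);
    }

    @Override
    public long getTotal() {
        return committed;
    }
}
```

A DistributedVertexCounter or ZKVertexCounter would implement the same interface but exchange the pending delta with other peers (via BSP messages or ZooKeeper) inside sync(); that part is deliberately left out here.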

Also, we may introduce something like a configuration variable to define
whether all the vertices are needed or just the neighbors (and/or some other
strategy).
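
As a rough illustration of the configuration idea, the strategy could be modelled as an enum parsed from a string setting. The enum name, its values, and the choice of GLOBAL as the default are all assumptions made for this sketch; no such setting is claimed to exist in Hama.

```java
import java.util.Locale;

/** Hypothetical strategies for how much vertex-count information a job needs. */
enum VertexCountStrategy {
    GLOBAL,          // every peer tracks the total number of vertices
    NEIGHBORS_ONLY;  // no global total; algorithms rely on neighbors only

    /** Parse a configuration value; a missing or blank value falls back to GLOBAL (assumed default). */
    static VertexCountStrategy parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return GLOBAL;
        }
        String normalized = value.trim().toUpperCase(Locale.ROOT).replace('-', '_');
        return valueOf(normalized); // throws IllegalArgumentException on unknown values
    }
}
```

The value passed to parse() would come from whatever job configuration key is eventually chosen; that key is intentionally not named here.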

My 2 cents,
Tommaso

2013/7/14 Chia-Hung Lin <clin4j@googlemail.com>

> Just my personal viewpoint: for a small amount of global information,
> storing the state in ZooKeeper might be a reasonable
> solution.
>
> On 13 July 2013 21:28, andronat_asf <andronat_asf@hotmail.com> wrote:
> > Hello everyone,
> >
> > I'm working on HAMA-767 and I have some concerns about counters and
> > scalability. Currently, every peer has a set of vertices and a variable
> > that keeps the total number of vertices across all peers. In my case,
> > I'm trying to add and remove vertices during the runtime of a job, which
> > means that I have to update all those variables.
> >
> > My problem is that this is not efficient, because for every operation (add
> > or remove a vertex) I need to update all peers, so I need to send lots of
> > messages to make those updates (see the GraphJobRunner#countGlobalVertexCount
> > method), and I believe this is neither correct nor scalable. Another problem is
> > that, even if I update all those variables (at the cost of sending lots of
> > messages to every peer), those variables will not be updated until the next
> > superstep.
> >
> > e.g.:
> >
> > Peer 1:                            Peer 2:
> >   Vert_1                              Vert_2
> > (Total_V = 2)                  (Total_V = 2)
> > addVertex()
> > (Total_V = 3)
> >                                          getNumberOfV() => 2
> >
> > ------------------------ Sync ------------------------
> >
> >                                          getNumberOfV() => 3
> >
> >
> > Is there something like global counters or shared memory that can
> > address this issue?
> >
> > P.S. I have a feeling that we don't need to track the total number
> > of vertices, because vertex-centric algorithms rarely need total numbers;
> > they only depend on neighbors (I might be wrong, though).
> >
> > Thanks,
> > Anastasis
>
