incubator-giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Avery Ching <ach...@apache.org>
Subject Re: on the semantics of the combiner
Date Mon, 09 Jan 2012 19:28:20 GMT
The javadoc for VertexCombiner#combine() is

   /**
    * Combines message values for a particular vertex index.
    *
    * @param vertexIndex Index of the vertex getting these messages
    * @param msgList List of the messages to be combined
    * @return Message that is combined from {@link MsgList} or null if no
    *         message it to be sent
    * @throws IOException
    */

I think we are somewhat vague on what a combiner can return to support 
various use cases.  A combiner should be particular to a particular 
compute() algorithm.  I think it should be legal to return null from a 
combiner, in that case, no message should be sent to that vertex.

It seems like it would be an overhead to call a combiner when there are 
0 messages.  I can't see a case where that would be useful.  Perhaps we 
should change the javadoc to insure that msgList must contain at least 
one message to have combine() being called.

Avery

On 1/9/12 5:37 AM, Claudio Martella wrote:
> Hi Sebastian,
>
> yes, that was my point, I agree completely with you.
> Fixing my test was not the issue, my question was whether we want to
> define explicitly the semantics of this scenario.
> Personally, I believe the combiner should be ready to receive 0
> messages, as it's the case of BasicVertex::initialize(), putMessages()
> and compute(), and act accordingly.
>
> In the particular example, I believe the SimpleSumCombiner is bugged.
> It's true that the sum of no values is 0, but it's also true that the
> null return semantics of combine() is more suitable for this exact
> situation.
>
>
> On Mon, Jan 9, 2012 at 2:21 PM, Sebastian Schelter<ssc@apache.org>  wrote:
>> I think we currently implicitly assume that there is at least one
>> element in the Iterable passed to the combiner. The messaging code only
>> invokes the combiner only if at least one message for the target vertex
>> has been sent.
>>
>> However, we should not rely on implicit implementation details but
>> explicitly specify the semantics of combiners.
>>
>> --sebastian
>>
>> On 09.01.2012 13:29, Claudio Martella wrote:
>>> Hello list,
>>>
>>> for GIRAPH-45 I'm touching the incoming messages and hit an
>>> interesting problem with the combiner semantics.
>>> currently, my code fails testBspCombiner for the following reason:
>>>
>>> SimpleSumCombiner::compute() returns a value even if there are no
>>> messages in the iterator (in this case it returns 0) and for this
>>> reason the vertices get activated at each superstep.
>>>
>>> At each superstep, under-the-hood, I pass the combiner for each vertex
>>> an Iterable, which can be empty:
>>>
>>>      public Iterable<M>  getMessages(I vertexId) {
>>>        Iterable<M>  messages = inMessages.getMessages(vertexId);
>>>        if (combiner != null) {
>>>                M combinedMsg;
>>>                try {
>>>                        combinedMsg = combiner.combine(vertexId, messages);
>>>                }  catch (IOException e) {
>>>                        throw new RuntimeException("could not combine", e);
>>>                }
>>>                if (combinedMsg != null) {
>>>                        List<M>  tmp = new ArrayList<M>(1);
>>>                        tmp.add(combinedMsg);
>>>                        messages = tmp;
>>>                } else {
>>>                        messages = new ArrayList<M>(0);
>>>                }
>>>        }
>>>        return messages;
>>>      }
>>>
>>> the Iterable returned by this methods is passed to
>>> basicVertex.putMessages() right before the compute().
>>> Now, the question is: who's wrong? The combiner code that returns a
>>> sum of 0 over no values, or the framework that calls the combiner with
>>> 0 messages?
>>>
>>>
>>>
>
>


Mime
View raw message