giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Panagiotis Eustratiadis <ep.pan....@gmail.com>
Subject Re: Fw: Persistent Aggregator Forgets Previously Aggregated Value ????
Date Mon, 17 Nov 2014 09:06:21 GMT
Hello Puneet,

I am encountering a very similar problem, and I'm studying it. I will
update this thread as soon as I find something. I think it has something to
do with the code that initializes the aggregators, but I'm not quite sure.

Regards,
Eustratiadis Panagiotis.

On 17 November 2014 05:25, Puneet Agarwal <puagarwal@yahoo.com> wrote:

> I have no clue on how to resolve this issue. Any guidance in this
> situation can be helpful.
>
> Claudio - is there any detailed description of how aggregator works in
> Giraph in your book?
>
> Getting no response from anyone, should such questions be asked in
> developers' list?
>
> -Puneet
> IIT Delhi, India
>
>
>    On Sunday, November 16, 2014 12:11 PM, Puneet Agarwal <
> puagarwal@yahoo.com> wrote:
>
>
> Hi All,
>
> I have been facing a problem in Giraph for last two weeks. Following is
> the detailed description, sample code, corresponding log message and a few
> questions.
> I will greatly appreciate any help in this regard.
>
> Code Description:
> =================
> My algorithm assigns a score to select vertexes of the graph. I have to
> find the best K(=5) vertexes such that their score is minimum.
> I have written a Persistent Aggregator for this which takes Text value.
> The snippet of the code is given below.
>
> It works fine on my laptop (single worker), but not on the cluster of 4
> computers, when running with 10 workers.
> I am using Hadoop 1.0.0 and Giraph 1.0.0.
>
>
> Problem Description:
> ====================
> In the aggregator one of the previously aggregated value suddenly gets
> lost.
> Therefore, my algorithm generates wrong results.
>
> You can observe this in the logs, given below, in following manner
> a.    At line number 11, the vertex 8.4793-1 has weight 3.0, and it is the
> best so far.
> b.    At line number 12, the aggregator receives a new value
> "6.1100-1,6.0", i.e., vertex=6.1100-1 has weight=6.0
> c.    At line number 13, in the begining of aggregate method, the output
> of getAggregatedValue() is also same as newValue.
> This means it simply forgets the previously aggregated value, why so?
>
> I have printed object-ids also, in the log messages, in order to ensure
> that this is happening in the same object.
>
> Source Code Snippet:
> ====================
>
> @Override
> public void aggregate(Text value) {
>     printLogMessage("aggregate", "MyAGG", "Entering - NewValue=", value);
>     printLogMessage("aggregate", "MyAGG",
>             "Entering - getAggregatedValue()=", getAggregatedValue());
>
>         // Main Logic is here - Omitting for simplicity
>
>     setAggregatedValue(new Text(myTopK));
>     printLogMessage("aggregate", "MyAGG", "Exiting AggregatedValue=",
>             getAggregatedValue().toString());
> }
>
> @Override
> public Text createInitialValue() {
>     printLogMessage("=====>createInitialValue", "MyAGG",
>             "Entering getAggregatedValue()=", getAggregatedValue());
>     Text initialValue = getAggregatedValue();
>     if (initialValue == null) {
>         initialValue = new Text();
>     }
>     printLogMessage("=====>createInitialValue", "MyAGG",
>             "Exiting ReturnedValue=", getAggregatedValue());
>     return initialValue;
> }
> private void printLogMessage(String methodName, String prefix, String msg,
> Object obj) {
>     if (isDebugOn(methodName)) {
>         if (obj == null) {
>             obj = "";
>         }
>         System.out.println(java.lang.System.identityHashCode(this) + "--"+
> prefix + "==>" +
>                                 methodName + "() - " + msg +
> obj.toString());
>     }
> }
>
> Printed Log Messages:
> =====================
> 1.  %%%%%%%%%%%%% Entered Master Compute %%%%%%%%%%%%%%%%%%%% ====>
> SupsetStep No.=2
> 2.  249768912--MyAGG==>=====>createInitialValue() - Entering
> getAggregatedValue()=
> 3.  249768912--MyAGG==>=====>createInitialValue() - Exiting ReturnedValue=
> 4.  249768912--MyAGG==>aggregate() - Entering - NewValue=8.4793-1,3.0
> 5.  249768912--MyAGG==>aggregate() - Entering -
> getAggregatedValue()=8.4793-1,3.0
> 6.  249768912--MyAGG==>aggregate() - Received AggregatedValue Again-
> Exiting CurrVal=8.4793-1,3.0
> 7.  249768912--MyAGG==>=====>createInitialValue() - Entering
> getAggregatedValue()=8.4793-1,3.0
> 8.  249768912--MyAGG==>=====>createInitialValue() - Exiting
> ReturnedValue=8.4793-1,3.0
> 9.  %%%%%%%%%%%%% Entered Master Compute %%%%%%%%%%%%%%%%%%%% ====>
> SupsetStep No.=3
> 10. 249768912--MyAGG==>=====>createInitialValue() - Entering
> getAggregatedValue()=8.4793-1,3.0
> 11. 249768912--MyAGG==>=====>createInitialValue() - Exiting
> ReturnedValue=8.4793-1,3.0
> 12. 249768912--MyAGG==>aggregate() - Entering - NewValue=6.1100-1,6.0
> 13. 249768912--MyAGG==>aggregate() - Entering -
> getAggregatedValue()=6.1100-1,6.0
> 14. 249768912--MyAGG==>aggregate() - Received AggregatedValue Again-
> Exiting CurrVal=6.1100-1,6.0
> 15. 249768912--MyAGG==>=====>createInitialValue() - Entering
> getAggregatedValue()=6.1100-1,6.0
> 16. 249768912--MyAGG==>=====>createInitialValue() - Exiting
> ReturnedValue=6.1100-1,6.0
> 17. %%%%%%%%%%%%% Entered Master Compute %%%%%%%%%%%%%%%%%%%% ====>
> SupsetStep No.=4
> 18. ...
> ...
>
> My Queries:
> ===========
> 1.    Why does the createInitialValue method gets called twice, in the
> same superstep, on the same object of the aggregator?
> 2.    Even if its gets called twice, why should it forget the previouly
> aggregated value?
>
> Please help me resolve this issue. This is a major problem in my work.
> I am not able to proceed further because of this.
>
> Thanks in anticipation.
>
> - Puneet
> IIT Delhi, India
>
>
>
>

Mime
View raw message