hama-user mailing list archives

From Thomas Jungblut <thomas.jungb...@gmail.com>
Subject Re: pagerank NullPointerException
Date Wed, 19 Sep 2012 07:24:52 GMT
You can set this in the site config.

2012/9/19 Sandy Ding <sandy.dingxin@gmail.com>:
> So I have to recompile the pagerank example?
> Can I pass it as a parameter to the existing jar?
>
> 2012/9/19 Thomas Jungblut <thomas.jungblut@gmail.com>
>
>> Hey,
>>
>> if you read closely:
>>
>> http://wiki.apache.org/hama/WriteHamaGraphFile#Google_Web_dataset_.28local_mode.2C_pseudo_distributed_cluser.29
>>
>> You'll find that there is a property called "hama.graph.repair":
>>
>>     // hama takes care that the graph is complete
>>     pageJob.set("hama.graph.repair", "true");
>>
>> This basically sends messages along the known edges and adds vertices
>> if there aren't any on the "other side".
>>
>> If this isn't scalable enough for you, then a preprocessing MapReduce job
>> is fine: in the mapper you emit the vertex id as key along with the
>> complete edge list as value, and also each edge target as key with an
>> empty value.
>> In the reducer you should get either complete lines (possibly mixed with
>> empty values) or only empty values.
>> If you get only empty values, you know that this vertex
>> wasn't included in the dataset, and you can repair it by emitting it in
>> the reducer as a single line.
>>
>>
>> 2012/9/19 Sandy Ding <sandy.dingxin@gmail.com>:
>> > Hi, guys,
>> >
>> > The web-google dataset seems to miss some key sites, for example, there is
>> > no entry starting with 111067.
>> > This leads to a weird NullPointerException. How do you fix this?
>> >
>> > Cheers,
>> > Sandy
>>
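
For reference, the preprocessing job described in the quoted reply can be sketched in plain Java. This is only an in-memory illustration of the map and reduce steps, not a real Hadoop or Hama job. The class and method names (GraphRepairSketch, map, reduce, repair) are made up for this sketch, and it assumes a tab-separated adjacency list where the first field is the vertex id and the remaining fields are its outgoing edges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** In-memory sketch of the repair pass; hypothetical names, no Hadoop dependency. */
final class GraphRepairSketch {

  /** One intermediate key/value pair, as a mapper would emit it. */
  static final class Pair {
    final String key;
    final String value;
    Pair(String key, String value) { this.key = key; this.value = value; }
  }

  /** Map step: emit (vertex, edge list) plus (target, "") for every edge target. */
  static List<Pair> map(String line) {
    List<Pair> out = new ArrayList<>();
    String[] parts = line.trim().split("\t");
    if (parts[0].isEmpty()) {
      return out; // skip blank lines
    }
    // Assumed input format: vertexId \t target1 \t target2 ...
    String edges = String.join("\t", Arrays.asList(parts).subList(1, parts.length));
    out.add(new Pair(parts[0], edges));
    for (int i = 1; i < parts.length; i++) {
      out.add(new Pair(parts[i], "")); // empty value marks "seen only as a target"
    }
    return out;
  }

  /** Reduce step: merge the edge lists; only empty values yields a bare vertex line. */
  static String reduce(String vertexId, List<String> values) {
    Set<String> targets = new LinkedHashSet<>();
    for (String value : values) {
      if (!value.isEmpty()) {
        targets.addAll(Arrays.asList(value.split("\t")));
      }
    }
    StringBuilder line = new StringBuilder(vertexId);
    for (String target : targets) {
      line.append('\t').append(target);
    }
    return line.toString();
  }

  /** Runs map, an in-memory group-by-key (the "shuffle"), and reduce. */
  static List<String> repair(List<String> lines) {
    Map<String, List<String>> grouped = new TreeMap<>();
    for (String line : lines) {
      for (Pair pair : map(line)) {
        grouped.computeIfAbsent(pair.key, k -> new ArrayList<>()).add(pair.value);
      }
    }
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
      out.add(reduce(entry.getKey(), entry.getValue()));
    }
    return out;
  }
}
```

Calling GraphRepairSketch.repair on the two lines "1\t2\t3" and "2\t1" adds a line for vertex 3, which appears only as an edge target. Merging duplicate edge lists in the reducer is a choice made for this sketch; the original message does not say how duplicates should be handled.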
