incubator-hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <>
Subject [jira] [Commented] (HAMA-395) Example: PageRank
Date Mon, 13 Jun 2011 18:54:52 GMT


Thomas Jungblut commented on HAMA-395:

great :)

I'm currently crawling around 105000 sites along with their outlinks. Tomorrow I'm going to reduce
the dataset to an adjacency list and write a BSP parser for that.
I've decided to use Text.class for both key and value; the value is a semicolon-separated list of hosts.
Each element represents a normalized host like or
So the key is the site and the value is a semicolon-separated list of its outlinks.
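
To make the record layout concrete, here is a minimal sketch of splitting one value into its outlinks and joining them back. It uses plain Java strings instead of Hadoop's Text so that it runs on its own; the class name AdjacencyRecord, its methods, and the example hosts are made up for illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hama or the HAMA-395 patch.
// Assumes a record is (site, "host1;host2;...") as described above.
class AdjacencyRecord {

    // Split a semicolon-separated value into outlinks, skipping empty entries.
    static List<String> parseOutlinks(String value) {
        List<String> outlinks = new ArrayList<>();
        for (String host : value.split(";")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                outlinks.add(trimmed);
            }
        }
        return outlinks;
    }

    // Join outlinks back into the semicolon-separated value form.
    static String joinOutlinks(List<String> outlinks) {
        return String.join(";", outlinks);
    }

    public static void main(String[] args) {
        // Example hosts are placeholders, not data from the crawl.
        Map<String, List<String>> adjacency = new LinkedHashMap<>();
        adjacency.put("site-a.example", parseOutlinks("site-b.example;site-c.example"));
        adjacency.put("site-b.example", parseOutlinks(""));
        adjacency.forEach((site, links) -> System.out.println(site + " -> " + links));
    }
}
```

A site without outlinks simply gets an empty list here; how dangling nodes should be handled in the actual job is left open.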

Do you need any other input format, Steve?
As far as I can see, it parses a text file using the default delimiters of StringTokenizer,
where the first element of each line is the page and the following elements are its outlinks.
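
For reference, the line format described above could be read roughly like this. This is only a sketch of the technique, not the existing parser; AdjacencyLineParser and its methods are hypothetical names. With no delimiter argument, java.util.StringTokenizer splits on space, tab, newline, carriage return and form feed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical sketch, not the parser from the patch.
// Assumes each line is "page outlink1 outlink2 ..." separated by whitespace.
class AdjacencyLineParser {

    // Tokenize a line with StringTokenizer's default delimiters (" \t\n\r\f").
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    // The first token is the page; returns null for a blank line.
    static String page(String line) {
        List<String> all = tokens(line);
        return all.isEmpty() ? null : all.get(0);
    }

    // Every token after the first is an outlink of that page.
    static List<String> outlinks(String line) {
        List<String> all = tokens(line);
        if (all.size() <= 1) {
            return new ArrayList<>();
        }
        return new ArrayList<>(all.subList(1, all.size()));
    }

    public static void main(String[] args) {
        String line = "page-a.example\tpage-b.example page-c.example";
        System.out.println(page(line) + " -> " + outlinks(line));
    }
}
```

Note that a semicolon is not among the default delimiters, so a value like "b.example;c.example" would come back as a single token unless it is split separately or the delimiter is changed.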

> Example: PageRank
> -----------------
>                 Key: HAMA-395
>                 URL:
>             Project: Hama
>          Issue Type: Improvement
>          Components: bsp, examples
>    Affects Versions: 0.2.0
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>         Attachments: HAMA-395-v1.patch, HAMA-395-v2.patch, HAMA-395-v3.patch, HAMA-395.patch
> I'd like to contribute my PageRank BSP as an example.
> - refactor the partitioning from the SSSP patch in
>   (extract a utility class etc.)
> - add a really cool web-sub-graph example dataset ;D
> - add a wiki page for it

This message is automatically generated by JIRA.

