giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-26) Improve PseudoRandomVertexInputFormat to create a more realistic synthetic graph (e.g. power-law distributed vertex-cardinality).
Date Thu, 27 Sep 2012 22:45:07 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465192#comment-13465192 ]

Eli Reisman commented on GIRAPH-26:
-----------------------------------

That sounds great, I'd really like to get this committed. The COLT thing is a good point;
I am not sure whether Sean is directly involved or on the mailing lists right now as far as
updating to the Mahout libs goes. Sean, are you out there? I could try it, but I'm not mathy
enough at present to know if I'm damaging it in the process. Anyone want to have a swipe at
fixing it for him?

                
> Improve PseudoRandomVertexInputFormat to create a more realistic synthetic graph (e.g.
> power-law distributed vertex-cardinality).
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-26
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-26
>             Project: Giraph
>          Issue Type: Test
>          Components: benchmark
>    Affects Versions: 0.2.0
>            Reporter: Jake Mannix
>            Assignee: Sean Choi
>            Priority: Minor
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-26-2.patch, GIRAPH-26-3.patch, GIRAPH-26.patch
>
>
> The PageRankBenchmark class, to be a proper benchmark, should run over graphs that look
> more like data seen in the wild. Web link graphs, social network graphs, and text corpora
> (represented as a bipartite graph) all have power-law distributions, so benchmarking on a
> synthetic graph that looks more like this would be a nice test. It would stress cases of
> uneven split distribution and bottlenecks in subclusters of heavily connected vertices.
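
To make the idea in the issue description concrete, here is a minimal sketch of how a synthetic
input format might draw a per-vertex edge count from a truncated power law instead of using a
fixed count. It is an illustration only and is not taken from any of the attached patches. The
class name PowerLawDegreeSampler and the parameters maxDegree, exponent and seed are assumptions
made up for this example, and it uses only java.util.Random rather than COLT or Mahout.

```java
import java.util.Random;

/**
 * Hypothetical helper (not part of Giraph or the GIRAPH-26 patches).
 * Samples a degree k in [1, maxDegree] with P(k) proportional to k^(-exponent).
 */
final class PowerLawDegreeSampler {
    /** cdf[k - 1] holds P(degree <= k). */
    private final double[] cdf;
    private final Random random;

    PowerLawDegreeSampler(int maxDegree, double exponent, long seed) {
        if (maxDegree < 1) {
            throw new IllegalArgumentException("maxDegree must be >= 1");
        }
        cdf = new double[maxDegree];
        double total = 0.0;
        // Accumulate the unnormalized weights k^(-exponent).
        for (int k = 1; k <= maxDegree; k++) {
            total += Math.pow(k, -exponent);
            cdf[k - 1] = total;
        }
        // Normalize so that the last entry is exactly 1.0.
        for (int i = 0; i < maxDegree; i++) {
            cdf[i] /= total;
        }
        random = new Random(seed);
    }

    /** Inverse-CDF sampling: binary search for the first k with cdf >= u. */
    int nextDegree() {
        double u = random.nextDouble(); // uniform in [0, 1)
        int lo = 0;
        int hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo + 1;
    }
}
```

A reader for a synthetic graph could call nextDegree() once per generated vertex and then pick
that many random target vertex ids. With a fixed seed the sequence is reproducible, which
matters if a failed task has to regenerate the same split. Most vertices end up with very few
edges and a handful with many, which is the skew the description asks the benchmark to exercise.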

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
