giraph-dev mailing list archives

From "Pavan Kumar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (GIRAPH-873) Specialized edge stores
Date Mon, 26 May 2014 07:48:01 GMT

     [ https://issues.apache.org/jira/browse/GIRAPH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavan Kumar updated GIRAPH-873:
-------------------------------

    Attachment: GIRAPH-873_refactor.patch

Here's the refactored code. I ran PageRank on a cluster successfully and ran
mvn clean verify on giraph-core.


> Specialized edge stores
> -----------------------
>
>                 Key: GIRAPH-873
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-873
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Craig Muchinsky
>            Assignee: Pavan Kumar
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-873-2.patch, GIRAPH-873.patch, GIRAPH-873_refactor.patch
>
>
> While doing some performance tuning I discovered that loading the edge store can be a
> very expensive operation. Similar to GIRAPH-704, the use of primitive maps can provide a
> significant performance benefit. Part of the benefit comes from the lower memory overhead
> of primitive maps; however, the larger benefit comes from not having to release and
> reconstruct the vertexId object every time a new vertex is encountered.
> When processing a large graph with 4B vertices and 5B edges (3B of the edges loaded via
> EdgeInputFormat), the worker edge requests were taking ~15 seconds each, but after
> implementing the above suggestions that number dropped to under a second.
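
To illustrate the idea in the description above, here is a minimal sketch of an edge
store keyed by primitive long vertex ids. It is not the code from the attached patch and
not Giraph's API: the class name LongKeyedEdgeStore and all of its methods are
hypothetical, and it assumes vertex ids and edge targets are plain longs with no edge
values. The point it shows is that a lookup or insert hashes the raw long directly, so
no vertexId object has to be created or re-created per incoming edge.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not Giraph code): an open-addressing hash map from
 * primitive long vertex ids to growable long[] lists of edge targets.
 */
final class LongKeyedEdgeStore {
  private long[] keys;       // vertex ids, valid only where used[i] is true
  private long[][] targets;  // targets[i] holds the out-edges of keys[i]
  private int[] degrees;     // number of valid entries in targets[i]
  private boolean[] used;    // slot occupancy flags
  private int size;          // number of distinct source vertices

  LongKeyedEdgeStore(int expectedVertices) {
    int capacity = 16;
    // Keep the table a power of two and at most half full.
    while (capacity < expectedVertices * 2L && capacity < (1 << 30)) {
      capacity <<= 1;
    }
    allocate(capacity);
  }

  private void allocate(int capacity) {
    keys = new long[capacity];
    targets = new long[capacity][];
    degrees = new int[capacity];
    used = new boolean[capacity];
    size = 0;
  }

  // Scramble the id bits so sequential ids spread across the table.
  private static long mix(long x) {
    x ^= x >>> 33;
    x *= 0xff51afd7ed558ccdL;
    x ^= x >>> 33;
    return x;
  }

  // Linear probing: returns the slot holding vertexId, or the first free slot.
  private int slotFor(long vertexId) {
    int mask = keys.length - 1;
    int slot = (int) mix(vertexId) & mask;
    while (used[slot] && keys[slot] != vertexId) {
      slot = (slot + 1) & mask;
    }
    return slot;
  }

  private void grow() {
    long[] oldKeys = keys;
    long[][] oldTargets = targets;
    int[] oldDegrees = degrees;
    boolean[] oldUsed = used;
    allocate(oldKeys.length * 2);
    for (int i = 0; i < oldKeys.length; i++) {
      if (!oldUsed[i]) {
        continue;
      }
      int slot = slotFor(oldKeys[i]);
      used[slot] = true;
      keys[slot] = oldKeys[i];
      targets[slot] = oldTargets[i];
      degrees[slot] = oldDegrees[i];
      size++;
    }
  }

  /** Adds one edge; no key object is allocated for sourceId. */
  void addEdge(long sourceId, long targetId) {
    if ((size + 1) * 2 > keys.length) {
      grow();
    }
    int slot = slotFor(sourceId);
    if (!used[slot]) {
      used[slot] = true;
      keys[slot] = sourceId;
      targets[slot] = new long[4];
      size++;
    } else if (degrees[slot] == targets[slot].length) {
      targets[slot] = Arrays.copyOf(targets[slot], degrees[slot] * 2);
    }
    targets[slot][degrees[slot]++] = targetId;
  }

  int outDegree(long sourceId) {
    int slot = slotFor(sourceId);
    return used[slot] ? degrees[slot] : 0;
  }

  long[] outEdges(long sourceId) {
    int slot = slotFor(sourceId);
    return used[slot] ? Arrays.copyOf(targets[slot], degrees[slot]) : new long[0];
  }

  int vertexCount() {
    return size;
  }
}
```

With an object-keyed map such as HashMap<Long, ...>, each addEdge call would typically
box or deserialize an id object before the lookup; here the same call touches only
primitive arrays. A production version would more likely use an existing primitive
collections library than a hand-rolled table.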



--
This message was sent by Atlassian JIRA
(v6.2#6252)
