giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-314) Implement better message grouping to improve performance in SimpleTriangleClosingVertex
Date Thu, 06 Sep 2012 01:42:07 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449348#comment-13449348
] 

Eli Reisman commented on GIRAPH-314:
------------------------------------

Actually, what I'm looking at for "stage 2" is that the data structure carried from
sendMessagesToAllVertex down to the run-length encoded request will map
M -> partitionId -> Set<I>, so that it's easy to make sure we aren't keeping any extra
copies of the same message along the way. This doesn't help my application, where messages
are likely to be unique more often than individual destinations are, but that won't hold for
everyone, so this might be a better mapping for the general use case of this deduplicating
feature, and it should not hurt my use case in the process.
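
For illustration, here is a minimal sketch of what an M -> partitionId -> Set<I> mapping
could look like using plain java.util collections. The class and method names
(MessageGrouping, add, destinations, distinctMessages) are hypothetical and are not taken
from the Giraph code base or from the patch; the real structure would presumably use
Writable types and Giraph's own partitioning.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch, not Giraph API: groups destinations under each distinct message. */
final class MessageGrouping<I, M> {
    // message -> partitionId -> destination vertex ids
    private final Map<M, Map<Integer, Set<I>>> grouped = new HashMap<>();

    /** Records that {@code message} must reach {@code destination} in {@code partitionId}. */
    void add(M message, int partitionId, I destination) {
        grouped.computeIfAbsent(message, m -> new HashMap<>())
               .computeIfAbsent(partitionId, p -> new HashSet<>())
               .add(destination);
    }

    /** Number of distinct message objects held, however many destinations each has. */
    int distinctMessages() {
        return grouped.size();
    }

    /** Destinations recorded for a message in one partition (empty if none). */
    Set<I> destinations(M message, int partitionId) {
        Map<Integer, Set<I>> byPartition = grouped.get(message);
        if (byPartition == null) {
            return Collections.emptySet();
        }
        Set<I> ids = byPartition.get(partitionId);
        return ids == null ? Collections.<I>emptySet() : ids;
    }
}
```

Because the message is the outer key, an identical message bound for many vertices is stored
once, which relies on M having sensible equals() and hashCode() implementations.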

Anyway, I'll post it soon so we can address shortcomings!
                
> Implement better message grouping to improve performance in SimpleTriangleClosingVertex
> ---------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-314
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-314
>             Project: Giraph
>          Issue Type: Improvement
>          Components: examples
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>            Priority: Trivial
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-314-1.patch, GIRAPH-314-2.patch, GIRAPH-314-3.patch, GIRAPH-314-4.patch
>
>
> After running SimpleTriangleClosingVertex at scale, I'm thinking sendMessageToAllEdges()
> looks pretty in the code, but it's not a good idea in practice, since each vertex V sends
> degree(V)^2 messages right in the first superstep of this algorithm. I could do something
> with a combiner, etc., but just grouping messages by hand at the application level using
> IntArrayListWritable again does the trick fine.
> Probably should have just done it this way before, but sendMessageToAllEdges() looked
> so nice. Sigh. Changed unit tests to reflect this new approach; passes mvn verify and
> cluster, etc.
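
To make the message counts in the description above concrete, here is a rough sketch
comparing the two approaches for a single vertex. It assumes the ungrouped version sends one
message per (neighbor id, recipient) pair, which matches the degree(V)^2 figure quoted above.
It uses a plain List<Integer> as a stand-in for IntArrayListWritable, and the class and
method names are hypothetical rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Giraph code: message counts for one vertex's first superstep. */
final class TriangleMessageCount {

    /** Ungrouped: every neighbor id goes to every neighbor as its own message. */
    static long ungroupedMessageCount(List<Integer> neighbors) {
        long count = 0;
        for (int i = 0; i < neighbors.size(); i++) {       // one broadcast per neighbor id
            for (int j = 0; j < neighbors.size(); j++) {   // delivered to each neighbor
                count++;
            }
        }
        return count;
    }

    /** Grouped: each neighbor receives a single message holding the whole neighbor list. */
    static List<List<Integer>> groupedMessages(List<Integer> neighbors) {
        List<List<Integer>> messages = new ArrayList<>();
        for (int j = 0; j < neighbors.size(); j++) {
            messages.add(new ArrayList<>(neighbors));      // one list-valued message per recipient
        }
        return messages;
    }
}
```

Grouping sends the same ids overall, but in degree(V) messages rather than degree(V)^2,
which reduces per-message overhead.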

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
