giraph-dev mailing list archives

From "Alessandro Presta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-494) Edge should be an interface
Date Thu, 31 Jan 2013 19:15:14 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13567979#comment-13567979 ]

Alessandro Presta commented on GIRAPH-494:
------------------------------------------

Sorry I'm late to this party. I'll share my thoughts on this anyway:
1) Regarding EdgeNoValue, the same point as in GIRAPH-493 applies: in principle, getting rid
of a reference per edge looks like it might help, but let's verify that with a benchmark or
two. For one thing, it won't affect GC (references are not objects), so the only impact is on
memory usage. In your 1B edges/worker example, this amounts to about 3.7GB (at 4 bytes per
reference). Compared to how much memory we consume overall on that same worker (for this data
size), you can argue it's peanuts.
2) "Note that only one class in our codebase actually needs a MutableEdge - RepresentativeVertex":
this is true from the point of view of the implementation (RepresentativeVertex needs to reuse
edge objects). At the API level, it's somewhat of a gray area whether an algorithm should
be allowed to modify edge values in place. Ask Maja, who's trying to do exactly that. This
change makes it impossible (which is one possible answer; we just need to be clear on the
semantics of the objects we hand to the user: is this a reference to an internal data
structure, or is it just a copy?).
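
For reference, the 3.7GB figure in point 1 can be reproduced with a quick back-of-the-envelope
calculation. The sketch below assumes 4-byte references (what a 64-bit JVM with compressed oops
typically uses); the class and method names are made up for illustration and are not part of
Giraph.

```java
import java.util.Locale;

public class EdgeReferenceCost {
    /** Total bytes spent on one value reference per edge. */
    public static long referenceBytes(long edges, int bytesPerReference) {
        return edges * bytesPerReference;
    }

    /** Converts a byte count to binary gigabytes (GiB). */
    public static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long edgesPerWorker = 1_000_000_000L; // the "1B edges/worker" example
        int bytesPerReference = 4;            // assumption: compressed oops enabled

        long bytes = referenceBytes(edgesPerWorker, bytesPerReference);
        // Comes out at roughly 3.7 GiB under the assumptions above.
        System.out.println(String.format(Locale.ROOT,
            "%d references x %d bytes = %.2f GiB",
            edgesPerWorker, bytesPerReference, toGiB(bytes)));
    }
}
```

With 8-byte references (no compressed oops) the same calculation gives roughly twice that, so
the estimate depends on the JVM settings on the worker.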
                
> Edge should be an interface
> ---------------------------
>
>                 Key: GIRAPH-494
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-494
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>         Attachments: GIRAPH-494.patch
>
>
> In terms of architecture and for flexibility I think our Edge class should be an interface
> instead of a real class. In this diff I change it to an interface and add a sub interface
> called MutableEdge. The existing Edge class is now called DefaultEdge. Note that only one
> class in our codebase actually needs a MutableEdge - RepresentativeVertex. Everything else
> works perfectly fine using the immutable Edge interface.
> One nice thing this allowed me to do is to create an EdgeNoValue which we can use for
> algorithms whose edges have no value at all. Currently the same functionality is achieved
> by using NullWritable, however using EdgeNoValue means not storing a reference to the single
> NullWritable instance in every single edge. Working on a job that reads 1B+ edges per worker,
> a pointer per edge adds up.
> https://reviews.apache.org/r/9172/
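
A rough sketch of the hierarchy described in the issue may help when reading the discussion.
Only the type names (Edge, MutableEdge, DefaultEdge, EdgeNoValue) come from the description;
the method names, the generic parameters, and the use of Void in place of NullWritable are
assumptions made to keep the example self-contained, and the actual patch may differ.

```java
public class EdgeSketch {
    /** Read-only view of an edge (assumed shape, not the patch's signatures). */
    public interface Edge<I, E> {
        I getTargetVertexId();
        E getValue();
    }

    /** Sub-interface for the callers that need to change the value in place. */
    public interface MutableEdge<I, E> extends Edge<I, E> {
        void setValue(E value);
    }

    /** The former concrete Edge class: holds a target id and a value reference. */
    public static final class DefaultEdge<I, E> implements MutableEdge<I, E> {
        private final I targetVertexId;
        private E value;

        public DefaultEdge(I targetVertexId, E value) {
            this.targetVertexId = targetVertexId;
            this.value = value;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        @Override public E getValue() { return value; }
        @Override public void setValue(E value) { this.value = value; }
    }

    /** Edge without a value: no value field, so no per-edge value reference. */
    public static final class EdgeNoValue<I> implements Edge<I, Void> {
        private final I targetVertexId;

        public EdgeNoValue(I targetVertexId) {
            this.targetVertexId = targetVertexId;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        // Stand-in for returning the shared NullWritable instance.
        @Override public Void getValue() { return null; }
    }

    public static void main(String[] args) {
        DefaultEdge<Long, Double> stored = new DefaultEdge<>(7L, 1.5);
        Edge<Long, Double> view = stored; // what user code would see

        // The read-only type has no setValue, but it is still the same object:
        // a change made through the mutable type is visible through the view.
        stored.setValue(2.5);
        System.out.println(view.getTargetVertexId() + " -> " + view.getValue());

        Edge<Long, Void> bare = new EdgeNoValue<>(8L);
        System.out.println(bare.getTargetVertexId() + " -> " + bare.getValue());
    }
}
```

Note that handing out the Edge type only hides the setter; whether the object behind it is a
live internal structure or a copy is a separate decision, which is the semantics question
raised in the comment above.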

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
