giraph-dev mailing list archives

From "Nitay Joffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-494) Edge should be an interface
Date Thu, 31 Jan 2013 20:07:14 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568052#comment-13568052 ]

Nitay Joffe commented on GIRAPH-494:
------------------------------------

1) Agreed it's memory only, but it's actually closer to 8GB. We're on a 64-bit machine, so
pointers are 8 bytes each; I don't think you could get a 32-bit JVM to load 1B+ edges. I have
about 1.2B edges per worker, so let's call it 10GB total. To me that does not seem like peanuts
in terms of active memory used.
2) I would argue that having Edge / MutableEdge as interfaces is the right way to go in terms
of object-oriented design. This change does not make it impossible to change edges; we just
have to expose MutableEdge where changes are desired. If the algorithm knows it is using
MutableEdge, then it stores those and can use them as such. We already have gotchas in the
codebase, like RepresentativeVertex, where the user needs to know that they shouldn't change
the Vertex/Edge objects they retrieve. If anything, I think having clear-cut interfaces like
this does exactly the opposite: it makes it explicitly clear what the API is and allows us to
control it, rather than exposing big Java objects with lots of public methods.
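
The arithmetic in point 1 can be checked with a short sketch. The class and method names here are hypothetical and not part of Giraph. The reference size is a parameter because it depends on the JVM configuration: the comment assumes 8-byte pointers, while a HotSpot JVM running with compressed object pointers would use 4 bytes, which would halve these figures.

```java
// Hypothetical helper, not part of Giraph: estimates the memory spent on one
// object reference per edge, as discussed in point 1.
class EdgeMemoryEstimate {

    // Total bytes used by storing one reference per edge.
    // Throws ArithmeticException on overflow instead of silently wrapping.
    static long referenceBytes(long edgeCount, int bytesPerReference) {
        return Math.multiplyExact(edgeCount, (long) bytesPerReference);
    }

    // Converts bytes to decimal gigabytes (1 GB = 10^9 bytes).
    static double toGigabytes(long bytes) {
        return bytes / 1e9;
    }

    public static void main(String[] args) {
        int bytesPerReference = 8;            // assumption taken from the comment
        long roundFigure = 1_000_000_000L;    // the "1B+ edges" figure
        long perWorker = 1_200_000_000L;      // the "about 1.2B edges per worker" figure

        System.out.println(toGigabytes(referenceBytes(roundFigure, bytesPerReference)) + " GB");
        System.out.println(toGigabytes(referenceBytes(perWorker, bytesPerReference)) + " GB");
    }
}
```

With 8-byte references, 1B edges cost 8,000,000,000 bytes and 1.2B edges cost 9,600,000,000 bytes, which matches the "closer to 8GB" and "call it 10GB" figures above.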
                
> Edge should be an interface
> ---------------------------
>
>                 Key: GIRAPH-494
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-494
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>         Attachments: GIRAPH-494.patch
>
>
> In terms of architecture and for flexibility I think our Edge class should be an interface
> instead of a real class. In this diff I change it to an interface and add a sub-interface
> called MutableEdge. The existing Edge class is now called DefaultEdge. Note that only one
> class in our codebase actually needs a MutableEdge: RepresentativeVertex. Everything else
> works perfectly fine using the immutable Edge interface.
> One nice thing this allowed me to do is to create an EdgeNoValue, which we can use for
> algorithms whose edges have no value at all. Currently the same functionality is achieved
> by using NullWritable; however, using EdgeNoValue means not storing a reference to the single
> NullWritable instance in every single edge. Working on a job that reads 1B+ edges per worker,
> a pointer per edge adds up.
> https://reviews.apache.org/r/9172/
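
The design described in the issue can be illustrated with a minimal, simplified sketch. This is not the code from the attached patch: the method names, the generic signatures, and the NoValue enum (a stand-in for Hadoop's NullWritable singleton, used to avoid a Hadoop dependency) are all assumptions made for illustration, and the real Giraph types work with Writable ids and values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and signatures are assumptions, not Giraph's API.
class EdgeSketch {

    // Hypothetical stand-in for the shared NullWritable singleton.
    enum NoValue { INSTANCE }

    // Read-only view of an edge: most code only needs this.
    interface Edge<I, E> {
        I getTargetVertexId();
        E getValue();
    }

    // Sub-interface exposed only where changes are desired.
    interface MutableEdge<I, E> extends Edge<I, E> {
        void setValue(E value);
    }

    // General-purpose implementation: holds a reference to its value.
    static final class DefaultEdge<I, E> implements MutableEdge<I, E> {
        private final I targetVertexId;
        private E value;

        DefaultEdge(I targetVertexId, E value) {
            this.targetVertexId = targetVertexId;
            this.value = value;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        @Override public E getValue() { return value; }
        @Override public void setValue(E value) { this.value = value; }
    }

    // Edge with no value: there is no value field, so no per-edge reference
    // to the shared singleton is stored.
    static final class EdgeNoValue<I> implements Edge<I, NoValue> {
        private final I targetVertexId;

        EdgeNoValue(I targetVertexId) {
            this.targetVertexId = targetVertexId;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        @Override public NoValue getValue() { return NoValue.INSTANCE; }
    }

    // Example consumer that only reads edges, so it accepts the Edge interface.
    static double totalWeight(Iterable<? extends Edge<Long, Double>> edges) {
        double sum = 0.0;
        for (Edge<Long, Double> edge : edges) {
            sum += edge.getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<DefaultEdge<Long, Double>> edges = new ArrayList<>();
        edges.add(new DefaultEdge<>(2L, 0.5));
        edges.add(new DefaultEdge<>(3L, 1.5));
        System.out.println(totalWeight(edges));

        Edge<Long, NoValue> unweighted = new EdgeNoValue<>(4L);
        System.out.println(unweighted.getTargetVertexId());
    }
}
```

In this sketch, code that only reads edges depends on Edge, while code that must update values asks for MutableEdge explicitly, which is the separation the issue argues for.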

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
