From: "Nitay Joffe (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Date: Thu, 31 Jan 2013 20:07:14 +0000 (UTC)
Subject: [jira] [Commented] (GIRAPH-494) Edge should be an interface
Mailing-List: contact dev-help@giraph.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/GIRAPH-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568052#comment-13568052 ]

Nitay Joffe commented on GIRAPH-494:
------------------------------------

1) Agreed it's memory only, but it's actually closer to 8GB. We're on a 64-bit machine, pointers at 8 bytes each. I don't think you could get a 32-bit JVM to load 1B+ edges. I have about 1.2B edges per worker so let's call it 10GB total. To me that does not seem like peanuts in terms of active memory used.
2) I would argue that having Edge / MutableEdge as interfaces is the right way to go in terms of object-oriented design. This change does not make it impossible to modify edges; we just have to expose MutableEdge where changes are desired. If the algorithm knows it is using MutableEdge, then it stores those and can use them as such. We already have gotchas in the codebase, like RepresentativeVertex, where the user needs to know that they shouldn't change Vertex/Edge objects retrieved. If anything, I think having clear-cut interfaces like this does exactly the opposite - it makes it explicitly clear what the API is and allows us to control it, rather than exposing big Java objects with lots of public methods.

> Edge should be an interface
> ---------------------------
>
>                 Key: GIRAPH-494
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-494
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>         Attachments: GIRAPH-494.patch
>
>
> In terms of architecture and for flexibility I think our Edge class should be an interface instead of a real class. In this diff I change it to an interface and add a sub-interface called MutableEdge. The existing Edge class is now called DefaultEdge. Note that only one class in our codebase actually needs a MutableEdge - RepresentativeVertex. Everything else works perfectly fine using the immutable Edge interface.
> One nice thing this allowed me to do is to create an EdgeNoValue which we can use for algorithms whose edges have no value at all. Currently the same functionality is achieved by using NullWritable, however using EdgeNoValue means not storing a reference to the single NullWritable instance in every single edge. Working on a job that reads 1B+ edges per worker, a pointer per edge adds up.
> https://reviews.apache.org/r/9172/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
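
For readers of the archive, the design described in the comment and issue description above can be sketched roughly as follows. This is a minimal illustration, not the contents of GIRAPH-494.patch: the type names Edge, MutableEdge, DefaultEdge and EdgeNoValue come from the issue text, but the method names (getTargetVertexId, getValue, setValue), the generic parameters, the NoValue stand-in for Hadoop's NullWritable, the EdgeSketch wrapper class and the referenceOverheadGb helper are all assumptions made so the example compiles on its own.

```java
// Sketch only: every signature here is an assumption, not the actual patch.
final class EdgeSketch {

    // Hypothetical stand-in for Hadoop's NullWritable singleton, so the
    // sketch has no Hadoop dependency.
    enum NoValue { INSTANCE }

    // Read-only view of an edge; most code would depend only on this.
    interface Edge<I, E> {
        I getTargetVertexId();
        E getValue();
    }

    // Exposed only where an algorithm really needs to change edge values.
    interface MutableEdge<I, E> extends Edge<I, E> {
        void setValue(E value);
    }

    // General-purpose edge that stores a reference to its value.
    static final class DefaultEdge<I, E> implements MutableEdge<I, E> {
        private final I targetVertexId;
        private E value;

        DefaultEdge(I targetVertexId, E value) {
            this.targetVertexId = targetVertexId;
            this.value = value;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        @Override public E getValue() { return value; }
        @Override public void setValue(E value) { this.value = value; }
    }

    // Edge without a value field: the shared singleton is returned on
    // demand, so no per-edge reference to it is stored.
    static final class EdgeNoValue<I> implements Edge<I, NoValue> {
        private final I targetVertexId;

        EdgeNoValue(I targetVertexId) {
            this.targetVertexId = targetVertexId;
        }

        @Override public I getTargetVertexId() { return targetVertexId; }
        @Override public NoValue getValue() { return NoValue.INSTANCE; }
    }

    // Rough cost, in decimal GB, of one extra reference per edge.
    static double referenceOverheadGb(long edges, int bytesPerReference) {
        return edges * (double) bytesPerReference / 1e9;
    }

    public static void main(String[] args) {
        MutableEdge<Long, Double> weighted = new DefaultEdge<>(7L, 0.5);
        weighted.setValue(1.5);                // allowed: mutable view
        Edge<Long, Double> readOnly = weighted; // callers see no setter
        Edge<Long, NoValue> bare = new EdgeNoValue<>(42L);

        System.out.println(readOnly.getValue());
        System.out.println(bare.getTargetVertexId());
        // 1.2B edges at 8 bytes per reference, the figures from the comment
        System.out.println(referenceOverheadGb(1_200_000_000L, 8));
    }
}
```

The last line reproduces the arithmetic behind the "call it 10GB" estimate: 1.2B edges times an 8-byte reference is about 9.6 GB per worker, assuming uncompressed 64-bit references as stated in the comment.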