Return-Path: Delivered-To: apmail-incubator-hama-dev-archive@minotaur.apache.org Received: (qmail 53362 invoked from network); 1 Apr 2009 08:34:18 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 1 Apr 2009 08:34:18 -0000 Received: (qmail 99580 invoked by uid 500); 1 Apr 2009 08:34:18 -0000 Delivered-To: apmail-incubator-hama-dev-archive@incubator.apache.org Received: (qmail 99558 invoked by uid 500); 1 Apr 2009 08:34:18 -0000 Mailing-List: contact hama-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hama-dev@incubator.apache.org Delivered-To: mailing list hama-dev@incubator.apache.org Received: (qmail 99548 invoked by uid 99); 1 Apr 2009 08:34:18 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Apr 2009 08:34:18 +0000 X-ASF-Spam-Status: No, hits=1.2 required=10.0 tests=SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (athena.apache.org: local policy) Received: from [209.85.198.244] (HELO rv-out-0708.google.com) (209.85.198.244) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 01 Apr 2009 08:34:09 +0000 Received: by rv-out-0708.google.com with SMTP id b17so2859484rvf.0 for ; Wed, 01 Apr 2009 01:33:47 -0700 (PDT) MIME-Version: 1.0 Sender: edward@udanax.org Received: by 10.141.196.8 with SMTP id y8mr3927184rvp.101.1238574827410; Wed, 01 Apr 2009 01:33:47 -0700 (PDT) In-Reply-To: <35a22e220904010047l236f3936je794685d2d51ccc@mail.gmail.com> References: <35a22e220903310109t5696a5d8v78bfa38640b82207@mail.gmail.com> <35a22e220903310204l32203734x6f8280d95d2e26c1@mail.gmail.com> <35a22e220904010047l236f3936je794685d2d51ccc@mail.gmail.com> Date: Wed, 1 Apr 2009 17:33:47 +0900 X-Google-Sender-Auth: c4c364bf33d92392 Message-ID: Subject: Re: Schema to store graph From: "Edward J. Yoon" To: hama-dev@incubator.apache.org Cc: hbase-user@hadoop.apache.org, mahout-dev@lucene.apache.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Let's assume the graph looks like presented below: 1 - 2 - 3 | / 4 We can now represent as: | 1 2 3 4 --+---------- 1 | 0 1 0 1 2 | 1 0 1 1 3 | 0 1 0 0 4 | 1 0 1 0 We don't need to store the zeros, Hbase is ideal in storing sparse matrices. So, It can be simply implemented using Hbase APIs as describe below. public class Graph { ... public void addEdge(String v, String w) { BatchUpdate update = new BatchUpdate(v); update.put(w, 1); table.commit(update); update = new BatchUpdate(w); update.put(v, 1); table.commit(update); } ... public static void main(String[] args) { Graph graph = new Graph(); graph.addEdge("1", "2"); graph.addEdge("1", "4"); graph.addEdge("2", "3"); graph.addEdge("2", "4"); graph.addEdge("4", "1"); graph.addEdge("4", "3"); } } On Wed, Apr 1, 2009 at 4:47 PM, Amandeep Khurana wrote: > Right. > > Edward, I didnt understand what you were trying to say with this: > > Anyway, I guess If you store the graph like that, you'll only need update > the row 'v/w' to add v to w's/w to v's list of neighbors. > > Can you explain it please? > > Thanks > Amandeep > > Amandeep Khurana > Computer Science Graduate Student > University of California, Santa Cruz > > > On Tue, Mar 31, 2009 at 8:02 PM, Edward J. Yoon wrote: > >> One thing is Hbase 0.19 doesn't work with over 5,000 qualifier of one >> column so I couldn't test/benchmark for large scale. >> >> On Tue, Mar 31, 2009 at 6:04 PM, Amandeep Khurana >> wrote: >> > Response below >> > >> > >> > Amandeep Khurana >> > Computer Science Graduate Student >> > University of California, Santa Cruz >> > >> > >> > On Tue, Mar 31, 2009 at 1:58 AM, Edward J. Yoon > >wrote: >> > >> >> Hama store the sparse graph using Hbase as an sparse adjacency matrix. >> >> One of reason is to perform matrix decomposition for large sparse >> >> graphs. Anyway, I guess If you store the graph like that, you'll only >> >> need update the row 'v/w' to add v to w's/w to v's list of neighbors. >> > >> > >> > I didnt quite understand the last line here. >> > >> > I did think of a sparse matrix as well but not sure which is a better >> > approach. Thats why I posted here... >> > >> > Share about your experiences with Hama... >> > >> >> >> >> >> >> Just FYI, You also may want to see -- >> >> http://blog.udanax.org/2009/02/breadth-first-search-mapreduce.html >> >> >> >> If you have any advice for us, Pls let us know. >> >> >> >> On Tue, Mar 31, 2009 at 5:09 PM, Amandeep Khurana >> >> wrote: >> >> > What would be a good schema in HBase to store information pertaining >> to a >> >> > many to many graph? I was thinking of having the node id as the row >> key, >> >> the >> >> > type of relation as the column family, the relation name for the >> column >> >> > identifier and the actual cell containing the key of the node that is >> >> being >> >> > connected with. >> >> > >> >> > >> >> > Amandeep Khurana >> >> > Computer Science Graduate Student >> >> > University of California, Santa Cruz >> >> > >> >> >> >> >> >> >> >> -- >> >> Best Regards, Edward J. Yoon >> >> edwardyoon@apache.org >> >> http://blog.udanax.org >> >> >> > >> >> >> >> -- >> Best Regards, Edward J. Yoon >> edwardyoon@apache.org >> http://blog.udanax.org >> > -- Best Regards, Edward J. Yoon edwardyoon@apache.org http://blog.udanax.org