incubator-hama-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Schema to store graph
Date Wed, 01 Apr 2009 08:33:47 GMT
Let's assume the graph looks like presented below:

1 - 2 - 3
 | /
4

We can now represent as:

  | 1  2  3  4
--+----------
1 | 0  1  0  1
2 | 1  0  1  1
3 | 0  1  0  0
4 | 1  0  1  0

We don't need to store the zeros, Hbase is ideal in storing sparse
matrices. So, It can be simply implemented using Hbase APIs as
describe below.

public class Graph {
  ...

  public void addEdge(String v, String w) {
    BatchUpdate update = new BatchUpdate(v);
    update.put(w, 1);
    table.commit(update);

    update = new BatchUpdate(w);
    update.put(v, 1);
    table.commit(update);
  }

  ...
  public static void main(String[] args) {
    Graph graph = new Graph();
    graph.addEdge("1", "2");

    graph.addEdge("1", "4");

    graph.addEdge("2", "3");

    graph.addEdge("2", "4");

    graph.addEdge("4", "1");

    graph.addEdge("4", "3");
  }
}

On Wed, Apr 1, 2009 at 4:47 PM, Amandeep Khurana <amansk@gmail.com> wrote:
> Right.
>
> Edward, I didnt understand what you were trying to say with this:
>
> Anyway, I guess If you store the graph like that, you'll only need update
> the row 'v/w' to add v to w's/w to v's list of neighbors.
>
> Can you explain it please?
>
> Thanks
> Amandeep
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Tue, Mar 31, 2009 at 8:02 PM, Edward J. Yoon <edwardyoon@apache.org>wrote:
>
>> One thing is Hbase 0.19 doesn't work with over 5,000 qualifier of one
>> column so I couldn't test/benchmark for large scale.
>>
>> On Tue, Mar 31, 2009 at 6:04 PM, Amandeep Khurana <amansk@gmail.com>
>> wrote:
>> > Response below
>> >
>> >
>> > Amandeep Khurana
>> > Computer Science Graduate Student
>> > University of California, Santa Cruz
>> >
>> >
>> > On Tue, Mar 31, 2009 at 1:58 AM, Edward J. Yoon <edwardyoon@apache.org
>> >wrote:
>> >
>> >> Hama store the sparse graph using Hbase as an sparse adjacency matrix.
>> >> One of reason is to perform matrix decomposition for large sparse
>> >> graphs. Anyway, I guess If you store the graph like that, you'll only
>> >> need update the row 'v/w' to add v to w's/w to v's list of neighbors.
>> >
>> >
>> > I didnt quite understand the last line here.
>> >
>> > I did think of a sparse matrix as well but not sure which is a better
>> > approach. Thats why I posted here...
>> >
>> > Share about your experiences with Hama...
>> >
>> >>
>> >>
>> >> Just FYI, You also may want to see --
>> >> http://blog.udanax.org/2009/02/breadth-first-search-mapreduce.html
>> >>
>> >> If you have any advice for us, Pls let us know.
>> >>
>> >> On Tue, Mar 31, 2009 at 5:09 PM, Amandeep Khurana <amansk@gmail.com>
>> >> wrote:
>> >> > What would be a good schema in HBase to store information pertaining
>> to a
>> >> > many to many graph? I was thinking of having the node id as the row
>> key,
>> >> the
>> >> > type of relation as the column family, the relation name for the
>> column
>> >> > identifier and the actual cell containing the key of the node that
is
>> >> being
>> >> > connected with.
>> >> >
>> >> >
>> >> > Amandeep Khurana
>> >> > Computer Science Graduate Student
>> >> > University of California, Santa Cruz
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> edwardyoon@apache.org
>> >> http://blog.udanax.org
>> >>
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> edwardyoon@apache.org
>> http://blog.udanax.org
>>
>



-- 
Best Regards, Edward J. Yoon
edwardyoon@apache.org
http://blog.udanax.org

Mime
View raw message