incubator-hama-dev mailing list archives

From "Edward J. Yoon" <>
Subject Re: [Hama Wiki] Update of "GraphPackage" by HyunsikChoi
Date Fri, 18 Sep 2009 10:19:27 GMT
As we discussed before, we are considering refactoring the packages as below:


So, I'd like to rearrange the architecture page and discuss it
before committing the changes.

This is my rough idea.

- Our Goal
- About BSP and Map/Reduce
- Matrix Computing Strategies
- Graph Computing Strategies
- Example in matrix and graph computation areas
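
For the "About BSP and Map/Reduce" item, a minimal sketch of the superstep idea may help the discussion. This is not Hama code: it is a single-process Python simulation, and the function name `bsp_max_propagation`, the adjacency-dict graph format, and the max-value example are all assumptions chosen for illustration. Each superstep does local computation on received messages, sends new messages, then hits a barrier before the next superstep starts.

```python
from collections import defaultdict


def bsp_max_propagation(graph, values):
    """Simulate BSP supersteps in one process (illustrative only).

    graph:  dict mapping every vertex to a list of its neighbours
    values: dict mapping every vertex to a comparable value
    Returns the final values and the number of supersteps executed.
    """
    values = dict(values)  # do not mutate the caller's dict

    # Superstep 0: every vertex sends its own value to its neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[vertex])
    supersteps = 1

    # Keep running supersteps until no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        # Local computation: a vertex looks only at its own value
        # and at the messages delivered to it.
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                # Communication: notify neighbours of the change.
                for n in graph[vertex]:
                    outbox[n].append(best)
        # Barrier: messages sent in this superstep only become
        # visible in the next one.
        inbox = outbox
        supersteps += 1

    return values, supersteps
```

The per-vertex state stays in place between supersteps and only messages move. An M/R formulation of the same loop would typically run one job per iteration and pass the whole graph state through each job, which is the overhead mentioned in the quoted message below.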

What do you think?

On Fri, Sep 18, 2009 at 6:51 PM, Edward J. Yoon <> wrote:
> In a distributed system, matrix and graph computations both need
> a lot of communication between nodes. IMO, there is no way to
> avoid it. Of course, they could be performed by M/R iterations, but that
> seems very slow and carries an overhead cost. I think that's why we'd
> like to survey and consider the BSP (bulk synchronous parallel) model.
> - We need to explain the theory behind BSP and how to apply it to our project.
> And, regarding matrix and graph, they are closely connected. I expect
> synergy between the two. However, I think we should clarify the
> relationship between matrix and graph, and our main goal.
> Any advice is welcome.
> On Fri, Sep 18, 2009 at 6:30 PM, Edward J. Yoon <> wrote:
>> First, we need to share our plans and consider the overall architecture.
>> What's BSP? What's the relationship between matrix and graph?
>> What's the plan for the matrix and graph packages? What's our main
>> goal?
>> On Fri, Sep 18, 2009 at 5:52 PM, Apache Wiki <> wrote:
>>> Dear Wiki user,
>>> You have subscribed to a wiki page or wiki category on "Hama Wiki" for change
>>> The following page has been changed by HyunsikChoi:
>>> New page:
>>> = The Graph Package (Angrapa) =
>>> The graph package, called Angrapa, is a large-scale graph data management framework for analytical processing. It is still an ongoing project. It will employ massive parallelism on Hadoop. It aims to scale to terabytes or petabytes of graph data. Angrapa will be used in a variety of scientific and industrial areas that need to process large-scale graph data, such as data mining, machine learning, information retrieval, bioinformatics, and social networks.
>>> = Description =
>>> The graph package is a new programming framework for graph processing.
>>> = The Main Goal =
>>>  * Easy APIs that map naturally to graph features
>>>  * A storage structure suited to graph data, taking the connectivity of graph data into account
>>>  * Applying a data communication method (i.e., BSP) without degrading graph data locality
>> --
>> Best Regards, Edward J. Yoon @ NHN, corp.
> --
> Best Regards, Edward J. Yoon @ NHN, corp.

Best Regards, Edward J. Yoon @ NHN, corp.
