incubator-general mailing list archives

From Mohammad Nour El-Din <nour.moham...@gmail.com>
Subject Re: [VOTE] Giraph to join the incubator
Date Sun, 24 Jul 2011 15:27:08 GMT
+1 (Binding)

On Sat, Jul 23, 2011 at 4:57 AM, Ashish <paliwalashish@gmail.com> wrote:
> +1
>
> On Sat, Jul 23, 2011 at 2:00 AM, Avery Ching <aching@yahoo-inc.com> wrote:
>
>> Hi and good Friday to you all,
>>
>> It's been a week since we submitted our proposal for Giraph's inclusion
>> into the Apache incubator and the discussion around the proposal seems to
>> have settled.  Thank you for all the comments/questions/general interest and
>> for those who volunteered to be committers.  At this time, I'd like to ask
>> for a vote.
>>
>> The latest proposal can be found at the end of this email and in the
>> following wiki:
>>
>> http://wiki.apache.org/incubator/GiraphProposal
>>
>> The discussion regarding
>> the proposal can be found below:
>>
>> http://www.mail-archive.com/general@incubator.apache.org/msg29957.html
>>
>> Please cast your votes:
>>
>> [  ] +1 Accept Giraph for incubation
>> [  ] +0 Indifferent to Giraph incubation
>> [  ] -1 Reject Giraph for incubation
>>
>> This vote will close 72 hours from now.
>>
>> Thanks!
>>
>> Avery
>>
>>
>> = Giraph : Large-scale graph processing on Hadoop =
>>
>> == Abstract ==
>>
>> Giraph is a large-scale, fault-tolerant, Bulk Synchronous Parallel
>> (BSP)-based graph processing framework.
>>
>> == Proposal ==
>>
>> Graph processing platforms to run large-scale algorithms (such as page
>> rank, shared connections, personalization-based popularity, etc.) have
>> become quite popular.  Some recent examples include Pregel and HaLoop.  For
>> general-purpose big data computation, the MapReduce computation model is
>> widely adopted and the most deployed MapReduce infrastructure is Apache
>> Hadoop.  We have implemented a graph-processing framework that is launched
>> as a typical Hadoop MapReduce job to leverage existing Hadoop
>> infrastructure, such as Amazon’s EC2.  Giraph builds upon the graph-oriented
>> nature of Pregel but additionally adds fault-tolerance to the coordinator
>> process with the use of ZooKeeper as its centralized coordination service.
>>  Additionally, Giraph will include a library of generic graph algorithms.
>>
>> == Background ==
>>
>> Giraph began development as a side project at Yahoo! at the
>> end of 2010.  It was made functional in a month, and various features were
>> then added.  Development has been focused on internal customers' needs
>> until this point.
>>
>> == Rationale ==
>>
>> Web and online social graphs have been rapidly growing in size and scale
>> during the past decade.  In 2008, Google estimated that the number of web
>> pages reached over a trillion.  Online social networking and email sites,
>> including Yahoo!, Google, Microsoft, Facebook, LinkedIn, and Twitter, have
>> hundreds of millions of users and are expected to grow much more in the
>> future.  Processing these graphs plays a big role in relevant and
>> personalized information for users, such as results from a search engine or
>> news in an online social networking site.
>>
>> == Initial Goals ==
>>
>> At this point, most of the functionality has been implemented and we are
>> looking to get more adoption and contributions from users outside Yahoo!.
>> We want to ensure that performance scales and that the code is robust and
>> fault tolerant.
>>
>> == Current Status ==
>>
>> === Meritocracy ===
>>
>> Giraph was initially developed by Avery Ching and Christian Kunz beginning
>> in December 2010 at Yahoo!.  There are other developers using Giraph at
>> Yahoo! that are making suggestions and adding code.  We are reaching out to
>> other folks at social networking companies for additional usage and
>> development.
>>
>> === Community ===
>>
>> Several groups who are interested in either joining our project or using
>> our code have contacted us.  We certainly believe that there is a lot of
>> interest and are actively looking to improve and expand the community.
>>
>> === Core Developers ===
>>
>>  * Avery Ching: Wrote a majority of the code
>>  * Christian Kunz: Wrote most of the communication code and security
>> integration with Hadoop
>>
>> === Alignment ===
>>
>> Giraph uses several Apache projects as its underlying infrastructure
>> (Hadoop and ZooKeeper).   It also builds on Apache Maven.
>>
>> == Known Risks ==
>>
>> === Orphaned products ===
>>
>> There are many social networking companies that would be interested in
>> using this graph-processing framework and we have already received interest
>> from some of them.  Yahoo! is already using this code in production and will
>> certainly continue to use it in the future as well.
>>
>> === Inexperience with Open Source ===
>>
>> While the initial developers have limited experience contributing to
>> open-source projects, Yahoo! as a company has a strong commitment to
>> open-source and we have several advisors that we can ask for help.
>>
>> === Homogenous Developers ===
>>
>> At this time, the project is relatively young and the developers work at
>> only two companies (Yahoo! and Jybe).  However, given the interest we have
>> seen in the project, we expect the diversity to improve in the near future.
>>
>> === Reliance on Salaried Developers ===
>>
>> Currently Giraph is being developed by a combination of salaried and
>> volunteer time.  We expect that other corporations will take an interest in
>> this project and likely contribute with salaried developers.  Some
>> individuals will likely spend volunteer time on it as well.  It is still
>> early in the project and we are hoping for a lot of growth.
>>
>> === Relationships with Other Apache Products ===
>>
>> Giraph depends on many Apache projects: Hadoop, ZooKeeper, Log4j, Commons,
>> etc.  It is built using Apache Maven.
>>
>> Giraph has some overlapping functionality with Apache Hama.  However, there
>> are some significant differences.  Giraph focuses on graph-based bulk
>> synchronous parallel (BSP) computing, while Apache Hama is more for
>> general-purpose BSP computing.  Giraph runs on the Hadoop infrastructure, while
>> Apache Hama uses its own computing framework.
>>
>> === An Excessive Fascination with the Apache Brand ===
>>
>> The Apache brand is likely to help us find contributors; however, our
>> interest in Apache is primarily because the other projects that we depend
>> on are also Apache projects and it makes sense that all this software be
>> available from the same place.
>>
>> === Documentation ===
>>
>> Currently we have little documentation, but several examples.  We are
>> working on improving this situation.
>>
>> === Initial Source ===
>>
>> The initial source of the code is from Yahoo! and began development in
>> December 2010.  It is already available on GitHub at
>> https://github.com/aching/Giraph.
>>
>> === Source and Intellectual Property Submission Plan ===
>>
>> We intend the entire code base to be licensed under the Apache License,
>> Version 2.0.
>>
>> === External Dependencies ===
>>
>> The required dependencies all have Apache-compatible licenses.  The
>> components with non-Apache licenses are:
>>  * JSON – Public Domain
>>
>> === Cryptography ===
>>
>> Giraph depends on secure Hadoop that can optionally use Kerberos.
>>
>> == Required Resources ==
>>
>> === Mailing lists ===
>>
>>  * giraph-private (with moderated subscriptions)
>>  * giraph-dev
>>  * giraph-commits
>>  * giraph-users
>>
>> === Subversion Directory ===
>>
>> https://svn.apache.org/repos/asf/incubator/giraph
>>
>> === Issue Tracking ===
>>
>> JIRA Giraph (GIRAPH)
>>
>> === Other Resources ===
>>
>> Giraph has integration tests that can be run with the LocalJobRunner.
>>  These same tests are also designed to be run on a small (even single node)
>> Hadoop cluster.  While not required at this time, it would be nice if such a
>> resource were available.
>>
>> === Initial Committers ===
>>
>>  * Avery Ching, aching at yahoo-inc dot com
>>  * Christian Kunz, christian at jybe-inc dot com
>>  * Owen O’Malley, owen at hortonworks dot com
>>  * Phillip Rhodes, prhodes at apache dot org
>>  * Hyunsik Choi, hyunsik at apache dot org
>>  * Jakob Homan, jghoman at apache dot org
>>  * Arun Suresh, asuresh at yahoo-inc dot com
>>
>> === Affiliations ===
>>
>>  * Avery Ching, Yahoo!
>>  * Christian Kunz, Jybe
>>  * Owen O'Malley, Hortonworks
>>  * Phillip Rhodes, Fogbeam Labs
>>  * Hyunsik Choi, Database Lab, Korea University
>>  * Jakob Homan, LinkedIn
>>  * Arun Suresh, Yahoo!
>>
>> == Sponsors ==
>>
>> === Champion ===
>>
>> Owen O’Malley
>>
>> === Nominated Mentors ===
>>
>> Owen O’Malley
>>
>> === Sponsoring Entity ===
>>
>> Apache Incubator PMC
>>
>>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>



-- 
Thanks
- Mohammad Nour
  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
  http://www.redbooks.ibm.com/abstracts/sg247585.html
- LinkedIn: http://www.linkedin.com/in/mnour
- Blog: http://tadabborat.blogspot.com
----
"Life is like riding a bicycle. To keep your balance you must keep moving"
- Albert Einstein

"Writing clean code is what you must do in order to call yourself a
professional. There is no reasonable excuse for doing anything less
than your best."
- Clean Code: A Handbook of Agile Software Craftsmanship

"Stay hungry, stay foolish."
- Steve Jobs

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org

