incubator-general mailing list archives

From Alex Karasulu <akaras...@gmail.com>
Subject Re: [VOTE] Giraph to join the incubator
Date Mon, 25 Jul 2011 08:54:37 GMT
+1

On Jul 24, 2011, at 10:38 PM, Bill Graham <billgraham@gmail.com> wrote:

> +1
>
> On Sun, Jul 24, 2011 at 8:27 AM, Mohammad Nour El-Din <nour.mohammad@gmail.com> wrote:
>
>> +1 (Binding)
>>
>> On Sat, Jul 23, 2011 at 4:57 AM, Ashish <paliwalashish@gmail.com> wrote:
>>> +1
>>>
>>> On Sat, Jul 23, 2011 at 2:00 AM, Avery Ching <aching@yahoo-inc.com> wrote:
>>>
>>>> Hi and good Friday to you all,
>>>>
>>>> It's been a week since we submitted our proposal for Giraph's inclusion
>>>> into the Apache Incubator and the discussion around the proposal seems to
>>>> have settled.  Thank you for all the comments/questions/general interest
>>>> and to those who volunteered to be committers.  At this time, I'd like to
>>>> ask for a vote.
>>>>
>>>> The latest proposal can be found at the end of this email and in the
>>>> following wiki:
>>>>
>>>> http://wiki.apache.org/incubator/GiraphProposal
>>>>
>>>> The discussion regarding the proposal can be found below:
>>>>
>>>> http://www.mail-archive.com/general@incubator.apache.org/msg29957.html
>>>>
>>>> Please cast your votes:
>>>>
>>>> [  ] +1 Accept Giraph for incubation
>>>> [  ] +0 Indifferent to Giraph incubation
>>>> [  ] -1 Reject Giraph for incubation
>>>>
>>>> This vote will close 72 hours from now.
>>>>
>>>> Thanks!
>>>>
>>>> Avery
>>>>
>>>>
>>>> = Giraph : Large-scale graph processing on Hadoop =
>>>>
>>>> == Abstract ==
>>>>
>>>> Giraph is a large-scale, fault-tolerant, Bulk Synchronous Parallel
>>>> (BSP)-based graph processing framework.
>>>>
>>>> == Proposal ==
>>>>
>>>> Graph processing platforms to run large-scale algorithms (such as
>>>> PageRank, shared connections, personalization-based popularity, etc.) have
>>>> become quite popular.  Some recent examples include Pregel and HaLoop.  For
>>>> general-purpose big data computation, the MapReduce computation model is
>>>> widely adopted and the most deployed MapReduce infrastructure is Apache
>>>> Hadoop.  We have implemented a graph-processing framework that is launched
>>>> as a typical Hadoop MapReduce job to leverage existing Hadoop
>>>> infrastructure, such as Amazon’s EC2.  Giraph builds upon the graph-oriented
>>>> nature of Pregel but adds fault tolerance to the coordinator process by
>>>> using ZooKeeper as its centralized coordination service.  Giraph will also
>>>> include a library of generic graph algorithms.
>>>>
>>>> == Background ==
>>>>
>>>> Giraph began development as a side project at Yahoo! at the end of 2010.
>>>> It was made functional in a month, and various features were then added.
>>>> Development has been focused on internal customers' needs until this
>>>> point.
>>>>
>>>> == Rationale ==
>>>>
>>>> Web and online social graphs have been rapidly growing in size and scale
>>>> during the past decade.  In 2008, Google estimated that the number of web
>>>> pages had reached over a trillion.  Online social networking and email
>>>> sites, including Yahoo!, Google, Microsoft, Facebook, LinkedIn, and
>>>> Twitter, have hundreds of millions of users and are expected to grow much
>>>> more in the future.  Processing these graphs plays a big role in delivering
>>>> relevant and personalized information to users, such as results from a
>>>> search engine or news in an online social networking site.
>>>>
>>>> == Initial Goals ==
>>>>
>>>> At this point, most of the functionality has been implemented and we are
>>>> looking to get more adoption and contributions from users outside Yahoo!.
>>>> We want to ensure that performance scales and that the code is robust and
>>>> fault tolerant.
>>>>
>>>> == Current Status ==
>>>>
>>>> === Meritocracy ===
>>>>
>>>> Giraph was initially developed by Avery Ching and Christian Kunz beginning
>>>> in December 2010 at Yahoo!.  There are other developers using Giraph at
>>>> Yahoo! that are making suggestions and adding code.  We are reaching out to
>>>> other folks at social networking companies for additional usage and
>>>> development.
>>>>
>>>> === Community ===
>>>>
>>>> Several groups who are interested in either joining our project or using
>>>> our code have contacted us.  We certainly believe that there is a lot of
>>>> interest and are actively looking to improve and expand the community.
>>>>
>>>> === Core Developers ===
>>>>
>>>> * Avery Ching: Wrote a majority of the code
>>>> * Christian Kunz: Wrote most of the communication code and security
>>>> integration with Hadoop
>>>>
>>>> === Alignment ===
>>>>
>>>> Giraph uses several Apache projects as its underlying infrastructure
>>>> (Hadoop and ZooKeeper).  It is also built with Apache Maven.
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned products ===
>>>>
>>>> There are many social networking companies that would be interested in
>>>> using this graph-processing framework and we have already received interest
>>>> from some of them.  Yahoo! is already using this code in production and will
>>>> certainly continue to use it in the future as well.
>>>>
>>>> === Inexperience with Open Source ===
>>>>
>>>> While the initial developers have limited experience in contributing to
>>>> open-source projects, Yahoo! as a company has a strong commitment to
>>>> open source and we have several advisors that we can ask for help.
>>>>
>>>> === Homogeneous Developers ===
>>>>
>>>> At this time, the project is relatively young and the developers work at
>>>> only two companies (Yahoo! and Jybe).  However, given the interest we have
>>>> seen in the project, we expect the diversity to improve in the near future.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>>
>>>> Currently Giraph is being developed through a combination of salaried and
>>>> volunteer time.  We expect that other corporations will take an interest in
>>>> this project and likely contribute with salaried developers.  Some
>>>> individuals will likely spend volunteer time on it as well.  It is still
>>>> early in the project and we are hoping for a lot of growth.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>>
>>>> Giraph depends on many Apache projects: Hadoop, ZooKeeper, Log4j, Commons,
>>>> etc.  It is built using Apache Maven.
>>>>
>>>> Giraph has some overlapping functionality with Apache Hama.  However, there
>>>> are some significant differences.  Giraph focuses on graph-based bulk
>>>> synchronous parallel (BSP) computing, while Apache Hama targets
>>>> general-purpose BSP computing.  Giraph runs on the Hadoop infrastructure,
>>>> while Apache Hama uses its own computing framework.
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>>
>>>> The Apache brand is likely to help us find contributors.  However, our
>>>> interest in Apache is primarily because the other projects that we depend
>>>> on are also Apache projects, and it makes sense that all this software be
>>>> available from the same place.
>>>>
>>>> === Documentation ===
>>>>
>>>> Currently we have little documentation, but several examples.  We are
>>>> working on improving this situation.
>>>>
>>>> === Initial Source ===
>>>>
>>>> The initial source code comes from Yahoo!, where development began in
>>>> December 2010.  It is already available on GitHub at
>>>> https://github.com/aching/Giraph.
>>>>
>>>> === Source and Intellectual Property Submission Plan ===
>>>>
>>>> We intend the entire code base to be licensed under the Apache License,
>>>> Version 2.0.
>>>>
>>>> === External Dependencies ===
>>>>
>>>> All required dependencies have Apache-compatible licenses.  The
>>>> components with non-Apache licenses are:
>>>> * JSON – Public Domain
>>>>
>>>> === Cryptography ===
>>>>
>>>> Giraph depends on secure Hadoop that can optionally use Kerberos.
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Mailing lists ===
>>>>
>>>> * giraph-private (with moderated subscriptions)
>>>> * giraph-dev
>>>> * giraph-commits
>>>> * giraph-users
>>>>
>>>> === Subversion Directory ===
>>>>
>>>> https://svn.apache.org/repos/asf/incubator/giraph
>>>>
>>>> === Issue Tracking ===
>>>>
>>>> JIRA Giraph (GIRAPH)
>>>>
>>>> === Other Resources ===
>>>>
>>>> Giraph has integration tests that can be run with the LocalJobRunner.
>>>> These same tests are also designed to be run on a small (even single-node)
>>>> Hadoop cluster.  While not required at this time, it would be nice if such
>>>> a resource were available.
>>>>
>>>> === Initial Committers ===
>>>>
>>>> * Avery Ching, aching at yahoo-inc dot com
>>>> * Christian Kunz, christian at jybe-inc dot com
>>>> * Owen O’Malley, owen at hortonworks dot com
>>>> * Phillip Rhodes, prhodes at apache dot org
>>>> * Hyunsik Choi, hyunsik at apache dot org
>>>> * Jakob Homan, jghoman at apache dot org
>>>> * Arun Suresh, asuresh at yahoo-inc dot com
>>>>
>>>> === Affiliations ===
>>>>
>>>> * Avery Ching, Yahoo!
>>>> * Christian Kunz, Jybe
>>>> * Owen O'Malley, Hortonworks
>>>> * Phillip Rhodes, Fogbeam Labs
>>>> * Hyunsik Choi, Database Lab, Korea University
>>>> * Jakob Homan, LinkedIn
>>>> * Arun Suresh, Yahoo!
>>>>
>>>> == Sponsors ==
>>>>
>>>> === Champion ===
>>>>
>>>> Owen O’Malley
>>>>
>>>> === Nominated Mentors ===
>>>>
>>>> Owen O’Malley
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> Apache Incubator PMC
>>>>
>>>>
>>>
>>>
>>> --
>>> thanks
>>> ashish
>>>
>>> Blog: http://www.ashishpaliwal.com/blog
>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>
>>
>>
>>
>> --
>> Thanks
>> - Mohammad Nour
>>  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
>>  http://www.redbooks.ibm.com/abstracts/sg247585.html
>> - LinkedIn: http://www.linkedin.com/in/mnour
>> - Blog: http://tadabborat.blogspot.com
>> ----
>> "Life is like riding a bicycle. To keep your balance you must keep  
>> moving"
>> - Albert Einstein
>>
>> "Writing clean code is what you must do in order to call yourself a
>> professional. There is no reasonable excuse for doing anything less
>> than your best."
>> - Clean Code: A Handbook of Agile Software Craftsmanship
>>
>> "Stay hungry, stay foolish."
>> - Steve Jobs
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>>


