incubator-general mailing list archives

From Naresh Agarwal <naresh.agar...@inmobi.com>
Subject Re: [VOTE] accept SAMOA into incubator
Date Sun, 14 Dec 2014 11:10:23 GMT
+1 (non-binding)

Thanks
Naresh

On Fri, Dec 12, 2014 at 7:22 AM, John D. Ament <john.d.ament@gmail.com>
wrote:
>
> +1 binding
>
> On Thu Dec 11 2014 at 5:10:50 PM Konstantin Boudnik <cos@apache.org>
> wrote:
>
> > +1 (binding).
> >
> > A small comment: we don't do users@ lists for podlings, do we? If so,
> >  samoa-users@googlegroups --> users@samoa.incubator.apache.org
> > will need to be converged into dev@.
> >
> >
> Not all podlings use a users@, but they can if they like.  Usually if it's
> coming from an established community there will be one.
>
>
> > Cos
> >
> > On Thu, Dec 11, 2014 at 10:02AM, Daniel Dai wrote:
> > > Following the discussion earlier, I'm calling a vote to accept SAMOA as a
> > > new Incubator project.
> > >
> > > [ ] +1 Accept SAMOA into the Incubator
> > > [ ] +0 Indifferent to the acceptance of SAMOA
> > > [ ] -1 Do not accept SAMOA because ...
> > >
> > > The vote will be open for at least 72h and closes at the earliest on
> > > Dec 14 19:00 GMT.
> > >
> > > https://wiki.apache.org/incubator/SAMOAProposal
> > >
> > > Thanks,
> > > Daniel
> > >
> > > = SAMOA =
> > > == Abstract ==
> > > SAMOA is an open-source platform for mining big data streams.
> > >
> > > == Proposal ==
> > > SAMOA provides a collection of distributed streaming algorithms for the
> > > most common data mining and machine learning tasks such as classification,
> > > clustering, and regression, as well as programming abstractions to develop
> > > new algorithms that run on top of distributed stream processing engines
> > > (DSPEs). It features a pluggable architecture that allows it to run on
> > > several DSPEs such as Apache Storm, Apache S4, and Apache Samza.
> > >
> > > == Background ==
> > > Hadoop and its ecosystem have changed the way data are processed by
> > > allowing algorithms to be pushed to unprecedented scale. As an example,
> > > Mahout allows running data mining and machine learning algorithms on very
> > > large datasets. However, Hadoop and Mahout are not suited to handling
> > > streaming data. Simply put, the goal of SAMOA is to provide a streaming
> > > counterpart to Mahout.
> > >
> > > == Rationale ==
> > > SAMOA aims to fill the current gap in tools for mining large-scale streams.
> > > Many organizations can benefit from a scalable stream mining platform
> > > such as SAMOA.
> > >
> > > SAMOA is a natural fit for the Apache Software Foundation. It is licensed
> > > under the ASL v2.0. It already interoperates with several existing Apache
> > > projects such as Storm, S4, and Samza. Furthermore, it is complementary to
> > > existing Apache projects such as Mahout. The initial committers are
> > > familiar with the Apache process and subscribe to the Apache mission.
> > > Indeed, the team includes multiple Apache committers. Finally, joining
> > > Apache will help coordinate the development effort of the growing number
> > > of organizations that contribute to SAMOA.
> > >
> > > == Initial Goals ==
> > > * Move the existing codebase to Apache
> > > * Integrate with the Apache development process
> > > * Incremental development and releases per Apache guidelines
> > >
> > > == Current Status ==
> > > SAMOA started as a research project at Yahoo Labs in 2013 and was
> > > open-sourced in October the same year. It has been under development on
> > > Yahoo's public GitHub repository since being open-sourced. It has undergone
> > > two releases (0.1, 0.2).
> > >
> > > === Meritocracy ===
> > > The SAMOA project already operates on meritocratic principles. Today, SAMOA
> > > has several developers and has accepted multiple patches from outside of
> > > Yahoo Labs. However, our intent with this incubator proposal is to start
> > > building a more diverse developer community around SAMOA that follows the
> > > Apache meritocracy model. We will identify all committers and PPMC members
> > > for the project operating under the ASF meritocratic principles. We plan to
> > > continue support for new contributors and work with those who contribute
> > > significantly to the project to make them committers.
> > >
> > > === Community ===
> > > SAMOA is currently being used internally at Yahoo. Acceptance into the
> > > Apache foundation would bolster the existing user and developer community
> > > around SAMOA. That community includes contributors from several
> > > institutions, active mostly on GitHub's pages. SAMOA has been starred more
> > > than 300 times and forked more than 50 times on GitHub as of November 2014.
> > >
> > > === Core Developers ===
> > > The core developers are a diverse group, many of whom are already very
> > > experienced with open source. There are two existing Apache committers,
> > > along with people from various companies and universities.
> > >
> > > === Alignment ===
> > > The ASF is the natural choice to host SAMOA. First, its goal of encouraging
> > > community-driven open-source projects fits with our vision for SAMOA.
> > > Additionally, many other projects that SAMOA is based on, such as Apache
> > > Storm, S4, Samza, and HDFS, are hosted by the ASF. Close proximity of SAMOA
> > > to these projects within the ASF will provide mutual benefit.
> > >
> > > == Known Risks ==
> > > === Orphaned Products ===
> > > Given the current level of investment in SAMOA the risk of the project
> > > being abandoned is minimal. There are several constituents who are highly
> > > incentivized to continue development, and Yahoo Labs relies on SAMOA as a
> > > platform for a large number of long-term research projects. However, the
> > > small number of initial committers might be a concern. We plan to address
> > > this issue during incubation by growing the community and the number of
> > > committers.
> > >
> > > === Inexperience with Open Source ===
> > > SAMOA has existed as a healthy open source project for one year. During
> > > this time, we have curated an open-source community successfully,
> > > attracting developers from a diverse group of universities and companies
> > > including Huawei, Yahoo, University of Porto, and Universitat Politecnica
> > > de Catalunya.
> > >
> > > Gianmarco is a committer for Apache Pig, Matthieu for Apache S4. Albert is
> > > one of the lead developers of MOA, an open-source tool for streaming
> > > machine learning.
> > >
> > > === Homogenous Developers ===
> > > The initial list of committers includes developers from several
> > > institutions, both academic and industrial. The committers are
> > > geographically distributed across Europe, America, and Asia.
> > >
> > > === Reliance on Salaried Developers ===
> > > Like most open source projects, SAMOA receives substantial support from
> > > salaried developers. In addition, those working from within corporations
> > > often devote “after hours” or spare time to the project, and these
> > > developers come from several organizations. We will work to ensure that the
> > > project can continue to be stewarded and can move forward independently
> > > of salaried developers.
> > >
> > > === Relationship with Other Apache Products ===
> > > SAMOA interoperates with several existing Apache projects, mainly by using
> > > them as stream processing engines: Apache Storm, Apache S4, and Apache
> > > Samza. It is a counterpart of Apache Mahout for streaming. It also uses
> > > several other Apache components, including Apache Maven and several Apache
> > > Commons libraries.
> > >
> > > === An Excessive Fascination with the Apache Brand ===
> > > SAMOA is already a healthy and relatively well known open source project.
> > > This proposal is not for the purpose of generating publicity. Rather, the
> > > primary benefits to joining Apache are those outlined in the Rationale
> > > section. We are more interested in establishing a strong community that can
> > > drive the project independently of Yahoo.
> > >
> > > == Documentation ==
> > > The reader will find these websites relevant:
> > >
> > > * SAMOA website: http://samoa-project.net/
> > > * SAMOA documentation: https://github.com/yahoo/samoa/wiki/
> > > * Issue tracking: https://github.com/yahoo/samoa/issues
> > > * Codebase: https://github.com/yahoo/samoa
> > > * User group: http://groups.google.com/group/samoa-user
> > >
> > > == Initial Source ==
> > > The SAMOA codebase is currently hosted on GitHub:
> > > https://github.com/yahoo/samoa. This is the exact codebase that we would
> > > migrate to the Apache foundation.
> > >
> > > == Source and Intellectual Property Submission Plan ==
> > > Currently, the SAMOA codebase is distributed under the Apache License v2.0.
> > > The vast majority of code has copyright held by Yahoo. Upon entering the
> > > Incubator, Yahoo will grant a license to the Apache foundation. In certain
> > > cases where individuals or organizations hold copyright, we will ensure
> > > they grant a license to the Apache foundation. Going forward, all commits
> > > will be licensed directly to the Apache foundation through our signed
> > > Individual Contributor License Agreements for all committers on the project.
> > >
> > > == Cryptography ==
> > > We do not expect SAMOA to be a controlled export item due to the use of
> > > encryption.
> > >
> > > == External Dependencies ==
> > > To the best of our knowledge, all dependencies of SAMOA are distributed
> > > under Apache compatible licenses. Upon acceptance to the incubator, we
> > > would begin a thorough analysis of all transitive dependencies to verify
> > > this fact and introduce license checking into the build and release process
> > > (for instance integrating Apache Rat).
> > >
> > > == Required Resources ==
> > > === Mailing Lists ===
> > > We will migrate the existing SAMOA mailing lists as follows:
> > >
> > > * samoa-users@googlegroups --> users@samoa.incubator.apache.org
> > > * samoa-developers@googlegroups --> dev@samoa.incubator.apache.org
> > >
> > > SAMOA commits are hosted on GitHub, so we would request the following
> > > mailing list:
> > >
> > > * commits@samoa.incubator.apache.org
> > >
> > > We would also request the following mailing list:
> > >
> > > * private@samoa.incubator.apache.org (with moderated subscription)
> > >
> > > === Source control ===
> > > The SAMOA team would like to use Git for source control, due to our current
> > > use of Git. We request a writeable Git repo for SAMOA, and mirroring to be
> > > set up to GitHub through INFRA.
> > >
> > > * https://git-wip-us.apache.org/repos/asf/incubator-samoa.git
> > >
> > > === Issue Tracking ===
> > > SAMOA currently uses GitHub for issue tracking. We will migrate to the
> > > Apache JIRA instance. http://issues.apache.org/jira/browse/SAMOA
> > >
> > > == Initial Committers & Affiliations ==
> > > * Albert Bifet, Huawei, <abifet at waikato dot ac dot nz>
> > > * Gianmarco De Francisci Morales, Yahoo Labs, <gdfm at apache dot org>
> > > * Nicolas Kourtellis, Yahoo Labs, <nkourtellis at gmail dot com>
> > > * Matthieu Morel, Yahoo Labs, <mmorel at apache dot org>
> > > * Arinto Murdopo, Living Analytics Research Centre, <arintom at smu dot edu dot sg>
> > > * Olivier Van Laere, BlueShift Labs, <olivier at getblueshift dot com>
> > >
> > > == Sponsors ==
> > > === Champion ===
> > > * Daniel Dai <daijy at apache dot org>
> > >
> > > === Nominated Mentors ===
> > > * Alan Gates <gates at apache dot org>
> > > * Ted Dunning <tdunning at apache dot org>
> > > * Ashutosh Chauhan <hashutosh at apache dot org>
> > > * Enis Soztutar <enis at apache dot org>
> > >
> > > === Sponsoring Entity ===
> > > The Apache Incubator
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > For additional commands, e-mail: general-help@incubator.apache.org
> >
> >
>

