Date: Thu, 4 Dec 2014 22:59:11 +0100
Subject: Re: [DISCUSS] [PROPOSAL] SAMOA for Apache Incubator
From: jan i
To: "general@incubator.apache.org"

On Thursday, December 4, 2014, Henry Saputra wrote:

> I was not saying to stop it but Daniel needs to send request to
> private@ list to make sure he is part of the IPMC.

Agreed, sorry if my wording sounded different.

rgds
jan i

>
> - Henry
>
> On Thu, Dec 4, 2014 at 12:48 PM, jan i wrote:
> > On Thursday, December 4, 2014, Henry Saputra wrote:
> >
> >> Daniel,
> >>
> >> Small formality: as I remember, the champion has to be a member of the
> >> IPMC, which you are not.
> >> Since you are a member of the ASF, you can just send an email to
> >> private@incubator.a.o to be added as a member of the IPMC.
> >
> >
> > I agree we need formalities to be in order, but in this case it's really
> > only paperwork, so let's not stop the process for the project.
> >
> > just my opinion.
> > rgds
> > jan i
> >
> >>
> >> - Henry
> >>
> >>
> >> On Tue, Dec 2, 2014 at 9:27 PM, Daniel Dai wrote:
> >> > Hi,
> >> >
> >> > I would like to propose SAMOA as an Apache Incubator project.
> >> > https://wiki.apache.org/incubator/SAMOAProposal
> >> >
> >> > I've posted the text of the proposal below:
> >> >
> >> > Thanks,
> >> > Daniel
> >> >
> >> > = SAMOA =
> >> > == Abstract ==
> >> > SAMOA is an open-source platform for mining big data streams.
> >> >
> >> > == Proposal ==
> >> > SAMOA provides a collection of distributed streaming algorithms for the
> >> > most common data mining and machine learning tasks such as classification,
> >> > clustering, and regression, as well as programming abstractions to develop
> >> > new algorithms that run on top of distributed stream processing engines
> >> > (DSPEs). It features a pluggable architecture that allows it to run on
> >> > several DSPEs such as Apache Storm, Apache S4, and Apache Samza.
> >> >
> >> > == Background ==
> >> > Hadoop and its ecosystem have changed the way data are processed by
> >> > allowing algorithms to be pushed to unprecedented scale. As an example,
> >> > Mahout allows data mining and machine learning algorithms to run on very
> >> > large datasets.
> >> > However, Hadoop and Mahout are not suited to handling streaming
> >> > data. Simply put, the goal of SAMOA is to provide a streaming counterpart
> >> > to Mahout.
> >> >
> >> > == Rationale ==
> >> > SAMOA aims to fill the current gap in tools for mining large-scale streams.
> >> > Many organizations can benefit from a scalable stream mining platform
> >> > such as SAMOA.
> >> >
> >> > SAMOA is a natural fit for the Apache Software Foundation. It is licensed
> >> > under the ASL v2.0. It already interoperates with several existing Apache
> >> > projects such as Storm, S4, and Samza. Furthermore, it is complementary to
> >> > existing Apache projects such as Mahout. The initial committers are
> >> > familiar with the Apache process and subscribe to the Apache mission.
> >> > Indeed, the team includes multiple Apache committers. Finally, joining
> >> > Apache will help coordinate the development effort of the growing number of
> >> > organizations which contribute to SAMOA.
> >> >
> >> > == Initial Goals ==
> >> > * Move the existing codebase to Apache
> >> > * Integrate with the Apache development process
> >> > * Incremental development and releases per Apache guidelines
> >> >
> >> > == Current Status ==
> >> > SAMOA started as a research project at Yahoo Labs in 2013 and was
> >> > open-sourced in October the same year. It has been under development on
> >> > Yahoo's public GitHub repository since being open-sourced. It has undergone
> >> > two releases (0.1, 0.2).
> >> >
> >> > === Meritocracy ===
> >> > The SAMOA project already operates on meritocratic principles. Today, SAMOA
> >> > has several developers and has accepted multiple patches from outside of
> >> > Yahoo Labs. However, our intent with this incubator proposal is to start
> >> > building a more diverse developer community around SAMOA that follows the
> >> > Apache meritocracy model.
> >> > We will identify all committers and PPMC members
> >> > for the project operating under the ASF meritocratic principles. We plan to
> >> > continue support for new contributors and work with those who contribute
> >> > significantly to the project to make them committers.
> >> >
> >> > === Community ===
> >> > SAMOA is currently being used internally at Yahoo. Acceptance into the
> >> > Apache foundation would bolster the existing user and developer community
> >> > around SAMOA. That community includes contributors from several
> >> > institutions, active mostly on GitHub. SAMOA has been starred more
> >> > than 300 times and forked more than 50 times on GitHub as of November 2014.
> >> >
> >> > === Core Developers ===
> >> > The core developers are a diverse group, many of whom are already very
> >> > experienced with open source. There are two existing Apache committers,
> >> > along with people from various companies and universities.
> >> >
> >> > === Alignment ===
> >> > The ASF is the natural choice to host SAMOA. First, its goal of encouraging
> >> > community-driven open-source projects fits with our vision for SAMOA.
> >> > Additionally, many other projects that SAMOA is based on, such as Apache
> >> > Storm, S4, Samza, and HDFS, are hosted by the ASF. Close proximity of SAMOA
> >> > to these projects within the ASF will provide mutual benefit.
> >> >
> >> > == Known Risks ==
> >> > === Orphaned Products ===
> >> > Given the current level of investment in SAMOA, the risk of the project
> >> > being abandoned is minimal. There are several constituents who are highly
> >> > incentivized to continue development, and Yahoo Labs relies on SAMOA as a
> >> > platform for a large number of long-term research projects. However, the
> >> > small number of initial committers might be a concern.
> >> > We plan to address
> >> > this issue during incubation by growing the community and the number of
> >> > committers.
> >> >
> >> > === Inexperience with Open Source ===
> >> > SAMOA has existed as a healthy open source project for one year. During
> >> > this time, we have curated an open-source community successfully,
> >> > attracting developers from a diverse group of universities and companies
> >> > including Huawei, Yahoo, University of Porto, and Universitat Politecnica
> >> > de Catalunya.
> >> >
> >> > Gianmarco is a committer for Apache Pig, Matthieu for Apache S4. Albert is
> >> > one of the lead developers of MOA, an open-source tool for streaming
> >> > machine learning.
> >> >
> >> > === Homogenous Developers ===
> >> > The initial list of committers includes developers from several
> >> > institutions, both academic and industrial. The committers are
> >> > geographically distributed across Europe, America, and Asia.
> >> >
> >> > === Reliance on Salaried Developers ===
> >> > Like most open source projects, SAMOA receives substantial support from
> >> > salaried developers. In addition, those working from within corporations
> >> > often devote “after hours” or spare time to the project - and these come
> >> > from several organizations. We will work to ensure the ability for the
> >> > project to continuously be stewarded and to proceed forward independently
> >> > of salaried developers.
> >> >
> >> > === Relationship with Other Apache Products ===
> >> > SAMOA interoperates with several existing Apache projects, mainly by using
> >> > them as stream processing engines: Apache Storm, Apache S4, and Apache
> >> > Samza. It also uses several other Apache components, including Apache Maven
> >> > and several Apache Commons libraries.
> >> >
> >> > === An Excessive Fascination with the Apache Brand ===
> >> > SAMOA is already a healthy and relatively well-known open source project.
> >> > This proposal is not for the purpose of generating publicity. Rather, the
> >> > primary benefits to joining Apache are those outlined in the Rationale
> >> > section. We are more interested in establishing a strong community that can
> >> > drive the project independently of Yahoo.
> >> >
> >> > == Documentation ==
> >> > The reader will find these websites relevant:
> >> >
> >> > * SAMOA website: http://samoa-project.net/
> >> > * SAMOA documentation: https://github.com/yahoo/samoa/wiki/
> >> > * Issue tracking: https://github.com/yahoo/samoa/issues
> >> > * Codebase: https://github.com/yahoo/samoa
> >> > * User group: http://groups.google.com/group/samoa-user
> >> >
> >> > == Initial Source ==
> >> > The SAMOA codebase is currently hosted on GitHub:
> >> > https://github.com/yahoo/samoa. This is the exact codebase that we would
> >> > migrate to the Apache foundation.
> >> >
> >> > == Source and Intellectual Property Submission Plan ==
> >> > Currently, the SAMOA codebase is distributed under the Apache License v2.0.
> >> > The vast majority of code has copyright held by Yahoo. Upon entering the
> >> > Incubator, Yahoo will grant a license to the Apache foundation. In certain
> >> > cases where individuals or organizations hold copyright, we will ensure
> >> > they grant a license to the Apache foundation. Going forward, all commits
> >> > will be licensed directly to the Apache foundation through our signed
> >> > Individual Contributor License Agreements for all committers on the project.
> >> >
> >> > == Cryptography ==
> >> > We do not expect SAMOA to be a controlled export item due to the use of
> >> > encryption.
> >> >
> >> > == External Dependencies ==
> >> > To the best of our knowledge, all dependencies of SAMOA are distributed
> >> > under Apache compatible licenses. Upon acceptance to the incubator, we
> >> > would begin a thorough analysis of all transitive dependencies to verify
> >> > this fact and introduce license checking into the build and release process
> >> > (for instance integrating Apache Rat).
> >> >
> >> > == Required Resources ==
> >> > === Mailing Lists ===
> >> > We will migrate the existing SAMOA mailing lists as follows:
> >> >
> >> > * samoa-users@googlegroups --> users@samoa.incubator.apache.org
> >> > * samoa-developers@googlegroups --> dev@samoa.incubator.apache.org
> >> >
> >> > SAMOA commits are hosted on GitHub, so we would request the following
> >> > mailing list:
> >> >
> >> > * commits@samoa.incubator.apache.org
> >> >
> >> > We would also request the following mailing list:
> >> >
> >> > * private@samoa.incubator.apache.org (with moderated subscription)
> >> >
> >> > === Source control ===
> >> > The SAMOA team would like to use Git for source control, due to our current
> >> > use of Git. We request a writeable Git repo for SAMOA, and mirroring to be
> >> > set up to GitHub through INFRA.
> >> >
> >> > * https://git-wip-us.apache.org/repos/asf/incubator-samoa.git
> >> >
> >> > === Issue Tracking ===
> >> > SAMOA currently uses GitHub for issue tracking. We will migrate to the
> >> > Apache JIRA instance.
> >> > http://issues.apache.org/jira/browse/SAMOA
> >> >
> >> > == Initial Committers & Affiliations ==
> >> > * Albert Bifet, Huawei
> >> > * Gianmarco De Francisci Morales, Yahoo Labs
> >> > * Nicolas Kourtellis, Yahoo Labs
> >> > * Matthieu Morel, Yahoo Labs
> >> > * Arinto Murdopo, Living Analytics Research Centre
> >> > * Olivier Van Laere, BlueShift Labs
> >> >
> >> > == Sponsors ==
> >> > === Champion ===
> >> > * Daniel Dai
> >> >
> >> > === Nominated Mentors ===
> >> > * Alan Gates
> >> > * Ted Dunning
> >> > * Ashutosh Chauhan
> >> > * Enis Soztutar
> >> >
> >> > === Sponsoring Entity ===
> >> > The Apache Incubator
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >> For additional commands, e-mail: general-help@incubator.apache.org
> >>
> >
> > --
> > Sent from My iPad, sorry for any misspellings.