Subject: Re: [DISCUSS] PredictionIO incubation proposal
From: Pat Ferrel
Date: Fri, 20 May 2016 08:08:13 -0700
To: general@incubator.apache.org
Cc: lresende@apache.org, Andrew Purtell, dusenberrymw@apache.org, Simon Chan

It’s great to see such interest and I’m sure the rest of the podling
would agree that the more the better. I also agree with Suneel: people
who know PIO should be given a short bit of time to get organized before
we do the desired expansion. There will be lots of room to contribute,
in any case. For instance, try creating a template; there is no better
way to learn the project.

On May 19, 2016, at 9:16 PM, Suneel Marthi wrote:

I definitely have concerns about too many folks becoming initial
committers and bringing their own corporate agendas to this project.

I suggest that first we vote PIO into the incubator, then bring in those
less experienced with the project. We have a good start with people who
have worked on the project from several orgs. Let us get organized first
and then bring in new people.

I sincerely feel that this is getting real murky with too many cooks
with their own agendas. The fewer external integration points to PIO,
the better the project would evolve.

My 2 cents.

On Thu, May 19, 2016 at 9:03 PM, Andrew Purtell wrote:

> Hi Nick,
>
> Unless there are any concerns or objections, I will add you and Mr.
> Dusenberry to the proposal as initial committers tomorrow.
>
> Everyone,
>
> As it seems that discussion has died down I plan to start a VOTE thread
> on this coming Monday.
>
> Thank you for the comment and attention thus far.
>
> On Tue, May 17, 2016 at 12:58 PM, Nick Pentreath wrote:
>
>> Hi there
>>
>> I'm glad to see the proposal to incubate PredictionIO. In my previous
>> life as a startup co-founder, I kept a close eye on the project, and it
>> would be fantastic to see it become an Apache incubating project!
>>
>> The folks working on Apache Spark and Apache SystemML (incubating) here
>> at IBM are excited about the possibilities for integrating PredictionIO
>> and SystemML (Mike Dusenberry is a committer on that project), as well
>> as further improving Spark integration (I'm a PMC member on that
>> project).
>>
>> Mike and I, together with Luciano (who is a mentor on this proposal)
>> would like to volunteer our services as initial committers, if that is
>> agreeable.
>>
>> Kind regards
>> Nick
>> mlnick@apache.org
>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Andrew Purtell
>>> To: "general@incubator.apache.org"
>>> Date: Fri, 13 May 2016 13:41:38 -0700
>>> Subject: [DISCUSS] PredictionIO incubation proposal
>>> Greetings,
>>>
>>> It is my pleasure to propose the PredictionIO project for incubation
>>> at the Apache Software Foundation. PredictionIO is a popular open
>>> source Machine Learning Server built on top of a state-of-the-art open
>>> source stack, including several Apache technologies, that enables
>>> developers to manage and deploy production-ready predictive services
>>> for various kinds of machine learning tasks, with more than 400
>>> production deployments around the world and a growing contributor
>>> community.
>>>
>>> The text of the proposal is included below and is also available at
>>> https://wiki.apache.org/incubator/PredictionIO
>>>
>>> Best regards,
>>> Andrew Purtell
>>>
>>>
>>> = PredictionIO Proposal =
>>>
>>> === Abstract ===
>>> PredictionIO is an open source Machine Learning Server built on top of
>>> a state-of-the-art open source stack that enables developers to manage
>>> and deploy production-ready predictive services for various kinds of
>>> machine learning tasks.
>>>
>>> === Proposal ===
>>> The PredictionIO platform consists of the following components:
>>>
>>> * PredictionIO framework - provides the machine learning stack for
>>> building, evaluating and deploying engines with machine learning
>>> algorithms. It uses Apache Spark for processing.
>>>
>>> * Event Server - the machine learning analytics layer for unifying
>>> events from multiple platforms. It can use Apache HBase or any JDBC
>>> backend as its data store.
>>>
>>> The PredictionIO community also maintains a Template Gallery, a place
>>> to publish and download (free or proprietary) engine templates for
>>> different types of machine learning applications, which is a
>>> complementary part of the project. At this point we exclude the
>>> Template Gallery from the proposal, as it has a separate set of
>>> contributors and we’re not familiar with an Apache-approved mechanism
>>> to maintain such a gallery.
>>>
>>> You can find the Template Gallery at https://templates.prediction.io/
>>>
>>> === Background ===
>>> PredictionIO was started with a mission to democratize and bring
>>> machine learning to the masses.
>>>
>>> Machine learning has traditionally been a luxury for big companies
>>> like Google, Facebook, and Netflix.
>>> There are ML libraries and tools lying around the internet, but the
>>> effort of putting them all together as a production-ready
>>> infrastructure is a very resource-intensive task that is hardly
>>> reachable by individuals or small businesses.
>>>
>>> PredictionIO is a production-ready, full stack machine learning system
>>> that allows organizations of any scale to quickly deploy machine
>>> learning capabilities. It comes with official and community-contributed
>>> machine learning engine templates that are easy to customize.
>>>
>>> === Rationale ===
>>> As the usage of and number of contributors to PredictionIO have grown
>>> bigger and more diverse, we have sought an independent framework for
>>> the project to keep thriving. We believe the Apache foundation is a
>>> great fit. Joining Apache would ensure that tried and true processes
>>> and procedures are in place for the growing number of organizations
>>> interested in contributing to PredictionIO. PredictionIO is also a good
>>> fit for the Apache foundation. PredictionIO was built on top of several
>>> Apache projects (HBase, Spark, Hadoop). We are familiar with the Apache
>>> process and believe that the democratic and meritocratic nature of the
>>> foundation aligns with the project goals.
>>>
>>> === Initial Goals ===
>>> The initial milestones will be to move the existing codebase to Apache
>>> and integrate with the Apache development process. Once this is
>>> accomplished, we plan for incremental development and releases that
>>> follow the Apache guidelines, as well as growing our developer and user
>>> communities.
>>>
>>> === Current Status ===
>>> PredictionIO has undergone nine minor releases and many patches.
>>> PredictionIO is being used in production by Salesforce.com as well as
>>> many other organizations and apps.
>>> The PredictionIO codebase, currently hosted at GitHub, will form the
>>> basis of the Apache git repository.
>>>
>>> ==== Meritocracy ====
>>> We plan to invest in supporting a meritocracy. We will discuss the
>>> requirements in an open forum. We intend to invite additional
>>> developers to participate. We will encourage and monitor community
>>> participation so that privileges can be extended to those that
>>> contribute.
>>>
>>> ==== Community ====
>>> Acceptance into the Apache foundation would bolster the already strong
>>> user and developer community around PredictionIO. That community
>>> includes many contributors from various other companies, and an active
>>> mailing list composed of hundreds of users.
>>>
>>> ==== Core Developers ====
>>> The core developers of our project are listed in our contributors and
>>> initial PPMC below. Though many are employed at Salesforce.com, there
>>> are also engineers from ActionML, and independent developers.
>>>
>>> === Alignment ===
>>> The ASF is the natural choice to host the PredictionIO project as its
>>> goal is democratizing Machine Learning by making it more easily
>>> accessible to every user/developer. PredictionIO is built on top of
>>> several top-level Apache projects as outlined above.
>>>
>>> === Known Risks ===
>>>
>>> ==== Orphaned products ====
>>> PredictionIO has a solid and growing community. It is deployed in
>>> production environments by companies of all sizes to run various kinds
>>> of predictive engines.
>>>
>>> In addition to the community contribution to the PredictionIO
>>> framework, the community is also actively contributing new engines to
>>> the Template Gallery as well as SDKs and documentation for the project.
>>> Salesforce is committed to utilizing and advancing the PredictionIO
>>> code base and supporting its user community.
>>>
>>> ==== Inexperience with Open Source ====
>>> PredictionIO has existed as a healthy open source project for almost
>>> two years and is the most starred Scala project on GitHub. All of the
>>> proposed committers have contributed to ASF and Linux Foundation open
>>> source projects. Several current committers on Apache projects and
>>> Apache Members are involved in this proposal and intend to provide
>>> mentorship.
>>>
>>> ==== Homogeneous Developers ====
>>> The initial list of committers includes developers from several
>>> institutions, including Salesforce, ActionML, Channel4, and USC, as
>>> well as unaffiliated developers.
>>>
>>> ==== Reliance on Salaried Developers ====
>>> Like most open source projects, PredictionIO receives substantial
>>> support from salaried developers. PredictionIO development is partially
>>> supported by Salesforce.com, but there are many contributors from
>>> various other companies, and an active mailing list composed of
>>> hundreds of users. We will continue our efforts to ensure that
>>> stewardship of the project is independent of salaried developers by
>>> meritocratically promoting those contributors to committers.
>>>
>>> ==== Relationships with Other Apache Products ====
>>> PredictionIO relies heavily on top-level Apache projects such as Apache
>>> Spark, HBase and Hadoop. However, it brings distinct functionality
>>> rather than just an abstraction: Machine Learning in a plug-and-play
>>> fashion.
>>>
>>> Compared to Apache Mahout, which focuses on the development of a wide
>>> variety of algorithms, PredictionIO offers a platform to manage the
>>> whole machine learning workflow, including data collection, data
>>> preparation, modeling, deployment and management of predictive services
>>> in production environments.
>>>
>>> ==== An Excessive Fascination with the Apache Brand ====
>>> PredictionIO is already a widely known open source project. This
>>> proposal is not for the purpose of generating publicity. Rather, the
>>> primary benefits to joining Apache are those outlined in the Rationale
>>> section.
>>>
>>> === Documentation ===
>>> PredictionIO boasts rich and live documentation, which is included in
>>> the code repo (docs/manual directory), built with Middleman, and
>>> publicly hosted at https://docs.prediction.io
>>>
>>> === Initial Source and Intellectual Property Submission Plan ===
>>> Currently, the PredictionIO codebase is distributed under the Apache
>>> 2.0 License and hosted on GitHub:
>>> https://github.com/PredictionIO/PredictionIO
>>>
>>> === External Dependencies ===
>>> PredictionIO has the following external dependencies:
>>> * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
>>>   needed)
>>> * Apache Spark 1.3.0 for Hadoop 2.4
>>> * Java SE Development Kit 8
>>> * and one of the following sets:
>>>   * PostgreSQL 9.1
>>>   or
>>>   * MySQL 5.1
>>>   or
>>>   * Apache HBase 0.98.6
>>>   * Elasticsearch 1.4.0
>>>
>>> Upon acceptance to the incubator, we would begin a thorough analysis of
>>> all transitive dependencies to verify this information and introduce
>>> license checking into the build and release process by integrating with
>>> Apache RAT.
>>>
>>> === Cryptography ===
>>> PredictionIO does not include cryptographic code. We utilize standard
>>> JCE and JSSE APIs provided by the Java Runtime Environment.
>>>
>>> === Required Resources ===
>>> We request that the following resources be created for the project to
>>> use:
>>>
>>> ==== Mailing lists ====
>>>
>>> predictionio-private@incubator.apache.org (with moderated
>>> subscriptions)
>>>
>>> predictionio-dev
>>>
>>> predictionio-user
>>>
>>> predictionio-commits
>>>
>>> We will migrate the existing PredictionIO mailing lists.
>>>
>>> ==== Git repository ====
>>> The PredictionIO team would like to use Git for source control, due to
>>> our current use of GitHub.
>>>
>>> git://git.apache.org/incubator-predictionio
>>>
>>> ==== Documentation ====
>>> https://predictionio.incubator.apache.org/docs/
>>>
>>> ==== JIRA instance ====
>>> PredictionIO currently uses the GitHub issue tracking system associated
>>> with its repository:
>>> https://github.com/PredictionIO/PredictionIO/issues
>>> We will migrate to Apache JIRA.
>>>
>>> JIRA PREDICTIONIO
>>> https://issues.apache.org/jira/browse/PREDICTIONIO
>>>
>>> ==== Other Resources ====
>>> * TravisCI for builds and test running.
>>>
>>> * PredictionIO's documentation, included in the code repo (docs/manual
>>> directory), is built with Middleman and publicly hosted at
>>> https://docs.prediction.io
>>>
>>> * A blog to drive adoption and excitement at
>>> https://blog.prediction.io
>>>
>>> === Initial Committers ===
>>>
>>> * Pat Ferrel
>>> * Tamas Jambor
>>> * Justin Yip
>>> * Xusen Yin
>>> * Lee Moon Soo
>>> * Donald Szeto
>>> * Kenneth Chan
>>> * Tom Chan
>>> * Simon Chan
>>> * Marco Vivero
>>> * Matthew Tovbin
>>> * Yevgeny Khodorkovsky
>>> * Felipe Oliveira
>>> * Vitaly Gordon
>>>
>>> === Affiliations ===
>>>
>>> * Pat Ferrel - ActionML
>>> * Tamas Jambor - Channel4
>>> * Justin Yip - independent
>>> * Xusen Yin - USC
>>> * Lee Moon Soo - NFLabs
>>> * Donald Szeto - Salesforce
>>> * Kenneth Chan - Salesforce
>>> * Tom Chan - Salesforce
>>> * Simon Chan - Salesforce
>>> * Marco Vivero - Salesforce
>>> * Matthew Tovbin - Salesforce
>>> * Yevgeny Khodorkovsky - Salesforce
>>> * Felipe Oliveira - Salesforce
>>> * Vitaly Gordon - Salesforce
>>>
>>> === Sponsors ===
>>>
>>> ==== Champion ====
>>>
>>> Andrew Purtell
>>>
>>> ==== Nominated Mentors ====
>>>
>>> * Andrew Purtell
>>> * James Taylor
>>> * Lars Hofhansl
>>> * Suneel Marthi
>>> * Xiangrui Meng
>>> * Luciano Resende
>>>
>>> ==== Sponsoring Entity ====
>>>
>>> Apache Incubator PMC
>>>
>>
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back.
> - Piet Hein (via Tom White)
>
---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
For additional commands, e-mail: general-help@incubator.apache.org