incubator-general mailing list archives

From "Gangumalla, Uma"
Subject Re: [VOTE] Accept PredictionIO into the Apache Incubator
Date Tue, 24 May 2016 06:49:17 GMT
+1 (binding)


On 5/23/16, 3:22 PM, "Andrew Purtell" wrote:

>Since discussion on the matter of PredictionIO has died down, I would like
>to call a VOTE
>on accepting PredictionIO into the Apache Incubator.
>[ ] +1 Accept PredictionIO into the Apache Incubator
>[ ] +0 Abstain
>[ ] -1 Do not accept PredictionIO into the Apache Incubator, because ...
>This vote will be open for at least 72 hours.
>My vote is +1 (binding)
>PredictionIO Proposal
>PredictionIO is an open source Machine Learning Server built on top of a
>state-of-the-art open source stack that enables developers to manage and
>deploy production-ready predictive services for various kinds of machine
>learning tasks.
>The PredictionIO platform consists of the following components:
>   * PredictionIO framework - provides the machine learning stack for
>     building, evaluating and deploying engines with machine learning
>     algorithms. It uses Apache Spark for processing.
>   * Event Server - the machine learning analytics layer for unifying
>     events from multiple platforms. It can use Apache HBase or any JDBC
>     backend as its data store.
>The PredictionIO community also maintains a Template Gallery, a place to
>publish and download (free or proprietary) engine templates for different
>types of machine learning applications, which is a complementary part of the
>project. At this point we exclude the Template Gallery from the proposal,
>as it has a separate set of contributors and we’re not familiar with an
>Apache approved mechanism to maintain such a gallery.
>PredictionIO was started with a mission to democratize and bring machine
>learning to the masses.
>Machine learning has traditionally been a luxury for big companies like
>Google, Facebook, and Netflix. There are ML libraries and tools lying
>around the internet but the effort of putting them all together as a
>production-ready infrastructure is a very resource-intensive task that is
>barely within reach of individuals or small businesses.
>PredictionIO is a production-ready, full stack machine learning system
>that allows organizations of any scale to quickly deploy machine learning
>capabilities. It comes with official and community-contributed machine
>learning engine templates that are easy to customize.
>As PredictionIO's usage and number of contributors have grown larger and
>more diverse, we have sought an independent framework for the project
>to keep thriving. We believe the Apache foundation is a great fit. Joining
>Apache would ensure that tried and true processes and procedures are in
>place for the growing number of organizations interested in contributing
>to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
>PredictionIO was built on top of several Apache projects (HBase, Spark,
>Hadoop). We are familiar with the Apache process and believe that the
>democratic and meritocratic nature of the foundation aligns with the
>project goals.
>Initial Goals
>The initial milestones will be to move the existing codebase to Apache and
>integrate with the Apache development process. Once this is accomplished,
>we plan for incremental development and releases that follow the Apache
>guidelines, as well as growing our developer and user communities.
>Current Status
>PredictionIO has undergone nine minor releases and many patches.
>PredictionIO is being used in production by as well as many
>other organizations and apps. The PredictionIO codebase is currently
>hosted at GitHub, which will form the basis of the Apache git repository.
>We plan to invest in supporting a meritocracy. We will discuss the
>requirements in an open forum. We intend to invite additional developers
>to participate. We will encourage and monitor community participation so
>that privileges can be extended to those that contribute.
>Acceptance into the Apache foundation would bolster the already strong
>user and developer community around PredictionIO. That community includes
>many contributors from various other companies, and an active mailing list
>composed of hundreds of users.
>Core Developers
>The core developers of our project are listed in our contributors and
>initial PPMC below. Though many are employed at, there are
>also engineers from ActionML, and independent developers.
>The ASF is the natural choice to host the PredictionIO project as its goal
>is democratizing Machine Learning by making it more easily accessible to
>every user/developer. PredictionIO is built on top of several top level
>Apache projects as outlined above.
>Known Risks
>Orphaned Products
>PredictionIO has a solid and growing community. It is deployed in
>production environments by companies of all sizes to run various kinds of
>predictive engines.
>In addition to the community contribution to PredictionIO framework, the
>community is also actively contributing new engines to the Template
>Gallery as well as SDKs and documentation for the project. Salesforce is
>committed to utilizing and advancing the PredictionIO code base and to
>supporting its user community.
>Inexperience with Open Source
>PredictionIO has existed as a healthy open source project for almost two
>years and is the most starred Scala project on GitHub. All of the proposed
>committers have contributed to ASF and Linux Foundation open source
>projects. Several current committers on Apache projects and Apache Members
>are involved in this proposal and intend to provide mentorship.
>Homogeneous Developers
>The initial list of committers includes developers from several
>institutions, including Salesforce, ActionML, Channel4, USC as well as
>unaffiliated developers.
>Reliance on Salaried Developers
>Like most open source projects, PredictionIO receives substantial support
>from salaried developers. PredictionIO development is partially supported
>by, but there are many contributors from various other
>companies, and an active mailing list composed of hundreds of users. We
>will continue our efforts to ensure that stewardship of the project is
>independent of salaried developers by meritocratically promoting those
>contributors to committers.
>Relationships with Other Apache Products
>PredictionIO relies heavily on top level Apache projects such as Apache
>Spark, HBase and Hadoop. However, it brings distinct functionality,
>rather than just an abstraction - Machine Learning in a plug-and-play
>Compared to Apache Mahout, which focuses on the development of a wide
>variety of algorithms, PredictionIO offers a platform to manage the whole
>machine learning workflow, including data collection, data preparation,
>modeling, deployment and management of predictive services in production
>An Excessive Fascination with the Apache Brand
>PredictionIO is already a widely known open source project. This proposal
>is not for the purpose of generating publicity. Rather, the primary
>benefits to joining Apache are those outlined in the Rationale section.
>PredictionIO boasts rich and live documentation, which is included in the
>code repo (docs/manual directory), built with Middleman, and publicly hosted at
>Initial Source and Intellectual Property Submission Plan
>Currently, the PredictionIO codebase is distributed under the Apache 2.0
>License and hosted on GitHub:
>External Dependencies
>PredictionIO has the following external dependencies:
> * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> * Apache Spark 1.3.0 for Hadoop 2.4
> * Java SE Development Kit 8
> * and one of the following sets:
>   * PostgreSQL 9.1
> or
>   * MySQL 5.1
> or
>   * Apache HBase 0.98.6
>   * Elasticsearch 1.4.0
>Upon acceptance to the incubator, we would begin a thorough analysis of
>all transitive dependencies to verify this information and introduce
>license checking into the build and release process by integrating with
>Apache RAT.
>PredictionIO does not include cryptographic code. We utilize standard
>JCE and JSSE APIs provided by the Java Runtime Environment.
>Required Resources
>We request that the following resources be created for the project to use:
>Mailing lists
> (with moderated subscriptions)
>  predictionio-dev
>  predictionio-user
>  predictionio-commits
>  We will migrate the existing PredictionIO mailing lists.
>Git repository
>  The PredictionIO team would like to use Git for source control, due to
>  current use of GitHub.
>  git://
>JIRA instance
>  PredictionIO currently uses the GitHub issue tracking system associated
>  with its repository:
>  We will migrate to Apache JIRA.
>Other Resources
>  TravisCI for builds and test running.
>  PredictionIO's documentation, included in the code repo (docs/manual
>  directory), is built with Middleman and publicly hosted at
>  A blog to drive adoption and excitement at
>Initial Committers
>  Pat Ferrell
>  Tamas Jambor
>  Justin Yip
>  Xusen Yin
>  Lee Moon Soo
>  Donald Szeto
>  Kenneth Chan
>  Tom Chan
>  Simon Chan
>  Marco Vivero
>  Matthew Tovbin
>  Yevgeny Khodorkovsky
>  Felipe Oliveira
>  Vitaly Gordon
>  Alex Merritt
>Affiliations
>  Pat Ferrell - ActionML
>  Tamas Jambor - Channel4
>  Justin Yip - independent
>  Xusen Yin - USC
>  Lee Moon Soo - NFLabs
>  Donald Szeto - Salesforce
>  Kenneth Chan - Salesforce
>  Tom Chan - Salesforce
>  Simon Chan - Salesforce
>  Marco Vivero - Salesforce
>  Matthew Tovbin - Salesforce
>  Yevgeny Khodorkovsky - Salesforce
>  Felipe Oliveira - Salesforce
>  Vitaly Gordon - Salesforce
>  Alex Merritt - ActionML
>  Andrew Purtell <apurtell at apache dot org>
>Nominated Mentors
>  Andrew Purtell <apurtell at apache dot org>
>  James Taylor <jtaylor at apache dot org>
>  Lars Hofhansl <larsh at apache dot org>
>  Suneel Marthi <smarthi at apache dot org>
>  Xiangrui Meng <meng at apache dot org>
>  Luciano Resende <lresende at apache dot org>
>Sponsoring Entity
>  Apache Incubator PMC
>Best regards,
>   - Andy
>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>(via Tom White)
