incubator-general mailing list archives

From Suneel Marthi <smar...@apache.org>
Subject Re: [DISCUSS] PredictionIO incubation proposal
Date Fri, 20 May 2016 04:16:26 GMT
I definitely have concerns about too many folks becoming initial committers
and bringing their own corporate agendas to this project.

I suggest that first we vote PIO into incubator then bring in those less
experienced with the project. We have a good start with people who have
worked on the project from several orgs. Let us get organized first and
then bring in new people.

I sincerely feel that this is getting real murky with too many cooks with
their own agendas. The fewer external integration points PIO has, the better
the project will evolve.

My 2 cents.


On Thu, May 19, 2016 at 9:03 PM, Andrew Purtell <apurtell@apache.org> wrote:

> Hi Nick,
>
> Unless there are any concerns or objections, I will add you and Mr.
> Dusenberry to the proposal as initial committers tomorrow.
>
> Everyone,
>
> As it seems that discussion has died down I plan to start a VOTE thread on
> this coming Monday.
>
> Thank you for the comments and attention thus far.
>
>
> On Tue, May 17, 2016 at 12:58 PM, Nick Pentreath <nick.pentreath@gmail.com> wrote:
>
> > Hi there
> >
> > I'm glad to see the proposal to incubate PredictionIO. In my previous life
> > as a startup co-founder, I kept a close eye on the project, and it would be
> > fantastic to see it become an Apache incubating project!
> >
> > The folks working on Apache Spark and Apache SystemML (incubating) here at
> > IBM are excited about the possibilities for integrating PredictionIO and
> > SystemML (Mike Dusenberry is a committer on that project), as well
> > as further improving Spark integration (I'm a PMC member on that project).
> >
> > Mike and I, together with Luciano (who is a mentor on this proposal), would
> > like to volunteer our services as initial committers, if that is agreeable.
> >
> > Kind regards
> > Nick
> > mlnick@apache.org
> >
> >
> >
> > >
> > > ---------- Forwarded message ----------
> > > From: Andrew Purtell <apurtell@apache.org>
> > > To: "general@incubator.apache.org" <general@incubator.apache.org>
> > > Cc:
> > > Date: Fri, 13 May 2016 13:41:38 -0700
> > > Subject: [DISCUSS] PredictionIO incubation proposal
> > > Greetings,
> > >
> > > It is my pleasure to propose the PredictionIO project for incubation at
> > > the Apache Software Foundation. PredictionIO is a popular open source
> > > Machine Learning Server built on top of a state-of-the-art open source
> > > stack, including several Apache technologies, that enables developers to
> > > manage and deploy production-ready predictive services for various kinds
> > > of machine learning tasks, with more than 400 production deployments
> > > around the world and a growing contributor community.
> > >
> > >
> > > The text of the proposal is included below and is also available at
> > > https://wiki.apache.org/incubator/PredictionIO
> > >
> > > Best regards,
> > > Andrew Purtell
> > >
> > >
> > > = PredictionIO Proposal =
> > >
> > > === Abstract ===
> > > PredictionIO is an open source Machine Learning Server built on top of a
> > > state-of-the-art open source stack that enables developers to manage and
> > > deploy production-ready predictive services for various kinds of machine
> > > learning tasks.
> > >
> > > === Proposal ===
> > > The PredictionIO platform consists of the following components:
> > >
> > >  * PredictionIO framework - provides the machine learning stack for
> > >  building, evaluating and deploying engines with machine learning
> > >  algorithms. It uses Apache Spark for processing.
> > >
> > >  * Event Server - the machine learning analytics layer for unifying
> > events
> > >  from multiple platforms. It can use Apache HBase or any JDBC backends
> > >  as its data store.
> > >
> > > The PredictionIO community also maintains a Template Gallery, a place to
> > > publish and download (free or proprietary) engine templates for different
> > > types of machine learning applications, which is a complementary part of
> > > the project. At this point we exclude the Template Gallery from the
> > > proposal, as it has a separate set of contributors and we’re not familiar
> > > with an Apache-approved mechanism to maintain such a gallery.
> > >
> > > You can find the Template Gallery at https://templates.prediction.io/
> > >
> > > === Background ===
> > > PredictionIO was started with a mission to democratize and bring
> machine
> > > learning to the masses.
> > >
> > > Machine learning has traditionally been a luxury for big companies like
> > > Google, Facebook, and Netflix. There are ML libraries and tools lying
> > > around the internet, but putting them all together as a production-ready
> > > infrastructure is a very resource-intensive task that is barely within
> > > reach of individuals or small businesses.
> > >
> > > PredictionIO is a production-ready, full stack machine learning system
> > that
> > > allows organizations of any scale to quickly deploy machine learning
> > > capabilities. It comes with official and community-contributed machine
> > > learning engine templates that are easy to customize.
> > >
> > > === Rationale ===
> > > As the usage of and number of contributors to PredictionIO have grown
> > > larger and more diverse, we have sought an independent framework for the
> > > project to keep thriving. We believe the Apache foundation is a great fit.
> > > Joining
> > > Apache would ensure that tried and true processes and procedures are in
> > > place for the growing number of organizations interested in
> contributing
> > > to PredictionIO. PredictionIO is also a good fit for the Apache
> > foundation.
> > > PredictionIO was built on top of several Apache projects (HBase, Spark,
> > > Hadoop). We are familiar with the Apache process and believe that the
> > > democratic and meritocratic nature of the foundation aligns with the
> > > project goals.
> > >
> > > === Initial Goals ===
> > > The initial milestones will be to move the existing codebase to Apache
> > and
> > > integrate with the Apache development process. Once this is
> accomplished,
> > > we plan for incremental development and releases that follow the Apache
> > > guidelines, as well as growing our developer and user communities.
> > >
> > > === Current Status ===
> > > PredictionIO has undergone nine minor releases and many patches.
> > > PredictionIO is being used in production by Salesforce.com as well as
> > many
> > > other organizations and apps. The PredictionIO codebase is currently
> > > hosted at GitHub, which will form the basis of the Apache git
> repository.
> > >
> > > ==== Meritocracy ====
> > > We plan to invest in supporting a meritocracy. We will discuss the
> > > requirements in an open forum. We intend to invite additional
> developers
> > > to participate. We will encourage and monitor community participation
> so
> > > that privileges can be extended to those that contribute.
> > >
> > > ==== Community ====
> > > Acceptance into the Apache foundation would bolster the already strong
> > > user and developer community around PredictionIO. That community
> includes
> > > many contributors from various other companies, and an active mailing
> > list
> > > composed of hundreds of users.
> > >
> > > ==== Core Developers ====
> > > The core developers of our project are listed in our contributors and
> > > initial PPMC below. Though many are employed at Salesforce.com, there
> are
> > > also engineers from ActionML, and independent developers.
> > >
> > > === Alignment ===
> > > The ASF is the natural choice to host the PredictionIO project as its
> > goal
> > > is democratizing Machine Learning by making it more easily accessible
> to
> > > every user/developer. PredictionIO is built on top of several top level
> > > Apache projects as outlined above.
> > >
> > > === Known Risks ===
> > >
> > > ==== Orphaned products ====
> > > PredictionIO has a solid and growing community. It is deployed on
> > > production environments by companies of all sizes to run various kinds
> of
> > > predictive engines.
> > >
> > > In addition to the community contributions to the PredictionIO framework,
> > > the community is also actively contributing new engines to the Template
> > > Gallery as well as SDKs and documentation for the project. Salesforce is
> > > committed to utilizing and advancing the PredictionIO code base and
> > > supporting its user community.
> > >
> > > ==== Inexperience with Open Source ====
> > > PredictionIO has existed as a healthy open source project for almost
> two
> > > years and is the most starred Scala project on GitHub. All of the
> > proposed
> > > committers have contributed to ASF and Linux Foundation open source
> > > projects. Several current committers on Apache projects and Apache
> > Members
> > > are involved in this proposal and intend to provide mentorship.
> > >
> > > ==== Homogeneous Developers ====
> > > The initial list of committers includes developers from several
> > > institutions, including Salesforce, ActionML, Channel4, USC as well as
> > > unaffiliated developers.
> > >
> > > ==== Reliance on Salaried Developers ====
> > > Like most open source projects, PredictionIO receives substantial support
> > > from salaried developers. PredictionIO development is partially supported
> > > by Salesforce.com, but there are many contributors from various other
> > > companies, and an active mailing list composed of hundreds of users. We
> > > will continue our efforts to keep stewardship of the project independent
> > > of salaried developers by meritocratically promoting those contributors
> > > to committers.
> > >
> > > ==== Relationships with Other Apache Products ====
> > > PredictionIO relies heavily on top-level Apache projects such as Apache
> > > Spark, HBase and Hadoop. However, it brings distinct functionality rather
> > > than just an abstraction: Machine Learning in a plug-and-play fashion.
> > >
> > > Compared to Apache Mahout, which focuses on the development of a wide
> > > variety of algorithms, PredictionIO offers a platform to manage the
> whole
> > > machine learning workflow, including data collection, data preparation,
> > > modeling, deployment and management of predictive services in
> production
> > > environments.
> > >
> > > ==== An Excessive Fascination with the Apache Brand ====
> > > PredictionIO is already a widely known open source project. This
> proposal
> > > is not for the purpose of generating publicity. Rather, the primary
> > > benefits to joining Apache are those outlined in the Rationale section.
> > >
> > > === Documentation ===
> > > PredictionIO boasts rich, live documentation. It is included in the code
> > > repo (docs/manual directory), built with Middleman, and publicly hosted
> > > at https://docs.prediction.io
> > >
> > > === Initial Source and Intellectual Property Submission Plan ===
> > > Currently, the PredictionIO codebase is distributed under the Apache
> 2.0
> > > License and hosted on GitHub:
> > https://github.com/PredictionIO/PredictionIO
> > >
> > > === External Dependencies ===
> > > PredictionIO has the following external dependencies:
> > >  * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> > >    needed)
> > >  * Apache Spark 1.3.0 for Hadoop 2.4
> > >  * Java SE Development Kit 8
> > >  * and one of the following sets:
> > >    * PostgreSQL 9.1
> > >    or
> > >    * MySQL 5.1
> > >    or
> > >    * Apache HBase 0.98.6
> > >    * Elasticsearch 1.4.0
> > >
> > > Upon acceptance to the incubator, we would begin a thorough analysis of
> > > all transitive dependencies to verify this information and introduce
> > > license checking into the build and release process by integrating with
> > > Apache RAT.
> > >
> > > === Cryptography ===
> > > PredictionIO does not include cryptographic code. We utilize standard
> > > JCE and JSSE APIs provided by the Java Runtime Environment.
> > >
> > > === Required Resources ===
> > > We request that the following resources be created for the project to use:
> > >
> > > ==== Mailing lists ====
> > >
> > > predictionio-private@incubator.apache.org (with moderated
> subscriptions)
> > >
> > > predictionio-dev
> > >
> > > predictionio-user
> > >
> > > predictionio-commits
> > >
> > > We will migrate the existing PredictionIO mailing lists.
> > >
> > > ==== Git repository ====
> > > The PredictionIO team would like to use Git for source control, due to
> > our
> > > current use of GitHub.
> > >
> > > git://git.apache.org/incubator-predictionio
> > >
> > > ==== Documentation ====
> > > https://predictionio.incubator.apache.org/docs/
> > >
> > > ==== JIRA instance ====
> > > PredictionIO currently uses the GitHub issue tracking system associated
> > > with its repository: https://github.com/PredictionIO/PredictionIO/issues.
> > > We will migrate to Apache JIRA.
> > >
> > > JIRA PREDICTIONIO
> > > https://issues.apache.org/jira/browse/PREDICTIONIO
> > >
> > > ==== Other Resources ====
> > > * TravisCI for builds and test running.
> > >
> > > * PredictionIO's documentation, included in the code repo (docs/manual
> > > directory), is built with Middleman and publicly hosted at
> > > https://docs.prediction.io
> > >
> > > * A blog to drive adoption and excitement at
> https://blog.prediction.io
> > >
> > > === Initial Committers ===
> > >
> > > * Pat Ferrell
> > >
> > > * Tamas Jambor
> > >
> > > * Justin Yip
> > >
> > > * Xusen Yin
> > >
> > > * Lee Moon Soo
> > >
> > > * Donald Szeto
> > >
> > > * Kenneth Chan
> > >
> > > * Tom Chan
> > >
> > > * Simon Chan
> > >
> > > * Marco Vivero
> > >
> > > * Matthew Tovbin
> > >
> > > * Yevgeny Khodorkovsky
> > >
> > > * Felipe Oliveira
> > >
> > > * Vitaly Gordon
> > >
> > > === Affiliations ===
> > >
> > > * Pat Ferrell - ActionML
> > >
> > > * Tamas Jambor - Channel4
> > >
> > > * Justin Yip - independent
> > >
> > > * Xusen Yin - USC
> > >
> > > * Lee Moon Soo - NFLabs
> > >
> > > * Donald Szeto - Salesforce
> > >
> > > * Kenneth Chan - Salesforce
> > >
> > > * Tom Chan - Salesforce
> > >
> > > * Simon Chan - Salesforce
> > >
> > > * Marco Vivero - Salesforce
> > >
> > > * Matthew Tovbin - Salesforce
> > >
> > > * Yevgeny Khodorkovsky - Salesforce
> > >
> > > * Felipe Oliveira - Salesforce
> > >
> > > * Vitaly Gordon - Salesforce
> > >
> > > === Sponsors ===
> > >
> > > ==== Champion ====
> > >
> > > Andrew Purtell <apurtell at apache dot org>
> > >
> > > ==== Nominated Mentors ====
> > >
> > > * Andrew Purtell <apurtell at apache dot org>
> > >
> > > * James Taylor <jtaylor at apache dot org>
> > >
> > > * Lars Hofhansl <larsh at apache dot org>
> > >
> > > * Suneel Marthi <smarthi at apache dot org>
> > >
> > > * Xiangrui Meng <meng at apache dot org>
> > >
> > > * Luciano Resende <lresende at apache dot org>
> > >
> > > ==== Sponsoring Entity ====
> > >
> > > Apache Incubator PMC
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
