incubator-general mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: [DISCUSS] PredictionIO incubation proposal
Date Tue, 17 May 2016 20:09:08 GMT
Cool, this will make the code grant process easier =)

The initial committers and mentors look great.
I am sure more will come as contributions start pouring in to the project.

Looking forward to the VOTE thread soon.

- Henry

On Mon, May 16, 2016 at 12:07 PM, Simon Chan <simon@salesforce.com> wrote:

> Yes, it includes everyone who contributed code to PredictionIO before the
> acquisition and still wants to be involved in the project.
>
> We may have missed "Alex Merritt", going to add him to the list soon.
>
> Simon
>
>
> On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <smarthi@apache.org>
> wrote:
>
> > I do have a question about the proposed list of committers.
> >
> > Does the list also include all of those folks who were with PredictionIO
> > (and had contributed to the project) and then chose to leave when PIO was
> > acquired by Salesforce?
> >
> >
> >
> >
> > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb@nanthrax.net>
> > wrote:
> >
> > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > >
> > > Regards
> > > JB
> > >
> > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > >
> > >> Super excited to see this proposal! This will finally allow us to have
> > >> an ASF managed backend for next generation data-driven apps that I see
> > >> emerging quite rapidly.
> > >>
> > >> The proposal looks great to me (although I'd recommend calling out Scala
> > >> as an implementation language more prominently since it may attract
> > >> additional developers with affinity to it).
> > >>
> > >> I do have two questions about technology:
> > >>     1. Do you think it would be possible to leverage Apache Beam
> > >>         (incubating) for abstracting away the dependency on execution
> > >>         frameworks? My understanding is that PredictionIO currently
> > >>         only runs on Spark.
> > >>     2. Is there a potential integration with Apache Zeppelin possible?
> > >>
> > >> Thanks,
> > >> Roman.
> > >>
> > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurtell@apache.org>
> > >> wrote:
> > >>
> > >>> Greetings,
> > >>>
> > >>> It is my pleasure to propose the PredictionIO project for incubation
> > >>> at the Apache Software Foundation.
> > >>>
> > >>> PredictionIO is a popular open source Machine Learning Server built on
> > >>> top of a state-of-the-art open source stack, including several Apache
> > >>> technologies, that enables developers to manage and deploy
> > >>> production-ready predictive services for various kinds of machine
> > >>> learning tasks, with more than 400 production deployments around the
> > >>> world and a growing contributor community.
> > >>>
> > >>>
> > >>> The text of the proposal is included below and is also available at
> > >>> https://wiki.apache.org/incubator/PredictionIO
> > >>>
> > >>> Best regards,
> > >>> Andrew Purtell
> > >>>
> > >>>
> > >>> = PredictionIO Proposal =
> > >>>
> > >>> === Abstract ===
> > >>> PredictionIO is an open source Machine Learning Server built on top of
> > >>> a state-of-the-art open source stack that enables developers to manage
> > >>> and deploy production-ready predictive services for various kinds of
> > >>> machine learning tasks.
> > >>>
> > >>> === Proposal ===
> > >>> The PredictionIO platform consists of the following components:
> > >>>
> > >>>   * PredictionIO framework - provides the machine learning stack for
> > >>>   building, evaluating and deploying engines with machine learning
> > >>>   algorithms. It uses Apache Spark for processing.
> > >>>
> > >>>   * Event Server - the machine learning analytics layer for unifying
> > >>>   events from multiple platforms. It can use Apache HBase or any JDBC
> > >>>   backend as its data store.
> > >>>
> > >>> The PredictionIO community also maintains a Template Gallery, a place
> > >>> to publish and download (free or proprietary) engine templates for
> > >>> different types of machine learning applications, which is a
> > >>> complementary part of the project. At this point we exclude the
> > >>> Template Gallery from the proposal, as it has a separate set of
> > >>> contributors and we’re not familiar with an Apache-approved mechanism
> > >>> to maintain such a gallery.
> > >>>
> > >>> You can find the Template Gallery at https://templates.prediction.io/
> > >>>
> > >>> === Background ===
> > >>> PredictionIO was started with a mission to democratize and bring
> > >>> machine learning to the masses.
> > >>>
> > >>> Machine learning has traditionally been a luxury for big companies like
> > >>> Google, Facebook, and Netflix. There are ML libraries and tools lying
> > >>> around the internet, but putting them all together as a
> > >>> production-ready infrastructure is a very resource-intensive task that
> > >>> is barely within reach of individuals or small businesses.
> > >>>
> > >>> PredictionIO is a production-ready, full stack machine learning system
> > >>> that allows organizations of any scale to quickly deploy machine
> > >>> learning capabilities. It comes with official and community-contributed
> > >>> machine learning engine templates that are easy to customize.
> > >>>
> > >>> === Rationale ===
> > >>> As the usage of and number of contributors to PredictionIO have grown
> > >>> larger and more diverse, we have sought an independent framework for
> > >>> the project to keep thriving. We believe the Apache foundation is a
> > >>> great fit. Joining Apache would ensure that tried and true processes
> > >>> and procedures are in place for the growing number of organizations
> > >>> interested in contributing to PredictionIO. PredictionIO is also a good
> > >>> fit for the Apache foundation. PredictionIO was built on top of several
> > >>> Apache projects (HBase, Spark, Hadoop). We are familiar with the Apache
> > >>> process and believe that the democratic and meritocratic nature of the
> > >>> foundation aligns with the project goals.
> > >>>
> > >>> === Initial Goals ===
> > >>> The initial milestones will be to move the existing codebase to Apache
> > >>> and integrate with the Apache development process. Once this is
> > >>> accomplished, we plan for incremental development and releases that
> > >>> follow the Apache guidelines, as well as growing our developer and user
> > >>> communities.
> > >>>
> > >>> === Current Status ===
> > >>> PredictionIO has undergone nine minor releases and many patches.
> > >>> PredictionIO is being used in production by Salesforce.com as well as
> > >>> many other organizations and apps. The PredictionIO codebase is
> > >>> currently hosted at GitHub, which will form the basis of the Apache git
> > >>> repository.
> > >>>
> > >>> ==== Meritocracy ====
> > >>> We plan to invest in supporting a meritocracy. We will discuss the
> > >>> requirements in an open forum. We intend to invite additional
> > >>> developers to participate. We will encourage and monitor community
> > >>> participation so that privileges can be extended to those that
> > >>> contribute.
> > >>>
> > >>> ==== Community ====
> > >>> Acceptance into the Apache foundation would bolster the already strong
> > >>> user and developer community around PredictionIO. That community
> > >>> includes many contributors from various other companies, and an active
> > >>> mailing list composed of hundreds of users.
> > >>>
> > >>> ==== Core Developers ====
> > >>> The core developers of our project are listed in our contributors and
> > >>> initial PPMC below. Though many are employed at Salesforce.com, there
> > >>> are also engineers from ActionML, and independent developers.
> > >>>
> > >>> === Alignment ===
> > >>> The ASF is the natural choice to host the PredictionIO project as its
> > >>> goal is democratizing Machine Learning by making it more easily
> > >>> accessible to every user/developer. PredictionIO is built on top of
> > >>> several top level Apache projects as outlined above.
> > >>>
> > >>> === Known Risks ===
> > >>>
> > >>> ==== Orphaned products ====
> > >>> PredictionIO has a solid and growing community. It is deployed in
> > >>> production environments by companies of all sizes to run various kinds
> > >>> of predictive engines.
> > >>>
> > >>> In addition to the community contribution to the PredictionIO
> > >>> framework, the community is also actively contributing new engines to
> > >>> the Template Gallery as well as SDKs and documentation for the project.
> > >>> Salesforce is committed to utilizing and advancing the PredictionIO
> > >>> code base and supporting its user community.
> > >>>
> > >>> ==== Inexperience with Open Source ====
> > >>> PredictionIO has existed as a healthy open source project for almost
> > >>> two years and is the most starred Scala project on GitHub. All of the
> > >>> proposed committers have contributed to ASF and Linux Foundation open
> > >>> source projects. Several current committers on Apache projects and
> > >>> Apache Members are involved in this proposal and intend to provide
> > >>> mentorship.
> > >>>
> > >>> ==== Homogeneous Developers ====
> > >>> The initial list of committers includes developers from several
> > >>> institutions, including Salesforce, ActionML, Channel4, USC, as well as
> > >>> unaffiliated developers.
> > >>>
> > >>> ==== Reliance on Salaried Developers ====
> > >>> Like most open source projects, PredictionIO receives substantial
> > >>> support from salaried developers. PredictionIO development is partially
> > >>> supported by Salesforce.com, but there are many contributors from
> > >>> various other companies, and an active mailing list composed of
> > >>> hundreds of users. We will continue our efforts to ensure that
> > >>> stewardship of the project is independent of salaried developers by
> > >>> meritocratically promoting those contributors to committers.
> > >>>
> > >>> ==== Relationships with Other Apache Products ====
> > >>> PredictionIO relies heavily on top level Apache projects such as Apache
> > >>> Spark, HBase and Hadoop. However, it brings distinct functionality
> > >>> rather than just an abstraction: Machine Learning in a plug-and-play
> > >>> fashion.
> > >>>
> > >>> Compared to Apache Mahout, which focuses on the development of a wide
> > >>> variety of algorithms, PredictionIO offers a platform to manage the
> > >>> whole machine learning workflow, including data collection, data
> > >>> preparation, modeling, deployment and management of predictive services
> > >>> in production environments.
> > >>>
> > >>> ==== An Excessive Fascination with the Apache Brand ====
> > >>> PredictionIO is already a widely known open source project. This
> > >>> proposal is not for the purpose of generating publicity. Rather, the
> > >>> primary benefits to joining Apache are those outlined in the Rationale
> > >>> section.
> > >>>
> > >>> === Documentation ===
> > >>> PredictionIO boasts rich and live documentation, which is included in
> > >>> the code repo (docs/manual directory), built with Middleman, and
> > >>> publicly hosted at https://docs.prediction.io
> > >>>
> > >>> === Initial Source and Intellectual Property Submission Plan ===
> > >>> Currently, the PredictionIO codebase is distributed under the Apache
> > >>> 2.0 License and hosted on GitHub:
> > >>> https://github.com/PredictionIO/PredictionIO
> > >>>
> > >>> === External Dependencies ===
> > >>> PredictionIO has the following external dependencies:
> > >>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> > >>>   needed)
> > >>>   * Apache Spark 1.3.0 for Hadoop 2.4
> > >>>   * Java SE Development Kit 8
> > >>>   * and one of the following sets:
> > >>>     * PostgreSQL 9.1
> > >>>     or
> > >>>     * MySQL 5.1
> > >>>     or
> > >>>     * Apache HBase 0.98.6
> > >>>     * Elasticsearch 1.4.0
> > >>>
> > >>> Upon acceptance to the incubator, we would begin a thorough analysis of
> > >>> all transitive dependencies to verify this information and introduce
> > >>> license checking into the build and release process by integrating with
> > >>> Apache RAT.
> > >>>
> > >>> === Cryptography ===
> > >>> PredictionIO does not include cryptographic code. We utilize standard
> > >>> JCE and JSSE APIs provided by the Java Runtime Environment.
> > >>>
> > >>> === Required Resources ===
> > >>> We request that the following resources be created for the project to use:
> > >>>
> > >>> ==== Mailing lists ====
> > >>>
> > >>> predictionio-private@incubator.apache.org (with moderated subscriptions)
> > >>>
> > >>> predictionio-dev
> > >>>
> > >>> predictionio-user
> > >>>
> > >>> predictionio-commits
> > >>>
> > >>> We will migrate the existing PredictionIO mailing lists.
> > >>>
> > >>> ==== Git repository ====
> > >>> The PredictionIO team would like to use Git for source control, due to
> > >>> our current use of GitHub.
> > >>>
> > >>> git://git.apache.org/incubator-predictionio
> > >>>
> > >>> ==== Documentation ====
> > >>> https://predictionio.incubator.apache.org/docs/
> > >>>
> > >>> ==== JIRA instance ====
> > >>> PredictionIO currently uses the GitHub issue tracking system associated
> > >>> with its repository:
> > >>> https://github.com/PredictionIO/PredictionIO/issues
> > >>> We will migrate to Apache JIRA.
> > >>>
> > >>> JIRA PREDICTIONIO
> > >>> https://issues.apache.org/jira/browse/PREDICTIONIO
> > >>>
> > >>> ==== Other Resources ====
> > >>> * TravisCI for builds and test running.
> > >>>
> > >>> * PredictionIO's documentation, included in the code repo (docs/manual
> > >>> directory), is built with Middleman and publicly hosted at
> > >>> https://docs.prediction.io
> > >>>
> > >>> * A blog to drive adoption and excitement at https://blog.prediction.io
> > >>>
> > >>> === Initial Committers ===
> > >>>
> > >>> * Pat Ferrell
> > >>>
> > >>> * Tamas Jambor
> > >>>
> > >>> * Justin Yip
> > >>>
> > >>> * Xusen Yin
> > >>>
> > >>> * Lee Moon Soo
> > >>>
> > >>> * Donald Szeto
> > >>>
> > >>> * Kenneth Chan
> > >>>
> > >>> * Tom Chan
> > >>>
> > >>> * Simon Chan
> > >>>
> > >>> * Marco Vivero
> > >>>
> > >>> * Matthew Tovbin
> > >>>
> > >>> * Yevgeny Khodorkovsky
> > >>>
> > >>> * Felipe Oliveira
> > >>>
> > >>> * Vitaly Gordon
> > >>>
> > >>> === Affiliations ===
> > >>>
> > >>> * Pat Ferrell - ActionML
> > >>>
> > >>> * Tamas Jambor - Channel4
> > >>>
> > >>> * Justin Yip - independent
> > >>>
> > >>> * Xusen Yin - USC
> > >>>
> > >>> * Lee Moon Soo - NFLabs
> > >>>
> > >>> * Donald Szeto - Salesforce
> > >>>
> > >>> * Kenneth Chan - Salesforce
> > >>>
> > >>> * Tom Chan - Salesforce
> > >>>
> > >>> * Simon Chan - Salesforce
> > >>>
> > >>> * Marco Vivero - Salesforce
> > >>>
> > >>> * Matthew Tovbin - Salesforce
> > >>>
> > >>> * Yevgeny Khodorkovsky - Salesforce
> > >>>
> > >>> * Felipe Oliveira - Salesforce
> > >>>
> > >>> * Vitaly Gordon - Salesforce
> > >>>
> > >>> === Sponsors ===
> > >>>
> > >>> ==== Champion ====
> > >>>
> > >>> Andrew Purtell <apurtell at apache dot org>
> > >>>
> > >>> ==== Nominated Mentors ====
> > >>>
> > >>> * Andrew Purtell <apurtell at apache dot org>
> > >>>
> > >>> * James Taylor <jtaylor at apache dot org>
> > >>>
> > >>> * Lars Hofhansl <larsh at apache dot org>
> > >>>
> > >>> * Suneel Marthi <smarthi at apache dot org>
> > >>>
> > >>> * Xiangrui Meng <meng at apache dot org>
> > >>>
> > >>> * Luciano Resende <lresende at apache dot org>
> > >>>
> > >>> ==== Sponsoring Entity ====
> > >>>
> > >>> Apache Incubator PMC
> > >>>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > >> For additional commands, e-mail: general-help@incubator.apache.org
> > >>
> > >>
> > > --
> > > Jean-Baptiste Onofré
> > > jbonofre@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > For additional commands, e-mail: general-help@incubator.apache.org
> > >
> > >
> >
>
