incubator-general mailing list archives

From Suneel Marthi <smar...@apache.org>
Subject Re: [DISCUSS] PredictionIO incubation proposal
Date Tue, 17 May 2016 21:14:31 GMT
Thanks Henry

On Tue, May 17, 2016 at 5:11 PM, Henry Saputra <henry.saputra@gmail.com>
wrote:

> As a mentor, you will have karma to commit to the source repository.
>
> As you probably know, the initial committers and mentors will form the
> initial PPMC for the podling.
> Hopefully, for day-to-day operations, you should not need to distinguish
> between committers and mentors anymore.
>
> You do not have to be listed as a committer in the proposal.
>
> - Henry
>
> On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <smarthi@apache.org> wrote:
>
> > Thanks for having me as a mentor for PIO.  I would like to be added to
> > the initial list of committers and am looking to actively participate in
> > the development too. I am not sure if my being a mentor automatically
> > grants me the 'commit' karma.
> >
> > It's already been suggested earlier in this thread by Roman and
> > Jean-Baptiste that the project needs to be decoupled from Spark and
> > integrated with Beam.  It would be good to reduce the reliance on
> > Spark-Submit from what I have seen of the project so far. But let's not
> > talk architecture and design here when the project's not in the
> > incubator yet. :)
> >
> >
> >
> >
> > On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <henry.saputra@gmail.com>
> > wrote:
> >
> > > Cool, this will make the code grant process easier =)
> > >
> > > The initial committers and mentors look great.
> > > I am sure more will come as contributions start pouring in to the
> > > project.
> > >
> > > Looking forward to the VOTE thread soon.
> > >
> > > - Henry
> > >
> > > On Mon, May 16, 2016 at 12:07 PM, Simon Chan <simon@salesforce.com>
> > > wrote:
> > >
> > > > Yes, it includes everyone who contributed code to PredictionIO
> > > > before the acquisition and still wants to be involved in the project.
> > > >
> > > > We may have missed "Alex Merritt"; we are going to add him to the
> > > > list soon.
> > > >
> > > > Simon
> > > >
> > > >
> > > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <smarthi@apache.org>
> > > > wrote:
> > > >
> > > > > I do have a question about the proposed list of committers.
> > > > >
> > > > > Does the list also include all of those folks who were with
> > > > > PredictionIO (and had contributed to the project) and then chose
> > > > > to leave when PIO was acquired by Salesforce?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré
> > > > > <jb@nanthrax.net> wrote:
> > > > >
> > > > > > By the way, we have some discussion about integrating Zeppelin
> > > > > > with Beam ;)
> > > > > >
> > > > > > Regards
> > > > > > JB
> > > > > >
> > > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > > > >
> > > > > > >> Super excited to see this proposal! This will finally allow us
> > > > > > >> to have an ASF managed backend for next generation data-driven
> > > > > > >> apps that I see emerging quite rapidly.
> > > > > > >>
> > > > > > >> The proposal looks great to me (although I'd recommend calling
> > > > > > >> out Scala as an implementation language more prominently since
> > > > > > >> it may attract additional developers with affinity to it).
> > > > > > >>
> > > > > > >> I do have two questions about technology:
> > > > > > >>     1. do you think it would be possible to leverage Apache Beam
> > > > > > >>         (incubating) for abstracting away dependency on execution
> > > > > > >>         frameworks? My understanding is that PredictionIO
> > > > > > >>         currently only runs on Spark.
> > > > > > >>     2. is there a potential integration with Apache Zeppelin
> > > > > > >>         possible?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Roman.
> > > > > >>
> > > > > > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell
> > > > > > >> <apurtell@apache.org> wrote:
> > > > > >>
> > > > > >>> Greetings,
> > > > > >>>
> > > > > > >>> It is my pleasure to propose the PredictionIO project for
> > > > > > >>> incubation at the Apache Software Foundation.
> > > > > > >>>
> > > > > > >>> PredictionIO is a popular open source Machine Learning Server
> > > > > > >>> built on top of a state-of-the-art open source stack, including
> > > > > > >>> several Apache technologies, that enables developers to manage
> > > > > > >>> and deploy production-ready predictive services for various
> > > > > > >>> kinds of machine learning tasks, with more than 400 production
> > > > > > >>> deployments around the world and a growing contributor
> > > > > > >>> community.
> > > > > >>>
> > > > > >>>
> > > > > > >>> The text of the proposal is included below and is also
> > > > > > >>> available at https://wiki.apache.org/incubator/PredictionIO
> > > > > >>>
> > > > > >>> Best regards,
> > > > > >>> Andrew Purtell
> > > > > >>>
> > > > > >>>
> > > > > >>> = PredictionIO Proposal =
> > > > > >>>
> > > > > >>> === Abstract ===
> > > > > > >>> PredictionIO is an open source Machine Learning Server built on
> > > > > > >>> top of a state-of-the-art open source stack that enables
> > > > > > >>> developers to manage and deploy production-ready predictive
> > > > > > >>> services for various kinds of machine learning tasks.
> > > > > >>>
> > > > > >>> === Proposal ===
> > > > > > >>> The PredictionIO platform consists of the following components:
> > > > > > >>>
> > > > > > >>>   * PredictionIO framework - provides the machine learning
> > > > > > >>>   stack for building, evaluating and deploying engines with
> > > > > > >>>   machine learning algorithms. It uses Apache Spark for
> > > > > > >>>   processing.
> > > > > > >>>
> > > > > > >>>   * Event Server - the machine learning analytics layer for
> > > > > > >>>   unifying events from multiple platforms. It can use Apache
> > > > > > >>>   HBase or any JDBC backend as its data store.
> > > > > > >>>
> > > > > > >>> The PredictionIO community also maintains a Template Gallery, a
> > > > > > >>> place to publish and download (free or proprietary) engine
> > > > > > >>> templates for different types of machine learning applications,
> > > > > > >>> which is a complementary part of the project. At this point we
> > > > > > >>> exclude the Template Gallery from the proposal, as it has a
> > > > > > >>> separate set of contributors and we’re not familiar with an
> > > > > > >>> Apache approved mechanism to maintain such a gallery.
> > > > > > >>>
> > > > > > >>> You can find the Template Gallery at
> > > > > > >>> https://templates.prediction.io/
> > > > > >>>
> > > > > >>> === Background ===
> > > > > > >>> PredictionIO was started with a mission to democratize and
> > > > > > >>> bring machine learning to the masses.
> > > > > > >>>
> > > > > > >>> Machine learning has traditionally been a luxury for big
> > > > > > >>> companies like Google, Facebook, and Netflix. There are ML
> > > > > > >>> libraries and tools lying around the internet, but the effort
> > > > > > >>> of putting them all together as a production-ready
> > > > > > >>> infrastructure is a very resource-intensive task that is only
> > > > > > >>> remotely reachable by individuals or small businesses.
> > > > > > >>>
> > > > > > >>> PredictionIO is a production-ready, full stack machine learning
> > > > > > >>> system that allows organizations of any scale to quickly deploy
> > > > > > >>> machine learning capabilities. It comes with official and
> > > > > > >>> community-contributed machine learning engine templates that
> > > > > > >>> are easy to customize.
> > > > > >>>
> > > > > >>> === Rationale ===
> > > > > > >>> As the usage of and the number of contributors to PredictionIO
> > > > > > >>> have grown larger and more diverse, we have sought an
> > > > > > >>> independent framework for the project to keep thriving. We
> > > > > > >>> believe the Apache foundation is a great fit. Joining Apache
> > > > > > >>> would ensure that tried and true processes and procedures are
> > > > > > >>> in place for the growing number of organizations interested in
> > > > > > >>> contributing to PredictionIO. PredictionIO is also a good fit
> > > > > > >>> for the Apache foundation. PredictionIO was built on top of
> > > > > > >>> several Apache projects (HBase, Spark, Hadoop). We are familiar
> > > > > > >>> with the Apache process and believe that the democratic and
> > > > > > >>> meritocratic nature of the foundation aligns with the project
> > > > > > >>> goals.
> > > > > >>>
> > > > > >>> === Initial Goals ===
> > > > > > >>> The initial milestones will be to move the existing codebase to
> > > > > > >>> Apache and integrate with the Apache development process. Once
> > > > > > >>> this is accomplished, we plan for incremental development and
> > > > > > >>> releases that follow the Apache guidelines, as well as growing
> > > > > > >>> our developer and user communities.
> > > > > >>>
> > > > > >>> === Current Status ===
> > > > > > >>> PredictionIO has undergone nine minor releases and many
> > > > > > >>> patches. PredictionIO is being used in production by
> > > > > > >>> Salesforce.com as well as many other organizations and apps.
> > > > > > >>> The PredictionIO codebase is currently hosted at GitHub, which
> > > > > > >>> will form the basis of the Apache git repository.
> > > > > >>>
> > > > > >>> ==== Meritocracy ====
> > > > > > >>> We plan to invest in supporting a meritocracy. We will discuss
> > > > > > >>> the requirements in an open forum. We intend to invite
> > > > > > >>> additional developers to participate. We will encourage and
> > > > > > >>> monitor community participation so that privileges can be
> > > > > > >>> extended to those that contribute.
> > > > > >>>
> > > > > >>> ==== Community ====
> > > > > > >>> Acceptance into the Apache foundation would bolster the already
> > > > > > >>> strong user and developer community around PredictionIO. That
> > > > > > >>> community includes many contributors from various other
> > > > > > >>> companies, and an active mailing list composed of hundreds of
> > > > > > >>> users.
> > > > > >>>
> > > > > >>> ==== Core Developers ====
> > > > > > >>> The core developers of our project are listed in our
> > > > > > >>> contributors and initial PPMC below. Though many are employed
> > > > > > >>> at Salesforce.com, there are also engineers from ActionML, and
> > > > > > >>> independent developers.
> > > > > >>>
> > > > > >>> === Alignment ===
> > > > > > >>> The ASF is the natural choice to host the PredictionIO project
> > > > > > >>> as its goal is democratizing Machine Learning by making it more
> > > > > > >>> easily accessible to every user/developer. PredictionIO is
> > > > > > >>> built on top of several top level Apache projects as outlined
> > > > > > >>> above.
> > > > > >>>
> > > > > >>> === Known Risks ===
> > > > > >>>
> > > > > >>> ==== Orphaned products ====
> > > > > > >>> PredictionIO has a solid and growing community. It is deployed
> > > > > > >>> in production environments by companies of all sizes to run
> > > > > > >>> various kinds of predictive engines.
> > > > > > >>>
> > > > > > >>> In addition to the community contribution to the PredictionIO
> > > > > > >>> framework, the community is also actively contributing new
> > > > > > >>> engines to the Template Gallery as well as SDKs and
> > > > > > >>> documentation for the project. Salesforce is committed to
> > > > > > >>> utilizing and advancing the PredictionIO code base and
> > > > > > >>> supporting its user community.
> > > > > >>>
> > > > > >>> ==== Inexperience with Open Source ====
> > > > > > >>> PredictionIO has existed as a healthy open source project for
> > > > > > >>> almost two years and is the most starred Scala project on
> > > > > > >>> GitHub. All of the proposed committers have contributed to ASF
> > > > > > >>> and Linux Foundation open source projects. Several current
> > > > > > >>> committers on Apache projects and Apache Members are involved
> > > > > > >>> in this proposal and intend to provide mentorship.
> > > > > >>>
> > > > > >>> ==== Homogeneous Developers ====
> > > > > > >>> The initial list of committers includes developers from several
> > > > > > >>> institutions, including Salesforce, ActionML, Channel4, USC as
> > > > > > >>> well as unaffiliated developers.
> > > > > >>>
> > > > > >>> ==== Reliance on Salaried Developers ====
> > > > > > >>> Like most open source projects, PredictionIO receives
> > > > > > >>> substantial support from salaried developers. PredictionIO
> > > > > > >>> development is partially supported by Salesforce.com, but there
> > > > > > >>> are many contributors from various other companies, and an
> > > > > > >>> active mailing list composed of hundreds of users. We will
> > > > > > >>> continue our efforts to ensure that stewardship of the project
> > > > > > >>> is independent of salaried developers by meritocratically
> > > > > > >>> promoting those contributors to committers.
> > > > > >>>
> > > > > > >>> ==== Relationships with Other Apache Products ====
> > > > > > >>> PredictionIO relies heavily on top level Apache projects such
> > > > > > >>> as Apache Spark, HBase and Hadoop. However, it brings distinct
> > > > > > >>> functionality, rather than just an abstraction - Machine
> > > > > > >>> Learning in a plug-and-play fashion.
> > > > > > >>>
> > > > > > >>> Compared to Apache Mahout, which focuses on the development of
> > > > > > >>> a wide variety of algorithms, PredictionIO offers a platform to
> > > > > > >>> manage the whole machine learning workflow, including data
> > > > > > >>> collection, data preparation, modeling, deployment and
> > > > > > >>> management of predictive services in production environments.
> > > > > >>>
> > > > > > >>> ==== An Excessive Fascination with the Apache Brand ====
> > > > > > >>> PredictionIO is already a widely known open source project.
> > > > > > >>> This proposal is not for the purpose of generating publicity.
> > > > > > >>> Rather, the primary benefits to joining Apache are those
> > > > > > >>> outlined in the Rationale section.
> > > > > >>>
> > > > > >>> === Documentation ===
> > > > > > >>> PredictionIO boasts rich and live documentation, which is
> > > > > > >>> included in the code repo (docs/manual directory), built with
> > > > > > >>> Middleman, and publicly hosted at https://docs.prediction.io
> > > > > >>>
> > > > > > >>> === Initial Source and Intellectual Property Submission Plan ===
> > > > > > >>> Currently, the PredictionIO codebase is distributed under the
> > > > > > >>> Apache 2.0 License and hosted on GitHub:
> > > > > > >>> https://github.com/PredictionIO/PredictionIO
> > > > > >>>
> > > > > >>> === External Dependencies ===
> > > > > > >>> PredictionIO has the following external dependencies:
> > > > > > >>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and
> > > > > > >>>   HDFS are needed)
> > > > > > >>>   * Apache Spark 1.3.0 for Hadoop 2.4
> > > > > > >>>   * Java SE Development Kit 8
> > > > > > >>>   * and one of the following sets:
> > > > > > >>>
> > > > > > >>>     * PostgreSQL 9.1
> > > > > > >>>
> > > > > > >>>   or
> > > > > > >>>
> > > > > > >>>     * MySQL 5.1
> > > > > > >>>
> > > > > > >>>   or
> > > > > > >>>
> > > > > > >>>     * Apache HBase 0.98.6
> > > > > > >>>     * Elasticsearch 1.4.0
> > > > > >>>
> > > > > > >>> Upon acceptance to the incubator, we would begin a thorough
> > > > > > >>> analysis of all transitive dependencies to verify this
> > > > > > >>> information and introduce license checking into the build and
> > > > > > >>> release process by integrating with Apache RAT.
> > > > > >>>
> > > > > >>> === Cryptography ===
> > > > > > >>> PredictionIO does not include cryptographic code. We utilize
> > > > > > >>> standard JCE and JSSE APIs provided by the Java Runtime
> > > > > > >>> Environment.
> > > > > >>>
> > > > > >>> === Required Resources ===
> > > > > > >>> We request that the following resources be created for the
> > > > > > >>> project to use:
> > > > > >>>
> > > > > >>> ==== Mailing lists ====
> > > > > >>>
> > > > > > >>> predictionio-private@incubator.apache.org (with moderated
> > > > > > >>> subscriptions)
> > > > > >>>
> > > > > >>> predictionio-dev
> > > > > >>>
> > > > > >>> predictionio-user
> > > > > >>>
> > > > > >>> predictionio-commits
> > > > > >>>
> > > > > >>> We will migrate the existing PredictionIO mailing lists.
> > > > > >>>
> > > > > >>> ==== Git repository ====
> > > > > > >>> The PredictionIO team would like to use Git for source control,
> > > > > > >>> due to our current use of GitHub.
> > > > > >>>
> > > > > >>> git://git.apache.org/incubator-predictionio
> > > > > >>>
> > > > > >>> ==== Documentation ====
> > > > > >>> https://predictionio.incubator.apache.org/docs/
> > > > > >>>
> > > > > >>> ==== JIRA instance ====
> > > > > > >>> PredictionIO currently uses the GitHub issue tracking system
> > > > > > >>> associated with its repository:
> > > > > > >>> https://github.com/PredictionIO/PredictionIO/issues .
> > > > > > >>> We will migrate to Apache JIRA.
> > > > > >>>
> > > > > >>> JIRA PREDICTIONIO
> > > > > >>> https://issues.apache.org/jira/browse/PREDICTIONIO
> > > > > >>>
> > > > > >>> ==== Other Resources ====
> > > > > >>> * TravisCI for builds and test running.
> > > > > >>>
> > > > > > >>> * PredictionIO's documentation, included in the code repo
> > > > > > >>> (docs/manual directory), is built with Middleman and publicly
> > > > > > >>> hosted at https://docs.prediction.io
> > > > > > >>>
> > > > > > >>> * A blog to drive adoption and excitement at
> > > > > > >>> https://blog.prediction.io
> > > > > >>>
> > > > > >>> === Initial Committers ===
> > > > > >>>
> > > > > >>> * Pat Ferrell
> > > > > >>>
> > > > > >>> * Tamas Jambor
> > > > > >>>
> > > > > >>> * Justin Yip
> > > > > >>>
> > > > > >>> * Xusen Yin
> > > > > >>>
> > > > > >>> * Lee Moon Soo
> > > > > >>>
> > > > > >>> * Donald Szeto
> > > > > >>>
> > > > > >>> * Kenneth Chan
> > > > > >>>
> > > > > >>> * Tom Chan
> > > > > >>>
> > > > > >>> * Simon Chan
> > > > > >>>
> > > > > >>> * Marco Vivero
> > > > > >>>
> > > > > >>> * Matthew Tovbin
> > > > > >>>
> > > > > >>> * Yevgeny Khodorkovsky
> > > > > >>>
> > > > > >>> * Felipe Oliveira
> > > > > >>>
> > > > > >>> * Vitaly Gordon
> > > > > >>>
> > > > > >>> === Affiliations ===
> > > > > >>>
> > > > > >>> * Pat Ferrell - ActionML
> > > > > >>>
> > > > > >>> * Tamas Jambor - Channel4
> > > > > >>>
> > > > > >>> * Justin Yip - independent
> > > > > >>>
> > > > > >>> * Xusen Yin - USC
> > > > > >>>
> > > > > >>> * Lee Moon Soo - NFLabs
> > > > > >>>
> > > > > >>> * Donald Szeto - Salesforce
> > > > > >>>
> > > > > >>> * Kenneth Chan - Salesforce
> > > > > >>>
> > > > > >>> * Tom Chan - Salesforce
> > > > > >>>
> > > > > >>> * Simon Chan - Salesforce
> > > > > >>>
> > > > > >>> * Marco Vivero - Salesforce
> > > > > >>>
> > > > > >>> * Matthew Tovbin - Salesforce
> > > > > >>>
> > > > > >>> * Yevgeny Khodorkovsky - Salesforce
> > > > > >>>
> > > > > >>> * Felipe Oliveira - Salesforce
> > > > > >>>
> > > > > >>> * Vitaly Gordon - Salesforce
> > > > > >>>
> > > > > >>> === Sponsors ===
> > > > > >>>
> > > > > >>> ==== Champion ====
> > > > > >>>
> > > > > >>> Andrew Purtell <apurtell at apache dot org>
> > > > > >>>
> > > > > >>> ==== Nominated Mentors ====
> > > > > >>>
> > > > > >>> * Andrew Purtell <apurtell at apache dot org>
> > > > > >>>
> > > > > >>> * James Taylor <jtaylor at apache dot org>
> > > > > >>>
> > > > > >>> * Lars Hofhansl <larsh at apache dot org>
> > > > > >>>
> > > > > >>> * Suneel Marthi <smarthi at apache dot org>
> > > > > >>>
> > > > > >>> * Xiangrui Meng <meng at apache dot org>
> > > > > >>>
> > > > > >>> * Luciano Resende <lresende at apache dot org>
> > > > > >>>
> > > > > >>> ==== Sponsoring Entity ====
> > > > > >>>
> > > > > >>> Apache Incubator PMC
> > > > > >>>
> > > > > >>
> > > > > >>
> > > ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail:
> general-unsubscribe@incubator.apache.org
> > > > > >> For additional commands, e-mail:
> > general-help@incubator.apache.org
> > > > > >>
> > > > > >>
> > > > > > --
> > > > > > Jean-Baptiste Onofré
> > > > > > jbonofre@apache.org
> > > > > > http://blog.nanthrax.net
> > > > > > Talend - http://www.talend.com
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
