incubator-general mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: [DISCUSS] PredictionIO incubation proposal
Date Tue, 17 May 2016 21:11:14 GMT
As a mentor, you will have karma to commit to the source repository.

As you probably know, the initial committers and mentors will form the
initial PPMC for the podling.
Hopefully, for day-to-day operations, you should not need a distinction
between committers and mentors anymore.

You do not have to be listed as a committer in the proposal.

- Henry

On Tue, May 17, 2016 at 1:57 PM, Suneel Marthi <smarthi@apache.org> wrote:

> Thanks for having me as a mentor for PIO.  I would like to be added to the
> initial list of committers and am looking to actively participate in the
> development too. I am not sure if my being a mentor automatically grants me
> the 'commit' karma.
>
> It's already been suggested earlier in this thread by Roman and
> Jean-Baptiste that the project needs to be decoupled from Spark and
> integrated with Beam.  It would be good to reduce the reliance on
> Spark-Submit from what I have seen of the project so far. But let's not
> talk architecture and design here when the project's not in incubator yet.
> :)
>
>
>
>
> On Tue, May 17, 2016 at 4:09 PM, Henry Saputra <henry.saputra@gmail.com>
> wrote:
>
> > Cool, this will make the code grant process easier =)
> >
> > The initial committers and mentors look great.
> > I am sure more will come as contributions start pouring in to the
> > project.
> >
> > Looking forward to the VOTE thread soon.
> >
> > - Henry
> >
> > On Mon, May 16, 2016 at 12:07 PM, Simon Chan <simon@salesforce.com> wrote:
> >
> > > Yes, it includes everyone from PredictionIO who previously contributed
> > > code before the acquisition and still wants to be involved in the project.
> > >
> > > We may have missed "Alex Merritt"; we are going to add him to the list soon.
> > >
> > > Simon
> > >
> > >
> > > On Mon, May 16, 2016 at 11:58 AM, Suneel Marthi <smarthi@apache.org> wrote:
> > > >
> > > > I do have a question about the proposed list of committers.
> > > >
> > > > Does the list also include all of those folks who were with PredictionIO
> > > > (and had contributed to the project) and then chose to leave when PIO
> > > > was acquired by Salesforce?
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, May 16, 2016 at 1:13 PM, Jean-Baptiste Onofré <jb@nanthrax.net> wrote:
> > > > >
> > > > > By the way, we have some discussion about integrating Zeppelin with Beam ;)
> > > > >
> > > > > Regards
> > > > > JB
> > > > >
> > > > > On 05/15/2016 02:32 AM, Roman Shaposhnik wrote:
> > > > >
> > > > > >> Super excited to see this proposal! This will finally allow us to have
> > > > > >> an ASF managed backend for next generation data-driven apps that I see
> > > > > >> emerging quite rapidly.
> > > > > >>
> > > > > >> The proposal looks great to me (although I'd recommend calling out Scala
> > > > > >> as an implementation language more prominently since it may attract
> > > > > >> additional developers with affinity to it).
> > > > > >>
> > > > > >> I do have two questions about technology:
> > > > > >>     1. do you think it would be possible to leverage Apache Beam (incubating)
> > > > > >>         for abstracting away dependency on execution frameworks? My
> > > > > >>         understanding is that PredictionIO currently only runs on Spark.
> > > > > >>     2. is there a potential integration with Apache Zeppelin possible?
> > > > >>
> > > > >> Thanks,
> > > > >> Roman.
> > > > >>
> > > > > >> On Fri, May 13, 2016 at 1:41 PM, Andrew Purtell <apurtell@apache.org> wrote:
> > > > >>
> > > > >>> Greetings,
> > > > >>>
> > > > > >>> It is my pleasure to propose the PredictionIO project for incubation at
> > > > > >>> the Apache Software Foundation.
> > > > > >>>
> > > > > >>> PredictionIO is a popular open source Machine Learning Server built on
> > > > > >>> top of a state-of-the-art open source stack, including several Apache
> > > > > >>> technologies, that enables developers to manage and deploy
> > > > > >>> production-ready predictive services for various kinds of machine
> > > > > >>> learning tasks, with more than 400 production deployments around the
> > > > > >>> world and a growing contributor community.
> > > > > >>>
> > > > > >>> The text of the proposal is included below and is also available at
> > > > > >>> https://wiki.apache.org/incubator/PredictionIO
> > > > >>>
> > > > >>> Best regards,
> > > > >>> Andrew Purtell
> > > > >>>
> > > > >>>
> > > > >>> = PredictionIO Proposal =
> > > > >>>
> > > > >>> === Abstract ===
> > > > > >>> PredictionIO is an open source Machine Learning Server built on top of
> > > > > >>> a state-of-the-art open source stack that enables developers to manage
> > > > > >>> and deploy production-ready predictive services for various kinds of
> > > > > >>> machine learning tasks.
> > > > >>>
> > > > >>> === Proposal ===
> > > > >>> The PredictionIO platform consists of the following components:
> > > > >>>
> > > > > >>>   * PredictionIO framework - provides the machine learning stack for
> > > > > >>>   building, evaluating and deploying engines with machine learning
> > > > > >>>   algorithms. It uses Apache Spark for processing.
> > > > > >>>
> > > > > >>>   * Event Server - the machine learning analytics layer for unifying
> > > > > >>>   events from multiple platforms. It can use Apache HBase or any JDBC
> > > > > >>>   backend as its data store.
> > > > > >>>
> > > > > >>> The PredictionIO community also maintains a Template Gallery, a place to
> > > > > >>> publish and download (free or proprietary) engine templates for different
> > > > > >>> types of machine learning applications. It is a complementary part of the
> > > > > >>> project. At this point we exclude the Template Gallery from the proposal,
> > > > > >>> as it has a separate set of contributors and we’re not familiar with an
> > > > > >>> Apache approved mechanism to maintain such a gallery.
> > > > > >>>
> > > > > >>> You can find the Template Gallery at https://templates.prediction.io/
> > > > >>>
> > > > >>> === Background ===
> > > > > >>> PredictionIO was started with a mission to democratize and bring machine
> > > > > >>> learning to the masses.
> > > > > >>>
> > > > > >>> Machine learning has traditionally been a luxury for big companies like
> > > > > >>> Google, Facebook, and Netflix. There are ML libraries and tools lying
> > > > > >>> around the internet, but putting them all together as a production-ready
> > > > > >>> infrastructure is a very resource-intensive task that is barely within
> > > > > >>> reach of individuals or small businesses.
> > > > > >>>
> > > > > >>> PredictionIO is a production-ready, full stack machine learning system
> > > > > >>> that allows organizations of any scale to quickly deploy machine learning
> > > > > >>> capabilities. It comes with official and community-contributed machine
> > > > > >>> learning engine templates that are easy to customize.
> > > > >>>
> > > > >>> === Rationale ===
> > > > > >>> As the usage of and number of contributors to PredictionIO have grown
> > > > > >>> larger and more diverse, we have sought an independent framework for the
> > > > > >>> project to keep thriving. We believe the Apache foundation is a great fit.
> > > > > >>> Joining Apache would ensure that tried and true processes and procedures
> > > > > >>> are in place for the growing number of organizations interested in
> > > > > >>> contributing to PredictionIO. PredictionIO is also a good fit for the
> > > > > >>> Apache foundation. PredictionIO was built on top of several Apache
> > > > > >>> projects (HBase, Spark, Hadoop). We are familiar with the Apache process
> > > > > >>> and believe that the democratic and meritocratic nature of the foundation
> > > > > >>> aligns with the project goals.
> > > > >>>
> > > > >>> === Initial Goals ===
> > > > > >>> The initial milestones will be to move the existing codebase to Apache
> > > > > >>> and integrate with the Apache development process. Once this is
> > > > > >>> accomplished, we plan for incremental development and releases that
> > > > > >>> follow the Apache guidelines, as well as growing our developer and user
> > > > > >>> communities.
> > > > >>>
> > > > >>> === Current Status ===
> > > > > >>> PredictionIO has undergone nine minor releases and many patches.
> > > > > >>> PredictionIO is being used in production by Salesforce.com as well as
> > > > > >>> many other organizations and apps. The PredictionIO codebase is currently
> > > > > >>> hosted at GitHub, which will form the basis of the Apache git repository.
> > > > >>>
> > > > >>> ==== Meritocracy ====
> > > > > >>> We plan to invest in supporting a meritocracy. We will discuss the
> > > > > >>> requirements in an open forum. We intend to invite additional developers
> > > > > >>> to participate. We will encourage and monitor community participation so
> > > > > >>> that privileges can be extended to those that contribute.
> > > > >>>
> > > > >>> ==== Community ====
> > > > > >>> Acceptance into the Apache foundation would bolster the already strong
> > > > > >>> user and developer community around PredictionIO. That community includes
> > > > > >>> many contributors from various other companies, and an active mailing
> > > > > >>> list composed of hundreds of users.
> > > > >>>
> > > > >>> ==== Core Developers ====
> > > > > >>> The core developers of our project are listed in our contributors and
> > > > > >>> initial PPMC below. Though many are employed at Salesforce.com, there are
> > > > > >>> also engineers from ActionML, and independent developers.
> > > > >>>
> > > > >>> === Alignment ===
> > > > > >>> The ASF is the natural choice to host the PredictionIO project as its
> > > > > >>> goal is democratizing Machine Learning by making it more easily accessible
> > > > > >>> to every user/developer. PredictionIO is built on top of several top
> > > > > >>> level Apache projects as outlined above.
> > > > >>>
> > > > >>> === Known Risks ===
> > > > >>>
> > > > >>> ==== Orphaned products ====
> > > > > >>> PredictionIO has a solid and growing community. It is deployed on
> > > > > >>> production environments by companies of all sizes to run various kinds of
> > > > > >>> predictive engines.
> > > > > >>>
> > > > > >>> In addition to the community contribution to the PredictionIO framework,
> > > > > >>> the community is also actively contributing new engines to the Template
> > > > > >>> Gallery as well as SDKs and documentation for the project. Salesforce is
> > > > > >>> committed to utilizing and advancing the PredictionIO code base and
> > > > > >>> supporting its user community.
> > > > >>>
> > > > >>> ==== Inexperience with Open Source ====
> > > > > >>> PredictionIO has existed as a healthy open source project for almost two
> > > > > >>> years and is the most starred Scala project on GitHub. All of the
> > > > > >>> proposed committers have contributed to ASF and Linux Foundation open
> > > > > >>> source projects. Several current committers on Apache projects and Apache
> > > > > >>> Members are involved in this proposal and intend to provide mentorship.
> > > > >>>
> > > > >>> ==== Homogeneous Developers ====
> > > > > >>> The initial list of committers includes developers from several
> > > > > >>> institutions, including Salesforce, ActionML, Channel4, USC as well as
> > > > > >>> unaffiliated developers.
> > > > >>>
> > > > >>> ==== Reliance on Salaried Developers ====
> > > > > >>> Like most open source projects, PredictionIO receives substantial support
> > > > > >>> from salaried developers. PredictionIO development is partially supported
> > > > > >>> by Salesforce.com, but there are many contributors from various other
> > > > > >>> companies, and an active mailing list composed of hundreds of users. We
> > > > > >>> will continue our efforts to ensure stewardship of the project is
> > > > > >>> independent of salaried developers by meritocratically promoting those
> > > > > >>> contributors to committers.
> > > > >>>
> > > > > >>> ==== Relationships with Other Apache Products ====
> > > > > >>> PredictionIO relies heavily on top level Apache projects such as Apache
> > > > > >>> Spark, HBase and Hadoop. However, it brings distinct functionality
> > > > > >>> rather than just an abstraction: Machine Learning in a plug-and-play
> > > > > >>> fashion.
> > > > > >>>
> > > > > >>> Compared to Apache Mahout, which focuses on the development of a wide
> > > > > >>> variety of algorithms, PredictionIO offers a platform to manage the whole
> > > > > >>> machine learning workflow, including data collection, data preparation,
> > > > > >>> modeling, deployment and management of predictive services in production
> > > > > >>> environments.
> > > > >>>
> > > > >>> ==== An Excessive Fascination with the Apache Brand ====
> > > > > >>> PredictionIO is already a widely known open source project. This proposal
> > > > > >>> is not for the purpose of generating publicity. Rather, the primary
> > > > > >>> benefits to joining Apache are those outlined in the Rationale section.
> > > > >>>
> > > > >>> === Documentation ===
> > > > > >>> PredictionIO boasts rich and live documentation, which is included in the
> > > > > >>> code repo (docs/manual directory), built with Middleman, and publicly
> > > > > >>> hosted at https://docs.prediction.io
> > > > >>>
> > > > > >>> === Initial Source and Intellectual Property Submission Plan ===
> > > > > >>> Currently, the PredictionIO codebase is distributed under the Apache 2.0
> > > > > >>> License and hosted on GitHub:
> > > > > >>> https://github.com/PredictionIO/PredictionIO
> > > > >>>
> > > > >>> === External Dependencies ===
> > > > >>> PredictionIO has the following external dependencies:
> > > > > >>>   * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are
> > > > > >>>   needed)
> > > > > >>>   * Apache Spark 1.3.0 for Hadoop 2.4
> > > > > >>>   * Java SE Development Kit 8
> > > > > >>>   * and one of the following sets:
> > > > > >>>     * PostgreSQL 9.1
> > > > > >>>     or
> > > > > >>>     * MySQL 5.1
> > > > > >>>     or
> > > > > >>>     * Apache HBase 0.98.6
> > > > > >>>     * Elasticsearch 1.4.0
> > > > >>>
> > > > > >>> Upon acceptance to the incubator, we would begin a thorough analysis of
> > > > > >>> all transitive dependencies to verify this information and introduce
> > > > > >>> license checking into the build and release process by integrating with
> > > > > >>> Apache RAT.
> > > > >>>
> > > > >>> === Cryptography ===
> > > > >>> PredictionIO does not include cryptographic code. We utilize
> > standard
> > > > >>> JCE and JSSE APIs provided by the Java Runtime Environment.
> > > > >>>
> > > > >>> === Required Resources ===
> > > > > >>> We request that the following resources be created for the project to use:
> > > > >>>
> > > > >>> ==== Mailing lists ====
> > > > >>>
> > > > > >>> predictionio-private@incubator.apache.org (with moderated subscriptions)
> > > > >>>
> > > > >>> predictionio-dev
> > > > >>>
> > > > >>> predictionio-user
> > > > >>>
> > > > >>> predictionio-commits
> > > > >>>
> > > > >>> We will migrate the existing PredictionIO mailing lists.
> > > > >>>
> > > > >>> ==== Git repository ====
> > > > > >>> The PredictionIO team would like to use Git for source control, due to
> > > > > >>> our current use of GitHub.
> > > > >>>
> > > > >>> git://git.apache.org/incubator-predictionio
> > > > >>>
> > > > >>> ==== Documentation ====
> > > > >>> https://predictionio.incubator.apache.org/docs/
> > > > >>>
> > > > >>> ==== JIRA instance ====
> > > > > >>> PredictionIO currently uses the GitHub issue tracking system associated
> > > > > >>> with its repository:
> > > > > >>> https://github.com/PredictionIO/PredictionIO/issues
> > > > > >>> We will migrate to Apache JIRA.
> > > > >>>
> > > > >>> JIRA PREDICTIONIO
> > > > >>> https://issues.apache.org/jira/browse/PREDICTIONIO
> > > > >>>
> > > > >>> ==== Other Resources ====
> > > > >>> * TravisCI for builds and test running.
> > > > >>>
> > > > > >>> * PredictionIO's documentation, included in the code repo (docs/manual
> > > > > >>> directory), is built with Middleman and publicly hosted at
> > > > > >>> https://docs.prediction.io
> > > > > >>>
> > > > > >>> * A blog to drive adoption and excitement at https://blog.prediction.io
> > > > >>>
> > > > >>> === Initial Committers ===
> > > > >>>
> > > > >>> * Pat Ferrell
> > > > >>>
> > > > >>> * Tamas Jambor
> > > > >>>
> > > > >>> * Justin Yip
> > > > >>>
> > > > >>> * Xusen Yin
> > > > >>>
> > > > >>> * Lee Moon Soo
> > > > >>>
> > > > >>> * Donald Szeto
> > > > >>>
> > > > >>> * Kenneth Chan
> > > > >>>
> > > > >>> * Tom Chan
> > > > >>>
> > > > >>> * Simon Chan
> > > > >>>
> > > > >>> * Marco Vivero
> > > > >>>
> > > > >>> * Matthew Tovbin
> > > > >>>
> > > > >>> * Yevgeny Khodorkovsky
> > > > >>>
> > > > >>> * Felipe Oliveira
> > > > >>>
> > > > >>> * Vitaly Gordon
> > > > >>>
> > > > >>> === Affiliations ===
> > > > >>>
> > > > >>> * Pat Ferrell - ActionML
> > > > >>>
> > > > >>> * Tamas Jambor - Channel4
> > > > >>>
> > > > >>> * Justin Yip - independent
> > > > >>>
> > > > >>> * Xusen Yin - USC
> > > > >>>
> > > > >>> * Lee Moon Soo - NFLabs
> > > > >>>
> > > > >>> * Donald Szeto - Salesforce
> > > > >>>
> > > > >>> * Kenneth Chan - Salesforce
> > > > >>>
> > > > >>> * Tom Chan - Salesforce
> > > > >>>
> > > > >>> * Simon Chan - Salesforce
> > > > >>>
> > > > >>> * Marco Vivero - Salesforce
> > > > >>>
> > > > >>> * Matthew Tovbin - Salesforce
> > > > >>>
> > > > >>> * Yevgeny Khodorkovsky - Salesforce
> > > > >>>
> > > > >>> * Felipe Oliveira - Salesforce
> > > > >>>
> > > > >>> * Vitaly Gordon - Salesforce
> > > > >>>
> > > > >>> === Sponsors ===
> > > > >>>
> > > > >>> ==== Champion ====
> > > > >>>
> > > > >>> Andrew Purtell <apurtell at apache dot org>
> > > > >>>
> > > > >>> ==== Nominated Mentors ====
> > > > >>>
> > > > >>> * Andrew Purtell <apurtell at apache dot org>
> > > > >>>
> > > > >>> * James Taylor <jtaylor at apache dot org>
> > > > >>>
> > > > >>> * Lars Hofhansl <larsh at apache dot org>
> > > > >>>
> > > > >>> * Suneel Marthi <smarthi at apache dot org>
> > > > >>>
> > > > >>> * Xiangrui Meng <meng at apache dot org>
> > > > >>>
> > > > >>> * Luciano Resende <lresende at apache dot org>
> > > > >>>
> > > > >>> ==== Sponsoring Entity ====
> > > > >>>
> > > > >>> Apache Incubator PMC
> > > > >>>
> > > > >>
> > > > >>
> > > > > >> ---------------------------------------------------------------------
> > > > > >> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > > > > >> For additional commands, e-mail: general-help@incubator.apache.org
> > > > >>
> > > > >>
> > > > > --
> > > > > Jean-Baptiste Onofré
> > > > > jbonofre@apache.org
> > > > > http://blog.nanthrax.net
> > > > > Talend - http://www.talend.com
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
>
