incubator-general mailing list archives

From Lucas Bonatto Miguel <lucasbona...@gmail.com>
Subject Re: [VOTE] Accept Marvin-AI into Apache Incubator
Date Wed, 22 Aug 2018 19:17:49 GMT
Hi Matt, the paper is scheduled to be released in the proceedings of the
JMLR (Journal of Machine Learning Research) volume 82.

There are in fact code samples in the paper; however, that code only uses
Marvin APIs to implement an engine and does not contain Marvin's
implementation code. By the way, that code also uses Apache Spark APIs.
Let me know if you think it's still an issue.

On Wed, Aug 22, 2018 at 4:00 PM Matt Sicker <boards@gmail.com> wrote:

> What license is the Marvin paper distributed under? There are code samples
> in that paper as well which have no license.
>
> On Tue, 21 Aug 2018 at 16:11, Luciano Resende <luckbr1975@gmail.com>
> wrote:
>
> > Of course, my +1 (binding)
> >
> > On Tue, Aug 21, 2018 at 10:43 AM Luciano Resende <luckbr1975@gmail.com>
> > wrote:
> >
> > > After the initial discussion, please vote on the acceptance of the
> > > Marvin-AI project for incubation at the Apache Incubator. The full
> > > proposal is available at the end of this message and on the wiki at:
> > >
> > > https://wiki.apache.org/incubator/Marvin-AI
> > >
> > > Please cast your votes:
> > >
> > > [ ] +1, bring Marvin-AI into Incubator
> > > [ ] +0, I don't care either way
> > > [ ] -1, do not bring Marvin-AI into Incubator, because...
> > >
> > > The vote is open for the next 72 hours and only votes from the
> > > Incubator PMC are binding.
> > >
> > > ===
> > >
> > > = Marvin-AI =
> > >
> > > == Abstract ==
> > >
> > > Marvin-AI is an open-source artificial intelligence (AI) platform that
> > > helps data scientists prototype and productionalize complex solutions
> > > with a scalable, low-latency, language-agnostic, and standardized
> > > architecture while simplifying the process of exploration and modeling.
> > >
> > > == Proposal ==
> > >
> > > Marvin helps inexperienced developers create industry-grade AI
> > > applications. It has three core components: a development environment
> > > used during data exploration and hypothesis validation (Toolbox), a
> > > library that is extended to create Marvin engines, and a Scala
> > > application server that interprets engines (Engine Executor).
> > > A basic premise of Marvin is that it should be language-agnostic, able
> > > to interpret engines implemented in different programming languages.
> > >
> > > == Background ==
> > >
> > > The Marvin AI project was initiated as an internal project at B2W
> Digital
> > > (Brazil), the largest e-commerce company in Latin America. Nowadays, it
> > is
> > > used by all data scientists within the B2W team. Oftentimes, data
> > > scientists don't have an extensive background in software engineering,
> > yet
> > > are in charge of creating AI applications that need to scale to high
> > > throughput and provide millisecond-level response times. At B2W, Marvin
> > AI
> > > plays an important role in this process, abstracting advanced software
> > > engineering procedures, allowing data scientists to focus on their
> > > knowledge domain.
> > >
> > > == Rationale ==
> > >
> > > With recent advances in computer architecture and a corresponding
> > increase
> > > in the amount of data generated by always-connected devices, AI
> > algorithms
> > > offer a solution to problems that have long troubled modern
> corporations.
> > > Since AI developers come from various fields, such as statistics,
> > physics,
> > > and math, there exists a strong need for platforms which enable them to
> > > move from prototypes to enterprise applications. Although some tools
> > claim
> > > to offer this service, in reality, there is no reliable open-source
> > > solution.
> > >
> > > == Initial Goals ==
> > >
> > > The initial goals will most likely be to merge the existing codebase
> into
> > > a single repository, migrate it to Apache, and then integrate with the
> > > Apache development process. Furthermore, we plan for incremental
> > > development and releases, as per Apache guidelines.
> > >
> > > == Current Status ==
> > >
> > > === Meritocracy ===
> > >
> > > Marvin already operates on meritocratic principles, and it already has
> > > contributors from other institutions. Although there is no formal
> > > process for becoming a committer, contributors who make major changes
> > > or improvements to the platform are naturally granted write access to
> > > the repository.
> > >
> > >
> > > === Community ===
> > >
> > > Acceptance into the Apache Software Foundation would substantially boost
> > > both Marvin's user and developer communities. The current community
> > > includes a few experienced developers who have either academic or
> > > professional experience with AI. The community is composed largely of
> > > data scientists working at B2W and at other organizations such as
> > > Cloudera, MIT, Qume Labs, Laguro.com, and CBYK. There is also a meetup
> > > group of hundreds of users who meet regularly to exchange ideas about
> > > Marvin and, more generally, AI.
> > >
> > > Reference to the group: https://www.meetup.com/marvin-ai/members/
> > >
> > > === Core Developers ===
> > >
> > > The core developers for Marvin are listed in the contributors list and
> > > initial PPMC below. These lists include B2W employees, MIT students,
> > > UFSCar researchers, independent contributors, and employees of other
> > > companies such as Cloudera, Qume Labs, Laguro.com, and CBYK.
> > >
> > > === Alignment ===
> > >
> > > The initial committers strongly believe that by being part of the
> > > Apache Software Foundation, Marvin AI will become part of a
> > > comprehensive suite for AI applications that can process big data and
> > > enable enterprises to extract value from their data lakes. We also hope
> > > that integrating with other Apache projects, such as Apache Spark and
> > > Apache Hadoop, will foster additional collaboration between these
> > > projects, furthering the existing integration points and expanding the
> > > community of contributors.
> > >
> > >
> > > == Known Risks ==
> > >
> > > === Orphaned products ===
> > >
> > > Given the current maturity of Marvin and how well it has been received
> at
> > > technical conferences, the risk of the project being abandoned is
> > minimal.
> > > AI is not academia-exclusive anymore, and as enterprises start to add
> > > data-science pipelines to their applications, demand for Marvin will
> only
> > > increase.
> > >
> > > === Inexperience with Open Source ===
> > >
> > > Marvin AI has been an open-source project since October 2017. The
> > > project was started in a company where open-source culture is
> > > foundational: B2W Digital runs the largest e-commerce operation in
> > > Latin America on top of open-source projects.
> > >
> > > === Reliance on Salaried Developers ===
> > >
> > > Marvin AI receives substantial effort from salaried developers, a few
> > > of whom were hired by companies to work exclusively on the project, but
> > > the majority devote after-hours or spare time to it. Some developers
> > > are graduate students who contribute in their free time at school.
> > >
> > > === Relationships with Other Apache Products ===
> > >
> > > Marvin integrates with several Apache products, such as Hadoop (HDFS)
> > > and Spark. Marvin shares some features with PredictionIO, specifically
> > > the model application server and a design pattern inspired by DASE.
> > > Despite these similarities, Marvin is geared toward a different
> > > clientele (data scientists), and for that reason it includes many
> > > critical features that are not provided by PredictionIO.
> > >
> > > === An Excessive Fascination with the Apache Brand ===
> > >
> > > While the ASF brand will undoubtedly help Marvin become a successful
> > > project, Marvin is already gaining traction at companies around the
> > globe.
> > >
> > > == Documentation ==
> > >
> > > http://www.marvin-ai.org
> > >
> > >
> > > == Initial Source ==
> > >
> > > The current codebase is available at http://github.com/marvin-ai. This
> > > is practically the same code that will be migrated to the Apache
> > > Software Foundation; the notable difference is that the multiple
> > > repositories will be merged into a single repository (if necessary).
> > >
> > > These are the main repositories, with a brief description of each:
> > >
> > > '''Main repositories'''
> > >
> > >  * marvin-ai/marvin-python-toolbox - Data Science toolbox that helps in
> > > the creation of new ML engines
> > >  * marvin-ai/marvin-engine-executor - Component responsible for
> > > interpreting, serving and managing Marvin engines
> > >  * marvin-ai/marvin-public-engines - Marvin engine examples to help new
> > > Marvin users to build engines
> > >  * marvin-ai/marvin-platform-book - Documentation in GitHub book site
> > > format
> > >
> > > '''Secondary repositories (Experimental and Initial)'''
> > >  * marvin-ai/marvin-vagrant-dev - Development environment that uses
> > > VirtualBox and Vagrant, for users who are not on macOS or Linux
> > >  * marvin-ai/marvin-paper - Source code (LaTeX format) of the first
> > > Marvin paper, published at the PAPIS.io conference in Boston
> > >  * marvin-ai/marvin-cluster-admin - Admin module responsible for
> > > managing a Marvin cluster
> > >  * marvin-ai/marvin-automl - AutoML module that helps data scientists
> > > build machine learning models through a very simple visual interface
> > >
> > >
> > > == External Dependencies ==
> > >
> > > It is very likely that all our dependencies are using either the Apache
> > or
> > > MIT license. Upon acceptance to the incubator, we would begin a
> thorough
> > > analysis of all transitive dependencies to verify this fact and
> introduce
> > > license checking into the build and release process.
> > >
> > > == Required Resources ==
> > >
> > > === Mailing lists ===
> > >
> > >   * private@marvin.incubator.apache.org (with moderated subscriptions)
> > >   * dev@marvin.incubator.apache.org
> > >   * commits@marvin.incubator.apache.org
> > >
> > >
> > > === Git Repositories ===
> > >
> > >   * https://git-wip-us.apache.org/repos/asf/incubator-marvin.git
> > >
> > > === Issue Tracking ===
> > >
> > >   * JIRA (MARVIN)
> > >
> > > == Initial Committers ==
> > >
> > >  * Lucas Bonatto Miguel <lucasbonatto@gmail.com> - Qume Labs
> (California
> > > - USA)
> > >  * Daniel Takabayashi <daniel.takabayashi@gmail.com> - B2W Digital
> (São
> > > Paulo - BR) / Laguro.com (California - USA)
> > >  * Bruno Piraja <bruno.piraja@b2wdigital.com> - B2W Digital (São
> Paulo -
> > > BR)
> > >  * Zhang Yifei <zhang.yifei@b2wdigital.com> - B2W Digital (São Paulo -
> > > BR)
> > >  * Harrison Wang <hwang123@mit.edu> - MIT (USA)
> > >  * Brody West <brodyw@mit.edu> - MIT (USA)
> > >  * Rafael Novello <rafael.novello@b2wdigital.com> - B2W Digital (São
> > > Paulo - BR)
> > >  * Willian Leite <willian.leite@cbyk.com.br> - CBYK (São Paulo - BR)
> > >  * Danilo Nunes <nunesdanilo@gmail.com> - Qume Labs (California - USA)
> > >  * Alan Silva <alan.silva@cloudera.com> - Cloudera (USA)
> > >  * Jeremy Elster <jeremy.elster@b2wdigital.com> - B2W Digital (São
> Paulo
> > > - BR)
> > >
> > >
> > > == Sponsors ==
> > >
> > > === Champion ===
> > >
> > >  * Luciano Resende - (lresende)
> > >
> > > === Nominated Mentors ===
> > >
> > >  * Luciano Resende - (lresende)
> > >  * Jim Jagielski - (jim)
> > >  * William Colen - (colen)
> > >
> > > === Sponsoring Entity ===
> > > We propose that the Apache Incubator sponsor this project.
> > >
> > >
> > > --
> > > Luciano Resende
> > > http://twitter.com/lresende1975
> > > http://lresende.blogspot.com/
> > >
> >
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
> >
>
>
> --
> Matt Sicker <boards@gmail.com>
>
