incubator-general mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: [DISCUSS] [PROPOSAL] Myriad for Apache Incubator
Date Mon, 16 Feb 2015 02:58:35 GMT
In case there is any doubt, +1 from me!



On Fri, Feb 13, 2015 at 5:15 PM, Luciano Resende <luckbr1975@gmail.com>
wrote:

> On Fri, Feb 13, 2015 at 5:06 PM, Adam Bordelon <adam@mesosphere.io> wrote:
>
> > Hello friends,
> >
> > The Myriad team and I would like to propose the Myriad project for
> > inclusion in the Apache Incubator.
> > Full text of the proposal is below. I can add it to the incubator wiki as
> > well, if desired.
> > Please review and discuss. If there are no major concerns, I will call for
> > a Vote after a week.
> >
> > Cheers,
> > -Adam-
> > me@apache
> >
> > ==========================================================
> > Apache Myriad Proposal
> >
> > * Abstract
> > Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together
> > on the same cluster and allows dynamic resource allocations across both
> > Hadoop and other applications running on the same physical data center
> > infrastructure.
> >
> > * Proposal
> > The vision of Myriad is to provide a comprehensive framework to ensure
> > Apache Hadoop YARN and Apache Mesos can interoperate with minimal changes
> > on either side and prevent the static fragmentation of data center
> > resources.
> >
> > * Background
> > Project Myriad is the first resource management framework that allows big
> > data developers to run YARN-based Hadoop jobs alongside other applications
> > and services in production. eBay Inc., MapR, and Mesosphere jointly built
> > Myriad (available on GitHub at https://github.com/mesos/myriad) with the
> > vision of freeing big data jobs from siloed clusters and consolidating
> > infrastructure into a single pool of resources for greater utilization and
> > operational efficiency. Several companies, including Twitter, have expressed
> > interest in Myriad and have begun testing it.
> >
> > * Rationale
> > Many Hadoop users are building larger clusters (data lake/data hub
> > architectures) that support multiple workloads - made possible by the
> > advent of Apache Hadoop YARN. As the clusters grow in size and importance,
> > they become an important application within the broader data center. At the
> > same time, Apache Mesos enables efficient resource isolation and sharing
> > across distributed applications for the broader data center, for instance
> > MPI, Spark, long-running web services, build/test infrastructure,
> > traditional Linux applications/scripts, and others (including arbitrary
> > Docker images).
> >
> > Myriad aims to enable co-existence of Apache Hadoop YARN and Apache Mesos
> > on the same physical data center resources, reducing fragmentation of data
> > center resources.
> >
> > * Project Goals
> > ** Initial Goals
> > - Run Myriad alongside Apache Hadoop YARN and Apache Mesos to allow
> > policy-based allocation of data center resources across Apache Hadoop and
> > other distributed applications.
> > - Ensure YARN-based execution frameworks work without any changes when
> > running alongside Myriad. YARN applications will continue to interact with
> > and run on top of YARN and can choose to be unaware of Myriad.
> > - Ensure Mesos-based execution frameworks work without any changes when
> > running alongside Myriad. Mesos applications will continue to interact with
> > and run on Mesos and can choose to be unaware of Myriad.
> > - Provide isolation for multi-tenancy.
> >   - Use Linux cgroups (and optionally Docker-like technologies to ease
> > packaging, deployment and broader isolation) so that multiple YARN clusters
> > can run in their own space and are isolated from each other. YARN’s RM and
> > NMs are dockerized.
> > - Myriad should be able to manage the full YARN lifecycle:
> >   - Bring up YARN (RM, NM)
> >   - Scale Up/Down YARN
> >   - Release resources and shut down YARN
> >
> > ** Longer Term Goals
> > - Allow fine-grained dynamic allocation of resources to Hadoop including
> > the ability to scale up and scale down the cluster.
> >   - Provide different policies to allow downsizing running applications on
> > Hadoop when resources are taken away from it.
> >   - Provide a framework so the downsizing policy is pluggable and users can
> > write their own implementations.
> > - Allow multiple versions of Apache Hadoop to run on the same physical
> > infrastructure
> > - Allow workload portability - ability to migrate YARN workloads across
> > various cloud infrastructures seamlessly (e.g. GCE, AWS, etc)
> > - Security:
> >   - Authentication Requirements:
> >     - Support basic CRAM-MD5 password authentication between Myriad and
> > Mesos. Additional authentication mechanisms may be supported in the future.
> >     - Traditional user authentication with Hadoop’s HTTP web-consoles
> > should work as usual.
> >   - Authorization:
> >     - Only authorized users are allowed to launch YARN clusters. Mesos
> > allows specifying which framework principal is allowed to register with a
> > particular role.
> >   - Encryption on wire:
> >     - All control traffic to/from Myriad/Mesos
> > - Logs
> >   - Audits (where to store them)
> >     - Log all major activities/events with audit trail - who, what, when,
> > result
> >     - Launching YARN/RM
> >     - Launching NMs
> >     - Downsizing NMs
> >     - Terminating YARN/RM
> >   - What to do with old logs?
> >   - Debuggability/Visibility
> >     - Hooks to identify different YARN cluster lifecycles (yarn-id?)
> > - GUI: Capability to scale-up and scale-down by selecting nodes and
> > providing a scale-up/scale-down factor.
> >
> > * Architectural Overview
> > The following diagram illustrates the high-level architecture. YARN (with
> > Myriad) is registered as a framework with the Mesos master, possibly along
> > with other Mesos frameworks. This enables YARN to share cluster resources
> > with other Mesos frameworks, providing elasticity of resources between Hadoop
> > workloads and Mesos frameworks.
> >
> > See https://github.com/mesos/myriad/blob/phase1/docs/images/high-level-architecture.png
> >
> > * Current Status
> > Myriad is under active development. Key components of Myriad are:
> > ** Myriad Resource Manager (RM) Plugin
> > - Plugs into the Resource Manager Java process via yarn-site.xml
> > configuration.
> > - Registers Myriad as a framework with Mesos. Receives resource offers from
> > Mesos.
> > - Monitors YARN’s application pipeline and scheduling events to drive
> > scale-up or scale-down decisions for Hadoop.
> > - Exposes REST APIs to help admins control Hadoop/YARN’s resource
> > consumption. Currently the following APIs are supported:
> >   - Scale Up (e.g. “launch 4 Node Manager instances with 10G/6CPU
> > capacity”)
> >   - Scale Down (e.g. “kill 2 Node Manager instances with 10G/6CPU
> > capacity”)
> >
> > ** Myriad Mesos Executor
> > - Launched on a Mesos slave node by Myriad RM plugin via Mesos.
> > - Responsible for launching Node Manager process with appropriate
> > capacities configured in yarn-site.xml.
> > - Mounts YARN’s cgroup hierarchy under Mesos’ cgroup hierarchy in case
> > YARN’s cgroups are enabled.
> >
> > Currently, a working prototype/demo has been built for the goals listed
> > under the “Initial Goals” section. Open issues and enhancements are tracked
> > at https://github.com/mesos/myriad/issues. Myriad is not yet tested for
> > production use.
> >
> > ** Meritocracy
> > We plan to invest in supporting a meritocracy. We will discuss the
> > requirements in a public forum. Several companies have already expressed
> > interest in this project, and we intend to invite developers to contribute
> > and gain karma. We will encourage and monitor community participation so
> > that privileges can be extended to those that contribute.
> >
> > ** Community
> > We are happy to report that there are existing Apache committers and
> > corporate users who are closely involved in the project already. We hope to
> > extend the user and developer base further in the future and build a solid
> > open source community around Myriad, growing the community and adding
> > committers following the Apache Way.
> >
> > ** Core Developers
> > The initial technology was built independently by eBay and MapR. eBay built
> > the technology in consultation with Ben Hindman. MapR built a working
> > prototype in close consultation with, and under the mentorship of,
> > Mesosphere.
> >
> > ** Alignment
> > The initial committers strongly believe that Apache Hadoop YARN and Apache
> > Mesos will gain broad adoption, and that a framework enabling their
> > co-existence, transparently to applications written for YARN and Mesos,
> > will therefore serve the needs of the broader community.
> >
> > * Known Risks
> >
> > ** Inexperience with Open Source
> > Initial Myriad committers have varying levels of experience using and
> > contributing to Open Source projects; however, by working with our mentors
> > and the Apache community, we believe we will be able to conduct ourselves in
> > accordance with Apache Incubator guidelines. The close relationship between
> > the Myriad team and Apache Mesos and Apache Hadoop means there is an
> > awareness of the incubation process and a willingness to embrace The Apache
> > Way.
> >
> > ** Homogeneous Developers
> > There is already diversity in the core developer community as they are
> > employed by three different and independent companies viz. eBay Inc., MapR,
> > and Mesosphere. However, there will continue to be an emphasis on
> > increasing the diversity of the developer community.
> >
> > ** Reliance on Salaried Developers
> > Currently, the core developers are paid to work on Myriad. However, once
> > the project has a community built around it, we expect to get committers,
> > contributors and community from outside the current participating
> > organizations.
> >
> > ** Relationships with Other Apache Products
> > Myriad implements interfaces from both Apache YARN and Apache Mesos, and
> > requires both to be present so that Myriad can coordinate dynamic resource
> > sharing between the two.
> >
> > ** An Excessive Fascination with the Apache Brand
> > While we respect the reputation of the Apache brand and have no doubts that
> > it will attract contributors and users, our interest is primarily to give
> > Myriad a solid home as an open source project following an established
> > development model. We have also given reasons in the Rationale and
> > Alignment sections.
> >
> > * Documentation
> > Documentation is included in a docs directory of the repository (see
> > https://github.com/mesos/myriad/tree/phase1/docs), and currently covers
> > how Myriad works, developing the project, auto-scaling a YARN cluster, the
> > Myriad REST API, and more. We will improve docs at every revision drop.
> >
> > * Initial Source
> > The Myriad codebase has been posted on GitHub for review and licensed under
> > an Apache v2 license.
> > https://github.com/mesos/myriad
> >
> > * Source and IP Submission Plan
> > During incubation, the codebase will be available at
> > https://github.com/apache/incubator-myriad/ and contributors will submit
> > appropriate contributor license agreements.
> >
> > * External Dependencies
> > All Myriad dependencies have Apache compatible licenses.
> >
> > * Cryptography
> > Myriad doesn’t use cryptography itself. Hadoop and Mesos projects, however,
> > use standard APIs and tools for SSH and SSL communication where necessary.
> >
> > * Required Resources
> > ** Mailing Lists
> > - myriad-private for private PMC conversations
> > - myriad-dev
> > - myriad-commits
> > - myriad-user
> >
> > ** Version Control
> > We prefer to use Git as our source control system:
> > git://git.apache.org/myriad
> >
> > ** Issue Tracking
> > JIRA Myriad (MYRIAD)
> >
> > * Initial Committers
> > - Santosh Marella (smarella at mapr dot com)
> > - Mohit Soni (mohitsoni1989 at gmail dot com)
> > - Adam Bordelon (me at apache dot org) *
> > - Meghdoot Bhattacharya (mbhattacharya at paypal dot com)
> > - Anoop Dawar (anoopdawar at gmail dot com)
> > - Jim Scott (jim at 13ways dot com)
> > - Ken Sipe (kensipe at gmail dot com)
> >
> > * Affiliations
> > - Santosh Marella, MapR
> > - Mohit Soni, eBay Inc.
> > - Adam Bordelon, Mesosphere
> > - Meghdoot Bhattacharya, eBay Inc.
> > - Anoop Dawar, MapR
> > - Jim Scott, MapR
> > - Ken Sipe, Mesosphere
> >
> > * Sponsors
> > ** Champion (Proposal)
> > - Ben Hindman (benh at apache dot org)
> >
> > ** Nominated Mentors
> > - Ben Hindman (benh at apache dot org) - Mesosphere
> > - Danese Cooper (danese at apache dot org) - eBay Inc.
> > - Ted Dunning (tdunning at apache dot org) - MapR
> >
> > ** Sponsoring Entity
> > Apache Incubator
> >
>
>
> Interesting, +1. If you guys need an extra mentor (or committer), please
> count me in.
>
> --
> Luciano Resende
> http://people.apache.org/~lresende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
