incubator-general mailing list archives

From Henry Saputra <henry.sapu...@gmail.com>
Subject Re: [PROPOSAL] Ivory - Hadoop data management and processing platform
Date Sat, 16 Mar 2013 01:00:04 GMT
+1 =)




On Fri, Mar 15, 2013 at 5:22 PM, Joe Schaefer <joe_schaefer@yahoo.com> wrote:

> Can we pretty-please do this *before* resources
> are requested, just to save us poor infra saps
> the trouble of renaming everything?
>
>
>
>
>
> >________________________________
> > From: Jakob Homan <jghoman@gmail.com>
> >To: general@incubator.apache.org
> >Sent: Friday, March 15, 2013 8:18 PM
> >Subject: Re: [PROPOSAL] Ivory - Hadoop data management and processing platform
> >
> >As part of Incubation a suitable name search will be done to verify the
> >name's appropriate.  I imagine Ivory would fail this test based on the
> >prior project, so this Ivory would need to find a new name.  Alternatively,
> >before the vote, the Ivory folks can find another name.  This has happened
> >before (Howl -> HCatalog), so it's not a huge reason to be concerned.
> >
> >
> >On Fri, Mar 15, 2013 at 5:15 PM, Dmitriy Ryaboy <dvryaboy@gmail.com> wrote:
> >
> >> It would be awfully nice of you not to stomp on another hadoop ecosystem
> >> project's google-fu when your project becomes very successful and admired
> >> across the hadoopverse :)
> >>
> >> Ivory isn't a fly-by-night project someone threw up on github -- it's
> >> generated over a dozen peer-reviewed papers, and has many watchers and dev
> >> forks.
> >>
> >> I don't have a vote here, but I'd say that yes, this will lead to confusion
> >> when people look for hadoop ivory.
> >>
> >> D
> >>
> >>
> >> On Fri, Mar 15, 2013 at 11:09 AM, Seetharam Venkatesh <
> >> venkatesh@innerzeal.com> wrote:
> >>
> >> > Hi Henry,
> >> >
> >> > Is there a concern with the current name? The closest is a tool for
> >> > Information Retrieval. Not sure if there is an overlap.  We will also
> >> > bring this up with the champion and mentors to see if this needs to be
> >> > vetted with the trademarks folks as well.
> >> >
> >> > Your suggestions are welcome.
> >> >
> >> > Thanks!
> >> >
> >> >
> >> > On Fri, Mar 15, 2013 at 10:18 AM, Henry Saputra <
> >> > henry.saputra@gmail.com> wrote:
> >> >
> >> > > Hi Srikanth,
> >> > >
> >> > > So does the Ivory name stay, or will the podling try to find another
> >> > > name once it nears graduation?
> >> > >
> >> > > - Henry
> >> > >
> >> > >
> >> > > On Fri, Mar 15, 2013 at 12:34 AM, Srikanth Sundarrajan <
> >> > > srikanth.sundarrajan@inmobi.com> wrote:
> >> > >
> >> > > > Made a few edits to the proposal (
> >> > > > http://wiki.apache.org/incubator/IvoryProposal) as per the feedback
> >> > > > received so far.
> >> > > >
> >> > > > Regards
> >> > > > Srikanth Sundarrajan
> >> > > >
> >> > > > = Ivory Proposal =
> >> > > >
> >> > > > == Abstract ==
> >> > > > Ivory is a data processing and management solution for Hadoop
> >> > > > designed for data motion, coordination of data pipelines, lifecycle
> >> > > > management, and data discovery. Ivory enables end consumers to
> >> > > > quickly onboard their data and its associated processing and
> >> > > > management tasks on Hadoop clusters.
> >> > > >
> >> > > > == Proposal ==
> >> > > > Ivory will enable easy data management via a declarative mechanism
> >> > > > for Hadoop. Users of the Ivory platform simply define infrastructure
> >> > > > endpoints, data sets and processing rules declaratively. These
> >> > > > declarative configurations are expressed in such a way that the
> >> > > > dependencies between these configured entities are explicitly
> >> > > > described. This information about inter-dependencies between
> >> > > > various entities allows Ivory to orchestrate and manage various
> >> > > > data management functions.
> >> > > >
> >> > > > The key use cases that Ivory addresses are:
> >> > > >  * Data Motion
> >> > > >  * Process orchestration and scheduling
> >> > > >  * Policy-based Lifecycle Management
> >> > > >  * Data Discovery
> >> > > >  * Operability/Usability
> >> > > >
> >> > > > With these features it is possible for users to onboard their data
> >> > > > sets with a comprehensive and holistic understanding of how, when
> >> > > > and where their data is managed across its lifecycle. Complex
> >> > > > functions such as retrying failures, identifying possible SLA
> >> > > > breaches or automated handling of input data changes are now simple
> >> > > > directives. All the administrative functions and user level
> >> > > > functions are available via RESTful APIs. The CLI is simply a
> >> > > > wrapper over the RESTful APIs.
> >> > > >
> >> > > > == Background ==
> >> > > > Hadoop and its ecosystem of products have made storing and
> >> > > > processing massive amounts of data commonplace. This has enabled
> >> > > > numerous organizations to gain valuable insights that they never
> >> > > > could have achieved in the past. While it is easy to leverage
> >> > > > Hadoop for crunching large volumes of data, organizing data,
> >> > > > managing its life cycle and processing it are fairly involved.
> >> > > > This is solved adequately well in a classic data platform
> >> > > > involving data warehouses and standard ETL (extract-transform-load)
> >> > > > tools, but remains largely unsolved on Hadoop today. In addition to
> >> > > > data processing complexities, Hadoop presents new sets of
> >> > > > challenges and opportunities relating to management of data.
> >> > > >
> >> > > > Data Management on Hadoop encompasses data motion, process
> >> > > > orchestration, lifecycle management and data discovery, among other
> >> > > > concerns that are beyond ETL. Ivory is a new data processing and
> >> > > > management platform for Hadoop that solves this problem and creates
> >> > > > additional opportunities by building on existing components within
> >> > > > the Hadoop ecosystem (e.g. Apache Oozie, Apache Hadoop DistCp)
> >> > > > without reinventing the wheel. Ivory has been in production at
> >> > > > InMobi, going on its second year, and has been managing hundreds of
> >> > > > feeds and processes.
> >> > > >
> >> > > > Ivory is being developed by engineers employed by InMobi and
> >> > > > Hortonworks. This platform addition will increase the adoption of
> >> > > > Apache Hadoop by making data management tractable for end users.
> >> > > > We are therefore proposing to make Ivory an Apache open source
> >> > > > project.
> >> > > >
> >> > > > == Rationale ==
> >> > > > The Ivory project aims to improve the usability of Apache Hadoop.
> >> > > > As a result Apache Hadoop will grow its community of users by
> >> > > > increasing the places Hadoop can be utilized and the use cases it
> >> > > > will solve. By developing Ivory in Apache we hope to gather a
> >> > > > diverse community of contributors, helping to ensure that Ivory is
> >> > > > deployable for a broad range of scenarios. Members of the Hadoop
> >> > > > development community will be able to influence Ivory’s roadmap,
> >> > > > and contribute to it. We believe having Ivory as part of the Apache
> >> > > > Hadoop ecosystem will be a great benefit to all of Hadoop's users.
> >> > > >
> >> > > > == Current Status ==
> >> > > > Ivory is widely deployed in production within InMobi and moving on
> >> > > > to its second year. A version with a valuable set of features has
> >> > > > been developed by the initial committers and is hosted on GitHub.
> >> > > >
> >> > > > === Meritocracy ===
> >> > > > Our intent with this incubator proposal is to start building a
> >> > > > diverse developer community around Ivory following the Apache
> >> > > > meritocracy model. We have wanted to make the project open source
> >> > > > and encourage contributors from multiple organizations from the
> >> > > > start. We plan to provide plenty of support to new developers and
> >> > > > to quickly recruit those who make solid contributions to committer
> >> > > > status.
> >> > > >
> >> > > > === Community ===
> >> > > > We are happy to report that the initial team already represents
> >> > > > multiple organizations. We hope to extend the user and developer
> >> > > > base further in the future and build a solid open source community
> >> > > > around Ivory.
> >> > > >
> >> > > > === Core Developers ===
> >> > > > Ivory is currently being developed by three engineers from InMobi
> >> > > > (Srikanth Sundarrajan, Shwetha G S, and Shaik Idris) and two
> >> > > > Hortonworks employees (Sanjay Radia and Venkatesh Seetharam).
> >> > > > In addition, Rohini Palaniswamy and Thiruvel Thirumoolan were also
> >> > > > involved in the initial design discussions. Srikanth, Shwetha and
> >> > > > Shaik are the original developers. All the engineers have built two
> >> > > > generations of Data Management on Hadoop, have deep expertise in
> >> > > > Hadoop and are quite familiar with the Hadoop ecosystem. Samarth
> >> > > > Gupta and Rishu Mehrothra, both from InMobi, have built the QA
> >> > > > automation for Ivory.
> >> > > >
> >> > > > === Alignment ===
> >> > > > The ASF is a natural host for Ivory given that it is already the
> >> > > > home of Hadoop, Pig, Knox, HCatalog, and other emerging “big data”
> >> > > > software projects. Ivory has been designed to solve the data
> >> > > > management challenges and opportunities of the Hadoop ecosystem
> >> > > > family of products. Ivory fills a gap in the Hadoop ecosystem in
> >> > > > the areas of data processing and data lifecycle management.
> >> > > >
> >> > > > == Known Risks ==
> >> > > >
> >> > > > === Orphaned products & Reliance on Salaried Developers ===
> >> > > > The core developers plan to work full time on the project. There is
> >> > > > very little risk of Ivory getting orphaned. Ivory is in use by the
> >> > > > companies we work for, so the companies have an interest in its
> >> > > > continued vitality.
> >> > > >
> >> > > > === Inexperience with Open Source ===
> >> > > > All of the core developers are active users and followers of open
> >> > > > source. Srikanth Sundarrajan has been contributing patches to
> >> > > > Apache Hadoop and Apache Oozie, and Shwetha GS has been
> >> > > > contributing patches to Apache Oozie.  Seetharam Venkatesh is a
> >> > > > committer on Apache Knox. Sharad Agarwal, Amareshwari SR (also an
> >> > > > Apache Hive PMC member) and Sanjay Radia are PMC members on Apache
> >> > > > Hadoop.
> >> > > >
> >> > > > === Homogeneous Developers ===
> >> > > > The current core developers are from a diverse set of
> >> > > > organizations such as InMobi and Hortonworks. We expect to quickly
> >> > > > establish a developer community that includes contributors from
> >> > > > several corporations post incubation.
> >> > > >
> >> > > > === Reliance on Salaried Developers ===
> >> > > > Currently, most developers are paid to work on Ivory, but a few are
> >> > > > contributing in their spare time. However, once the project has a
> >> > > > community built around it post incubation, we expect to get
> >> > > > committers and developers from outside the current core developers.
> >> > > >
> >> > > > === Relationships with Other Apache Products ===
> >> > > > Ivory is going to be used by the users of Hadoop and the Hadoop
> >> > > > ecosystem in general.
> >> > > >
> >> > > > === An Excessive Fascination with the Apache Brand ===
> >> > > > While we respect the reputation of the Apache brand and have no
> >> > > > doubts that it will attract contributors and users, our interest is
> >> > > > primarily to give Ivory a solid home as an open source project
> >> > > > following an established development model. We have also given
> >> > > > reasons in the Rationale and Alignment sections.
> >> > > >
> >> > > > == Documentation ==
> >> > > > http://wiki.apache.org/incubator/IvoryProposal
> >> > > >
> >> > > > == Initial Source ==
> >> > > > The source is currently in a GitHub repository at:
> >> > > > https://github.com/sriksun/Ivory
> >> > > >
> >> > > > == Source and Intellectual Property Submission Plan ==
> >> > > > The complete Ivory code is under the Apache License, Version 2.0.
> >> > > >
> >> > > > == External Dependencies ==
> >> > > > The dependencies all have Apache compatible licenses. These include
> >> > > > BSD- and MIT-licensed dependencies.
> >> > > >
> >> > > > == Cryptography ==
> >> > > > None
> >> > > >
> >> > > > == Required Resources ==
> >> > > >
> >> > > > === Mailing lists ===
> >> > > >
> >> > > >  * ivory-dev AT incubator DOT apache DOT org
> >> > > >  * ivory-commits AT incubator DOT apache DOT org
> >> > > >  * ivory-user AT incubator DOT apache DOT org
> >> > > >  * ivory-private AT incubator DOT apache DOT org
> >> > > >
> >> > > > === Subversion Directory ===
> >> > > > Git is the preferred source control system:
> >> > > > git://git.apache.org/ivory
> >> > > >
> >> > > > === Issue Tracking ===
> >> > > > JIRA IVORY
> >> > > >
> >> > > > == Initial Committers ==
> >> > > >  * Srikanth Sundarrajan (Srikanth.Sundarrajan AT inmobi DOT com)
> >> > > >  * Shwetha GS (shwetha.gs AT inmobi DOT com)
> >> > > >  * Shaik Idris (shaik.idris AT inmobi DOT com)
> >> > > >  * Venkatesh Seetharam (Venkatesh AT apache DOT org)
> >> > > >  * Sanjay Radia (sanjay AT apache DOT org)
> >> > > >  * Sharad Agarwal (sharad AT apache DOT org)
> >> > > >  * Amareshwari SR (amareshwari AT apache DOT org)
> >> > > >  * Samarth Gupta (samarth.gupta AT inmobi DOT com)
> >> > > >  * Rishu Mehrothra (rishu.mehrothra AT inmobi DOT com)
> >> > > >
> >> > > > == Affiliations ==
> >> > > >  * Srikanth Sundarrajan (InMobi)
> >> > > >  * Shwetha GS (InMobi)
> >> > > >  * Shaik Idris (InMobi)
> >> > > >  * Venkatesh Seetharam (Hortonworks Inc.)
> >> > > >  * Sanjay Radia (Hortonworks Inc.)
> >> > > >  * Sharad Agarwal (InMobi)
> >> > > >  * Amareshwari SR (InMobi)
> >> > > >  * Samarth Gupta (InMobi)
> >> > > >  * Rishu Mehrothra (InMobi)
> >> > > >
> >> > > > == Sponsors ==
> >> > > >
> >> > > > === Champion ===
> >> > > >  * Arun C Murthy (acmurthy AT apache DOT org)
> >> > > >
> >> > > > === Nominated Mentors ===
> >> > > >  * Alan Gates (gates AT apache DOT org)
> >> > > >  * Chris Douglas (cdouglas AT apache DOT org)
> >> > > >  * Devaraj Das (ddas AT apache DOT org)
> >> > > >  * Owen O’Malley (omalley AT apache DOT org)
> >> > > >
> >> > > > === Sponsoring Entity ===
> >> > > > Incubator PMC
> >> > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Regards,
> >> > Venkatesh
> >> >
> >> > http://in.linkedin.com/in/seetharamvenkatesh
> >> > http://about.me/SeetharamVenkatesh
> >> >
> >> > “Perfection (in design) is achieved not when there is nothing more to
> >> > add, but rather when there is nothing more to take away.”
> >> > - Antoine de Saint-Exupéry
> >> >
> >>
> >
> >
> >
>
