incubator-general mailing list archives

From Dmitriy Ryaboy <dvrya...@gmail.com>
Subject Re: [PROPOSAL] Ivory - Hadoop data management and processing platform
Date Sat, 16 Mar 2013 00:15:40 GMT
It would be awfully nice of you not to stomp on another hadoop ecosystem
project's google-fu when your project becomes very successful and admired
across the hadoopverse :)

Ivory isn't a fly-by-night project someone threw up on github -- it's
generated over a dozen peer-reviewed papers, and has many watchers and dev
forks.

I don't have a vote here, but I'd say that yes, this will lead to confusion
when people look for hadoop ivory.

D


On Fri, Mar 15, 2013 at 11:09 AM, Seetharam Venkatesh <
venkatesh@innerzeal.com> wrote:

> Hi Henry,
>
> Is there a concern with the current name? The closest is a tool for
> Information Retrieval. Not sure if there is an overlap.  We will also bring
> this up with the champion and mentors to see if this needs to be vetted
> with the trademarks folks as well.
>
> Your suggestions are welcome.
>
> Thanks!
>
>
> On Fri, Mar 15, 2013 at 10:18 AM, Henry Saputra <henry.saputra@gmail.com>
> wrote:
>
> > Hi Srikanth,
> >
> > So does the Ivory name stay, or will the podling try to find another name
> > once it nears graduation?
> >
> > - Henry
> >
> >
> > On Fri, Mar 15, 2013 at 12:34 AM, Srikanth Sundarrajan <
> > srikanth.sundarrajan@inmobi.com> wrote:
> >
> > > Made a few edits to the proposal (
> > > http://wiki.apache.org/incubator/IvoryProposal) as per the feedback
> > > received so far.
> > >
> > > Regards
> > > Srikanth Sundarrajan
> > >
> > > = Ivory Proposal =
> > >
> > > == Abstract ==
> > > Ivory is a data processing and management solution for Hadoop designed
> > > for data motion, coordination of data pipelines, lifecycle management,
> > > and data discovery. Ivory enables end consumers to quickly onboard
> > > their data and its associated processing and management tasks on
> > > Hadoop clusters.
> > >
> > > == Proposal ==
> > > Ivory will enable easy data management via a declarative mechanism for
> > > Hadoop. Users of the Ivory platform simply define infrastructure
> > > endpoints, data sets and processing rules declaratively. These
> > > declarative configurations are expressed in such a way that the
> > > dependencies between these configured entities are explicitly
> > > described. This information about inter-dependencies between various
> > > entities allows Ivory to orchestrate and manage various data
> > > management functions.
> > >
> > > The key use cases that Ivory addresses are:
> > >  * Data Motion
> > >  * Process orchestration and scheduling
> > >  * Policy-based Lifecycle Management
> > >  * Data Discovery
> > >  * Operability/Usability
> > >
> > > With these features it is possible for users to onboard their data
> > > sets with a comprehensive and holistic understanding of how, when and
> > > where their data is managed across its lifecycle. Complex functions
> > > such as retrying failures, identifying possible SLA breaches or
> > > automated handling of input data changes are now simple directives.
> > > All the administrative functions and user level functions are
> > > available via RESTful APIs. The CLI is simply a wrapper over the RESTful
> > > APIs.
> > >
> > > == Background ==
> > > Hadoop and its ecosystem of products have made storing and processing
> > > massive amounts of data commonplace. This has enabled numerous
> > > organizations to gain valuable insights that they never could have
> > > achieved in the past. While it is easy to leverage Hadoop for
> > > crunching large volumes of data, organizing data, managing the life cycle
> > > of data and processing data are fairly involved. These problems are solved
> > > adequately in a classic data platform involving data warehouses
> > > and standard ETL (extract-transform-load) tools, but remain largely
> > > unsolved on Hadoop today. In addition to data processing complexities, Hadoop
> > > presents new sets of challenges and opportunities relating to
> > > management of data.
> > >
> > > Data Management on Hadoop encompasses data motion, process
> > > orchestration, lifecycle management and data discovery, among other
> > > concerns that are beyond ETL. Ivory is a new data processing and
> > > management platform for Hadoop that solves this problem and creates
> > > additional opportunities by building on existing components within the
> > > Hadoop ecosystem (e.g. Apache Oozie, Apache Hadoop DistCp) without
> > > reinventing the wheel. Ivory has been in production at InMobi, going
> > > on its second year and has been managing hundreds of feeds and
> > > processes.
> > >
> > > Ivory is being developed by engineers employed with InMobi and
> > > Hortonworks. This platform addition will increase the adoption of
> > > Apache Hadoop by making data management tractable for end users. We
> > > are therefore proposing to make Ivory an Apache open source project.
> > >
> > > == Rationale ==
> > > The Ivory project aims to improve the usability of Apache Hadoop. As a
> > > result Apache Hadoop will grow its community of users by increasing
> > > the places Hadoop can be utilized and the use cases it will solve. By
> > > developing Ivory in Apache we hope to gather a diverse community of
> > > contributors, helping to ensure that Ivory is deployable for a broad
> > > range of scenarios. Members of the Hadoop development community will
> > > be able to influence Ivory’s roadmap, and contribute to it. We believe
> > > having Ivory as part of the Apache Hadoop ecosystem will be a great
> > > benefit to all of Hadoop's users.
> > >
> > > == Current Status ==
> > > Ivory is widely deployed in production within InMobi and is entering
> > > its second year. A version with a valuable set of features has been
> > > developed by the initial committers and is hosted on GitHub.
> > >
> > > === Meritocracy ===
> > > Our intent with this incubator proposal is to start building a diverse
> > > developer community around Ivory following the Apache meritocracy
> > > model. We have wanted to make the project open source and encourage
> > > contributors from multiple organizations from the start. We plan to
> > > provide plenty of support to new developers and to quickly recruit
> > > those who make solid contributions to committer status.
> > >
> > > === Community ===
> > > We are happy to report that the initial team already represents
> > > multiple organizations. We hope to extend the user and developer base
> > > further in the future and build a solid open source community around
> > > Ivory.
> > >
> > > === Core Developers ===
> > > Ivory is currently being developed by three engineers from InMobi
> > > (Srikanth Sundarrajan, Shwetha G S, and Shaik Idris) and two Hortonworks
> > > employees (Sanjay Radia and Venkatesh Seetharam). In addition, Rohini
> > > Palaniswamy and Thiruvel Thirumoolan were also involved in the
> > > initial design discussions. Srikanth, Shwetha and Shaik are the
> > > original developers. All the engineers have built two generations of
> > > Data Management on Hadoop, have deep expertise in Hadoop and are
> > > quite familiar with the Hadoop ecosystem. Samarth Gupta and Rishu
> > > Mehrothra, both from InMobi, have built the QA automation for Ivory.
> > >
> > > === Alignment ===
> > > The ASF is a natural host for Ivory given that it is already the home
> > > of Hadoop, Pig, Knox, HCatalog, and other emerging “big data” software
> > > projects. Ivory has been designed to solve the data management
> > > challenges and opportunities of the Hadoop ecosystem family of
> > > products. Ivory fills a gap in the Hadoop ecosystem in the areas of
> > > data processing and data lifecycle management.
> > >
> > > == Known Risks ==
> > >
> > > === Orphaned Products ===
> > > The core developers plan to work full time on the project. There is
> > > very little risk of Ivory getting orphaned. Ivory is in use by
> > > companies we work for so the companies have an interest in its
> > > continued vitality.
> > >
> > > === Inexperience with Open Source ===
> > > All of the core developers are active users and followers of open
> > > source. Srikanth Sundarrajan has been contributing patches to Apache
> > > Hadoop and Apache Oozie, Shwetha GS has been contributing patches to
> > > Apache Oozie.  Seetharam Venkatesh is a committer on Apache Knox.
> > > Sharad Agarwal, Amareshwari SR (also an Apache Hive PMC member) and
> > > Sanjay Radia are PMC members on Apache Hadoop.
> > >
> > > === Homogeneous Developers ===
> > > The current core developers are from a diverse set of organizations such
> > > as InMobi and Hortonworks. We expect to quickly establish a developer
> > > community that includes contributors from several corporations post
> > > incubation.
> > >
> > > === Reliance on Salaried Developers ===
> > > Currently, most developers are paid to work on Ivory, but a few are
> > > contributing in their spare time. However, once the project has a
> > > community built around it post incubation, we expect to get committers
> > > and developers from outside the current core developers.
> > >
> > > === Relationships with Other Apache Products ===
> > > Ivory is going to be used by the users of Hadoop and the Hadoop
> > > ecosystem in general.
> > >
> > > === An Excessive Fascination with the Apache Brand ===
> > > While we respect the reputation of the Apache brand and have no doubts
> > > that it will attract contributors and users, our interest is primarily
> > > to give Ivory a solid home as an open source project following an
> > > established development model. We have also given reasons in the
> > > Rationale and Alignment sections.
> > >
> > > == Documentation ==
> > > http://wiki.apache.org/incubator/IvoryProposal
> > >
> > > == Initial Source ==
> > > The source is currently in a GitHub repository at:
> > > https://github.com/sriksun/Ivory
> > >
> > > == Source and Intellectual Property Submission Plan ==
> > > The complete Ivory code is under the Apache License, Version 2.0.
> > >
> > > == External Dependencies ==
> > > The dependencies all have Apache-compatible licenses. These include
> > > BSD- and MIT-licensed dependencies.
> > >
> > > == Cryptography ==
> > > None
> > >
> > > == Required Resources ==
> > >
> > > === Mailing lists ===
> > >
> > >  * ivory-dev AT incubator DOT apache DOT org
> > >  * ivory-commits AT incubator DOT apache DOT org
> > >  * ivory-user AT incubator DOT apache DOT org
> > >  * ivory-private AT incubator DOT apache DOT org
> > >
> > > === Subversion Directory ===
> > > Git is the preferred source control system: git://git.apache.org/ivory
> > >
> > > === Issue Tracking ===
> > > JIRA IVORY
> > >
> > > == Initial Committers ==
> > >  * Srikanth Sundarrajan (Srikanth.Sundarrajan AT inmobi DOT com)
> > >  * Shwetha GS (shwetha.gs AT inmobi DOT com)
> > >  * Shaik Idris (shaik.idris AT inmobi DOT com)
> > >  * Venkatesh Seetharam (Venkatesh AT apache DOT org)
> > >  * Sanjay Radia (sanjay AT apache DOT org)
> > >  * Sharad Agarwal (sharad AT apache DOT org)
> > >  * Amareshwari SR (amareshwari AT apache DOT org)
> > >  * Samarth Gupta (samarth.gupta AT inmobi DOT com)
> > >  * Rishu Mehrothra (rishu.mehrothra AT inmobi DOT com)
> > >
> > > == Affiliations ==
> > >  * Srikanth Sundarrajan (InMobi)
> > >  * Shwetha GS (InMobi)
> > >  * Shaik Idris (InMobi)
> > >  * Venkatesh Seetharam (Hortonworks Inc.)
> > >  * Sanjay Radia (Hortonworks Inc.)
> > >  * Sharad Agarwal (InMobi)
> > >  * Amareshwari SR (InMobi)
> > >  * Samarth Gupta (InMobi)
> > >  * Rishu Mehrothra (InMobi)
> > >
> > > == Sponsors ==
> > >
> > > === Champion ===
> > >  * Arun C Murthy (acmurthy at apache dot org)
> > >
> > > === Nominated Mentors ===
> > >  * Alan Gates (gates AT apache DOT org)
> > >  * Chris Douglas (cdouglas AT apache DOT org)
> > >  * Devaraj Das (ddas AT apache DOT org)
> > >  * Owen O’Malley (omalley AT apache DOT org)
> > >
> > > === Sponsoring Entity ===
> > > Incubator PMC
> > >
> > >
> >
>
>
>
> --
> Regards,
> Venkatesh
>
> http://in.linkedin.com/in/seetharamvenkatesh
> http://about.me/SeetharamVenkatesh
>
> “Perfection (in design) is achieved not when there is nothing more to add,
> but rather when there is nothing more to take away.”
> - Antoine de Saint-Exupéry
>
