From: Vinod Kumar Vavilapalli
To: general@incubator.apache.org
Subject: Re: [PROPOSAL] Ivory - Hadoop data management and processing platform
Date: Wed, 13 Mar 2013 17:56:22 -0700

+1, this will be a great addition to the Hadoop ecosystem!

The proposal looks fine overall. I quickly searched around for the name
Ivory; it looks to be a safe one, but someone needs to do due diligence.

Also, I think you can choose git as the version control system if you
like.

Thanks,
+Vinod Kumar Vavilapalli

On Mar 13, 2013, at 10:00 AM, Srikanth Sundarrajan wrote:

> = Ivory Proposal =
>
> == Abstract ==
> Ivory is a data processing and management solution for Hadoop designed
> for data motion, coordination of data pipelines, lifecycle management,
> and data discovery. Ivory enables end consumers to quickly onboard
> their data and its associated processing and management tasks on
> Hadoop clusters.
>
> == Proposal ==
> Ivory will enable easy data management via a declarative mechanism for
> Hadoop.
> Users of the Ivory platform simply define infrastructure endpoints,
> data sets, and processing rules declaratively. These configurations
> are expressed in such a way that the dependencies between these
> entities are explicitly described. This information about
> inter-dependencies between entities allows Ivory to orchestrate and
> manage various data management functions.
>
> The key use cases that Ivory addresses are:
> * Data Motion
> * Process orchestration and scheduling
> * Policy-based Lifecycle Management
> * Data Discovery
> * Operability/Usability
>
> With these features, users can onboard their data sets with a
> comprehensive and holistic understanding of how, when, and where their
> data is managed across its lifecycle. Complex functions such as
> retrying failures, identifying possible SLA breaches, or automated
> handling of input data changes are now simple directives. All the
> administrative and user-level functions are available via RESTful
> APIs. The CLI is simply a wrapper over the RESTful APIs.
>
> == Background ==
> Hadoop and its ecosystem of products have made storing and processing
> massive amounts of data commonplace. This has enabled numerous
> organizations to gain valuable insights that they never could have
> achieved in the past. While it is easy to leverage Hadoop for
> crunching large volumes of data, organizing data, managing the
> lifecycle of data, and processing data are fairly involved. This is
> solved adequately well in a classic data platform involving data
> warehouses and standard ETL (extract-transform-load) tools, but
> remains largely unsolved on Hadoop today. In addition to data
> processing complexities, Hadoop presents new sets of challenges and
> opportunities relating to the management of data.
>
> Data management on Hadoop encompasses data motion, process
> orchestration, lifecycle management, data discovery, and other
> concerns that are beyond ETL. Ivory is a new data processing and
> management platform for Hadoop that solves this problem and creates
> additional opportunities by building on existing components within the
> Hadoop ecosystem (e.g., Apache Oozie and Apache Hadoop DistCp) without
> reinventing the wheel. Ivory has been in production at InMobi, is
> going on its second year, and manages hundreds of feeds and processes.
>
> Ivory is being developed by engineers employed by InMobi, Hortonworks,
> and Yahoo!. This platform addition will increase the adoption of
> Apache Hadoop by making data management tractable for end users. We
> are therefore proposing to make Ivory an Apache open source project.
>
> == Rationale ==
> The Ivory project aims to improve the usability of Apache Hadoop. As a
> result, Apache Hadoop will grow its community of users by increasing
> the places Hadoop can be utilized and the use cases it will solve. By
> developing Ivory in Apache we hope to gather a diverse community of
> contributors, helping to ensure that Ivory is deployable for a broad
> range of scenarios. Members of the Hadoop development community will
> be able to influence Ivory's roadmap and contribute to it. We believe
> having Ivory as part of the Apache Hadoop ecosystem will be a great
> benefit to all of Hadoop's users.
>
> == Current Status ==
> Ivory is widely deployed in production within InMobi and is moving
> into its second year. A version with a valuable set of features has
> been developed by the initial committers and is hosted on GitHub.
>
> === Meritocracy ===
> Our intent with this incubator proposal is to start building a diverse
> developer community around Ivory following the Apache meritocracy
> model. We have wanted to make the project open source and encourage
> contributors from multiple organizations from the start.
> We plan to provide plenty of support to new developers and to quickly
> promote those who make solid contributions to committer status.
>
> === Community ===
> We are happy to report that the initial team already represents
> multiple organizations. We hope to extend the user and developer base
> further in the future and build a solid open source community around
> Ivory.
>
> === Core Developers ===
> Ivory is currently being developed by three engineers from InMobi
> (Srikanth Sundarrajan, Shwetha G S, and Shaik Idris) and two
> Hortonworks employees (Sanjay Radia and Venkatesh Seetharam). In
> addition, two Yahoo! employees, Rohini Palaniswamy and Thiruvel
> Thirumoolan, are also involved. Srikanth, Shwetha, and Shaik are the
> original developers. All the engineers have built two generations of
> data management on Hadoop, have deep expertise in Hadoop, and are
> quite familiar with the Hadoop ecosystem.
>
> === Alignment ===
> The ASF is a natural host for Ivory given that it is already the home
> of Hadoop, Pig, Knox, HCatalog, and other emerging "big data" software
> projects. Ivory has been designed to solve the data management
> challenges and opportunities of the Hadoop ecosystem family of
> products. Ivory fills a gap in the Hadoop ecosystem in the areas of
> data processing and data lifecycle management.
>
> == Known Risks ==
>
> === Orphaned products & Reliance on Salaried Developers ===
> The core developers plan to work full time on the project. There is
> very little risk of Ivory getting orphaned. Ivory is in use by the
> companies we work for, so those companies have an interest in its
> continued vitality.
>
> === Inexperience with Open Source ===
> All of the core developers are active users and followers of open
> source.
> Srikanth Sundarrajan has been contributing patches to Apache Hadoop
> and Apache Oozie, and Shwetha GS has been contributing patches to
> Apache Oozie. Venkatesh Seetharam is a committer on Apache Knox.
> Rohini Palaniswamy is a committer on Apache Pig. Sharad Agarwal,
> Amareshwari SR (also an Apache Hive PMC member), and Sanjay Radia are
> PMC members on Apache Hadoop.
>
> === Homogeneous Developers ===
> The current core developers are from a diverse set of organizations:
> InMobi, Hortonworks, and Yahoo!. We expect to quickly establish a
> developer community that includes contributors from several
> corporations post incubation.
>
> === Reliance on Salaried Developers ===
> Currently, most developers are paid to work on Ivory, but a few are
> contributing in their spare time. However, once the project has a
> community built around it post incubation, we expect to get committers
> and developers from outside the current core developers.
>
> === Relationships with Other Apache Products ===
> Ivory is going to be used by the users of Hadoop and the Hadoop
> ecosystem in general.
>
> === An Excessive Fascination with the Apache Brand ===
> While we respect the reputation of the Apache brand and have no doubt
> that it will attract contributors and users, our interest is primarily
> to give Ivory a solid home as an open source project following an
> established development model. We have also given reasons in the
> Rationale and Alignment sections.
>
> == Documentation ==
> Documentation is available in the GitHub repository at:
> https://github.com/sriksun/Ivory
>
> == Initial Source ==
> The source is currently in the GitHub repository at:
> https://github.com/sriksun/Ivory
>
> == Source and Intellectual Property Submission Plan ==
> The complete Ivory code is under the Apache Software License 2.
>
> == External Dependencies ==
> The dependencies all have Apache-compatible licenses. These include
> BSD- and MIT-licensed dependencies.
>
> == Cryptography ==
> None
>
> == Required Resources ==
>
> === Mailing lists ===
>
> * ivory-dev AT incubator DOT apache DOT org
> * ivory-commits AT incubator DOT apache DOT org
> * ivory-user AT incubator DOT apache DOT org
> * ivory-private AT incubator DOT apache DOT org
>
> === Subversion Directory ===
> https://svn.apache.org/repos/asf/incubator/ivory
>
> === Issue Tracking ===
> JIRA IVORY
>
> == Initial Committers ==
> * Srikanth Sundarrajan (Srikanth.Sundarrajan AT inmobi DOT com)
> * Shwetha GS (shwetha.gs AT inmobi DOT com)
> * Shaik Idris (shaik.idris AT inmobi DOT com)
> * Venkatesh Seetharam (Venkatesh AT apache DOT com)
> * Rohini Palaniswamy (rohinip AT yahoo-inc DOT com)
> * Thiruvel Thirumoolan (thiruvel AT yahoo-inc DOT com)
> * Sanjay Radia (sanjay AT apache DOT org)
> * Sharad Agarwal (sharad AT apache DOT org)
> * Amareshwari SR (amareshwari AT apache DOT org)
>
> == Affiliations ==
> * Srikanth Sundarrajan (InMobi)
> * Shwetha GS (InMobi)
> * Shaik Idris (InMobi)
> * Venkatesh Seetharam (Hortonworks Inc)
> * Rohini Palaniswamy (Yahoo! Inc)
> * Thiruvel Thirumoolan (Yahoo! Inc)
> * Sanjay Radia (Hortonworks Inc)
> * Sharad Agarwal (InMobi)
> * Amareshwari SR (InMobi)
>
> == Sponsors ==
>
> === Champion ===
> * Arun C Murthy (acmurthy AT apache DOT org)
>
> === Nominated Mentors ===
> * Alan Gates (gates AT apache DOT org)
> * Chris Douglas (cdouglas AT apache DOT org)
> * Devaraj Das (ddas AT apache DOT org)
> * Owen O'Malley (omalley AT apache DOT org)
>
> === Sponsoring Entity ===
> Incubator PMC