From: Alex Karasulu
Sender: akarasulu@gmail.com
To: general@incubator.apache.org
Reply-To: general@incubator.apache.org
Subject: Re: [PROPOSAL] Knox Hadoop Gateway Project
Date: Tue, 12 Feb 2013 11:04:27 +0200
References: <5119065F.1040109@hortonworks.com> <51192A4B.8050706@hortonworks.com>
Mailing-List: contact general-help@incubator.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d043c7d141600e304d583502a

--f46d043c7d141600e304d583502a
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

I thought about this a bit last night. If y'all are interested, I could also mentor the project. That should add some diversity to the mentors list. I see value in it and would like to see this community succeed. I'm not affiliated with any company.

On Mon, Feb 11, 2013 at 9:23 PM, Eric Sammer wrote:

> Kevin:
>
> Makes complete sense.
>
> I'd like to offer to join the project, if it's accepted for incubation. I'm a committer on MRUnit and Flume, and on the PMC for both. I've helped both projects through the incubation phase, and I also know a little bit about this Hadoop thing. ;)
>
> Thanks!
>
>
> On Mon, Feb 11, 2013 at 9:28 AM, Kevin Minder wrote:
>
> > Hi Eric,
> > Let me answer your second question first.
> >
> > Q: Is it your intention to provide job submissions and data ingestion APIs for MR and HDFS, respectively?
> > A: Yes, we plan to progress the project to cover all existing ecosystem projects. In addition, the project is based on a modular framework that allows for each extension to cover services that are either new or proprietary.
> > Certainly there exist very high volume data ingest use cases for which using a gateway may be impractical, but in general the idea is to support all required client interaction with Hadoop via the gateway.
> >
> > Now for your first question...
> >
> > Q: Can you explain a bit more about what the target use case is?
> > A: One typical use case will be that the gateway will run in a DMZ. It will, as you say, integrate with various directory services and is extensible to cover those not included. The gateway will then propagate the identity into the Hadoop cluster using Hadoop-specific mechanisms. The key point is that there will typically be a single port open on the client side to the gateway. The Hadoop cluster is firewalled, only providing access to the Hadoop services to the gateway instances.
> > A: Another use case is that an organization is already using some SSO solution and the gateway would be integrated with that to verify any SSO token and then propagate the identity to the Hadoop services.
> >
> > I will collect this and add it to the proposal wiki once I have privs to create the page.
> >
> > Thanks!
> > Kevin.
> >
> >
> > On 2/11/13 12:03 PM, Eric Sammer wrote:
> >
> >> Kevin:
> >>
> >> Interesting proposal. Can you explain a bit more about what the target use case is? It sounds like there's SSO-ish functionality (presumably a doAs() machine) with integration with directory services, but the proposal also mentions a single point for "data and jobs." Is it your intention to provide job submissions and data ingestion APIs for MR and HDFS, respectively? Do you plan to target other ecosystem projects such as HBase?
> >> Sorry if I missed this in the proposal.
> >>
> >> Thanks!
> >>
> >>
> >> On Mon, Feb 11, 2013 at 6:55 AM, Kevin Minder wrote:
> >>
> >>> Knox Gateway Proposal
> >>>
> >>> == Abstract ==
> >>>
> >>> Knox Gateway is a system that provides a single point of secure access for Apache Hadoop clusters.
> >>>
> >>> == Proposal ==
> >>>
> >>> The Knox Gateway (“Gateway” or “Knox”) is a system that provides a single point of authentication and access for Apache Hadoop services in a cluster. The goal is to simplify Hadoop security for both users (i.e. who access the cluster data and execute jobs) and operators (i.e. who control access and manage the cluster). The Gateway runs as a server (or cluster of servers) that serves one or more Hadoop clusters.
> >>>
> >>> Provide perimeter security to make Hadoop security setup easier
> >>> Support authentication and token verification security scenarios
> >>> Deliver users a single cluster end-point that aggregates capabilities for data and jobs
> >>> Enable integration with enterprise and cloud identity management environments
> >>>
> >>> == Background ==
> >>>
> >>> An Apache Hadoop cluster is presented to consumers as a loose collection of independent services. This makes it difficult for users to interact with Hadoop since each service maintains its own method of access and security. As well, for operators, configuration and administration of a secure Hadoop cluster is complex, and many Hadoop clusters are insecure as a result.
> >>>
> >>> == Rationale ==
> >>>
> >>> Organizations that struggle with Hadoop cluster security end up either a) running Hadoop without security or b) slowing their adoption of Hadoop. The Gateway aims to provide perimeter security that integrates more easily into existing organizations’ security infrastructure.
> >>> Doing so will simplify security for these organizations and benefit all Hadoop stakeholders (i.e. users and operators). Additionally, making a dedicated perimeter security project part of the Apache Hadoop ecosystem will prevent fragmentation in this area and further increase the value of Hadoop as a data platform.
> >>>
> >>> == Current Status ==
> >>>
> >>> Prototype available, developed by the list of initial committers.
> >>>
> >>> === Meritocracy ===
> >>>
> >>> We desire to build a diverse developer community around Gateway following the Apache Way. We want to make the project open source and will encourage contributors from multiple organizations following the Apache meritocracy model.
> >>>
> >>> === Community ===
> >>>
> >>> We hope to extend the user and developer base in the future and build a solid open source community around Gateway. Apache Hadoop has a large ecosystem of open source projects, each with a strong community of contributors. All project communities in this ecosystem have an opportunity to participate in the advancement of the Gateway project because ultimately, Gateway will enable the security capabilities of their project to be more enterprise friendly.
> >>>
> >>> === Core Developers ===
> >>>
> >>> Gateway is currently being developed by several engineers from Hortonworks - Kevin Minder, Larry McCay, John Speidel, Tom Beerbower and Sumit Mohanty. All the engineers have deep expertise in middleware, security & identity systems and are quite familiar with the Hadoop ecosystem.
> >>>
> >>> === Alignment ===
> >>>
> >>> The ASF is a natural host for Gateway given that it is already the home of Hadoop, Hive, Pig, HBase, Oozie and other emerging big data software projects.
> >>> Gateway is designed to solve the security challenges familiar to the Hadoop ecosystem family of projects.
> >>>
> >>> == Known Risks ==
> >>>
> >>> === Orphaned products & Reliance on Salaried Developers ===
> >>>
> >>> The core developers plan to work full time on the project. We believe that this project will be of general interest to many Hadoop users and will attract a diverse set of contributors. We intend to demonstrate this by having contributors from several organizations recognized as committers by the time Knox graduates from incubation.
> >>>
> >>> === Inexperience with Open Source ===
> >>>
> >>> All of the core developers are active users and followers of open source. As well, Hortonworks has a strong heritage of success with contributions to Apache Hadoop Projects.
> >>>
> >>> === Homogeneous Developers ===
> >>>
> >>> The current core developers are from Hortonworks; however, we hope to establish a developer community that includes contributors from several corporations.
> >>>
> >>> === Reliance on Salaried Developers ===
> >>>
> >>> Currently, the developers are paid to do work on Gateway. However, once the project has a community built around it, we expect to get committers and developers from outside the current core developers.
> >>>
> >>> === Relationships with Other Apache Products ===
> >>>
> >>> Gateway is going to be used by the users and operators of Hadoop, and the Hadoop ecosystem in general.
> >>>
> >>> === An Excessive Fascination with the Apache Brand ===
> >>>
> >>> Our interest in developing Gateway as an Apache project is to follow an established development model. Also, since many of the Hadoop ecosystem projects are part of Apache, Gateway will complement those projects by following the same development and contribution model.
> >>>
> >>> == Documentation ==
> >>>
> >>> There is documentation in Hortonworks’ internal repositories. These can be shared upon request and will be transferred into the Apache CM system if this proposal is accepted.
> >>>
> >>> == Initial Source ==
> >>>
> >>> The source is currently in Hortonworks’ internal repositories. The process of making this GitHub repository public has been started and the URL will be provided once available.
> >>>
> >>> == Source and Intellectual Property Submission Plan ==
> >>>
> >>> The complete Gateway code is under Apache Software License 2.
> >>>
> >>> == External Dependencies ==
> >>>
> >>> The Gateway dependencies are listed below, separated by Category A and Category B as defined in the Apache Third-Party Licensing Policy. Note: These are the direct dependencies. Indirect dependencies are not included.
> >>>
> >>> === Category A Dependencies ===
> >>>
> >>> Apache Commons - ASLv2.0
> >>>   commons-io:commons-io#2.4
> >>>   commons-cli:commons-cli#1.2
> >>>   commons-codec:commons-codec#1.7
> >>>   org.apache.commons:commons-digester3#3.2
> >>>   org.apache.commons:commons-vfs2#2.0
> >>> Apache Hadoop - ASLv2.0
> >>>   org.apache.hadoop:hadoop-auth#0.23.3
> >>>   org.apache.hadoop:hadoop-core#1.0.3
> >>> Apache Geronimo - ASLv2.0
> >>>   org.apache.geronimo.components:geronimo-jaspi#2.0.0
> >>>   org.apache.geronimo.specs:geronimo-osgi-locator#1.1
> >>> Apache Shiro - ASLv2.0
> >>>   org.apache.shiro:shiro-web#1.2.1
> >>> ApacheDS - ASLv2.0
> >>>   org.apache.directory.server:apacheds-all#1.5.5
> >>> Log4J - ASLv2.0
> >>>   log4j:log4j#1.2.17
> >>> SLF4J - MIT
> >>>   org.slf4j:slf4j-api#1.6.6
> >>>   org.slf4j:slf4j-log4j12#1.6.6
> >>> Guava - ASLv2.0
> >>>   com.google.guava:guava#14.0-rc1
> >>> HttpClient - ASLv2.0
> >>>   org.apache.httpcomponents:httpclient#4.2.1
> >>> Jetty - ASLv2.0
> >>>   org.eclipse.jetty:jetty-server#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-servlet#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-webapp#8.1.7.v20120910
> >>>   org.eclipse.jetty:jetty-jaspi#8.1.7.v20120910
> >>>   org.eclipse.jetty.aggregate:jetty-all#8.1.7.v20120910
> >>>   org.eclipse.jetty:test-jetty-servlet#8.1.7.v20120910
> >>> Spring Security - ASLv2.0
> >>>   org.springframework:spring-core#3.1.3.RELEASE
> >>>   org.springframework:spring-context#3.1.3.RELEASE
> >>>   org.springframework:spring-web#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-core#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-web#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-config#3.1.3.RELEASE
> >>>   org.springframework.security:spring-security-ldap#3.1.2.RELEASE
> >>>   org.springframework.ldap:spring-ldap-core#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-core-tiger#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-odm#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-ldif-core#1.3.1.RELEASE
> >>>   org.springframework.ldap:spring-ldap-ldif-batch#1.3.1.RELEASE
> >>> JBoss ShrinkWrap - ASLv2.0
> >>>   org.jboss.shrinkwrap:shrinkwrap-api#1.0.1
> >>>   org.jboss.shrinkwrap:shrinkwrap-impl-base#1.0.1
> >>>   org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-api-javaee#2.0.0-alpha-4
> >>>   org.jboss.shrinkwrap.descriptors:shrinkwrap-descriptors-impl-javaee#2.0.0-alpha-4
> >>>
> >>> === Category A Dependencies (Test) ===
> >>>
> >>> EasyMock - ASLv2.0
> >>>   org.easymock:easymock#3.0
> >>> XML Matchers - ASLv2.0
> >>>   org.xmlmatchers:xml-matchers#0.10
> >>> Hamcrest - BSDv3
> >>>   org.hamcrest:hamcrest-api#1.0
> >>>   org.hamcrest:hamcrest-core#1.2.1
> >>>   org.hamcrest:hamcrest-library#1.2.1
> >>> JsonPath - ASLv2.0
> >>>   com.jayway.jsonpath:json-path#0.8.1
> >>>   com.jayway.jsonpath:json-path-assert#0.8.1
> >>> XMLTool - ASLv2.0
> >>>   com.mycila.xmltool:xmltool#3.3
> >>> REST-assured - ASLv2.0
> >>>   com.jayway.restassured:rest-assured#1.6.2
> >>>
> >>> === Category B Dependencies ===
> >>>
> >>> Jersey - CDDLv1.1 or GPL2wCPE
> >>>   com.sun.jersey:jersey-server#1.14
> >>>   com.sun.jersey:jersey-servlet#1.14
> >>> Jericho - EPLv1.0
> >>>   net.htmlparser.jericho:jericho-html#3.2
> >>> Servlet - CDDLv1.0 or GPLv2
> >>>   javax.servlet:javax.servlet-api#3.0.1
> >>> JUnit - CPLv1.0
> >>>   junit:junit#4.11
> >>>
> >>> == Cryptography ==
> >>>
> >>> The Gateway uses cryptographic software indirectly as a result of having two dependencies: ApacheDS and Apache Shiro. Gateway does not include any special or custom cryptographic technologies.
> >>>
> >>> ApacheDS is an ASF project and has been classified Export Commodity Control Number (ECCN) 5D002.C.1 due to its dependency on Bouncy Castle. More information on the ApacheDS classification can be found at http://svn.apache.org/repos/asf/directory/apacheds/trunk/installers/README
> >>>
> >>> Apache Shiro is an ASF project and has been classified Export Commodity Control Number (ECCN) 5D002.C.1. More information on the Apache Shiro classification can be found at http://svn.apache.org/repos/asf/shiro/trunk/README
> >>>
> >>> == Required Resources ==
> >>>
> >>> === Mailing lists ===
> >>>
> >>> knox-dev AT incubator DOT apache DOT org
> >>> knox-commits AT incubator DOT apache DOT org
> >>> knox-user AT incubator DOT apache DOT org
> >>> knox-private AT incubator DOT apache DOT org
> >>>
> >>> === Subversion Directory ===
> >>>
> >>> https://svn.apache.org/repos/asf/incubator/knox
> >>>
> >>> === Issue Tracking ===
> >>>
> >>> JIRA Knox (KNOX)
> >>>
> >>> == Initial Committers ==
> >>>
> >>> Kevin Minder (kevin DOT minder AT hortonworks DOT com)
> >>> Larry McCay (lmccay AT hortonworks DOT com)
> >>> John Speidel (jspeidel AT hortonworks DOT com)
> >>> Tom Beerbower (tbeerbower AT hortonworks DOT com)
> >>> Sumit Mohanty (smohanty AT hortonworks DOT com)
> >>>
> >>> == Affiliations ==
> >>>
> >>> Kevin Minder (Hortonworks)
> >>> Larry McCay (Hortonworks)
> >>> John Speidel (Hortonworks)
> >>> Tom Beerbower (Hortonworks)
> >>> Sumit Mohanty (Hortonworks)
> >>>
> >>> == Sponsors ==
> >>>
> >>> === Champion ===
> >>>
> >>> Devaraj Das (ddas AT apache DOT org)
> >>>
> >>> === Nominated Mentors ===
> >>>
> >>> Owen O’Malley (omalley AT apache DOT org)
> >>> Mahadev Konar (mahadev AT apache DOT org)
> >>> Alan Gates (gates AT apache DOT org)
> >>> Devaraj Das (ddas AT apache DOT org)
> >>>
> >>> === Sponsoring Entity ===
> >>>
> >>> Incubator PMC
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >>> For additional commands, e-mail: general-help@incubator.apache.org
> >>>
> >>
> >
>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>

--
Best Regards,
-- Alex

--f46d043c7d141600e304d583502a--