Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
From: Larry McCay <lmccay@hortonworks.com>
Date: Tue, 6 Aug 2013 17:48:28 -0400
To: common-dev@hadoop.apache.org

That sounds perfect!

I have been thinking of late that we would maybe need an incubator project or something for this - which would be unfortunate.
This would allow us to move much more quickly with a set of patches broken up into consumable/understandable chunks that are made functional more easily within the branch.

I assume that we need to start a separate thread for DISCUSS or VOTE to start that process - correct?

On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur wrote:

> yep, that is what I meant. Thanks Chris
>
>
> On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth wrote:
>
>> Perhaps this is also a good opportunity to try out the new "branch committers" clause in the bylaws, enabling non-committers who are working on this to commit to the feature branch.
>>
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
>>
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>>
>>
>> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur wrote:
>>
>>> Larry,
>>>
>>> Sorry for the delay answering. Thanks for laying things down; yes, it makes sense.
>>>
>>> Given the large scope of the changes, the number of JIRAs and the number of developers involved, wouldn't it make sense to create a feature branch for all this work, so as not to destabilize (more ;) trunk?
>>>
>>> Thanks again.
>>>
>>>
>>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay wrote:
>>>
>>>> The following JIRA was filed to provide a token and basic authority implementation for this effort:
>>>> https://issues.apache.org/jira/browse/HADOOP-9781
>>>>
>>>> I have attached an initial patch, though I have yet to submit it as one since it is dependent on the patch for CMF that was posted to:
>>>> https://issues.apache.org/jira/browse/HADOOP-9534
>>>> and this patch still has a couple of outstanding issues - javac warnings for com.sun classes for certificate generation and 11 javadoc warnings.
>>>>
>>>> Please feel free to review the patches and raise any questions or concerns related to them.
>>>>
>>>> On Jul 26, 2013, at 8:59 PM, Larry McCay wrote:
>>>>
>>>>> Hello All -
>>>>>
>>>>> In an effort to scope an initial iteration that provides value to the community while focusing on the pluggable authentication aspects, I've written a description for "Iteration 1". It identifies the goal of the iteration, the endstate and a set of initial usecases. It also enumerates the components that are required for each usecase. There is a scope section that details specific things that should be kept out of the first iteration. This is certainly up for discussion. There may be some of these things that can be contributed in short order. If we can add some things in without unnecessary complexity for the identified usecases then we should.
>>>>>
>>>>> @Alejandro - please review this and see whether it satisfies your point for a definition of what we are building.
>>>>>
>>>>> In addition to the document that I will paste here as text and attach a pdf version of, we have a couple of patches for components that are identified in the document.
>>>>> Specifically, COMP-7 and COMP-8.
>>>>>
>>>>> I will be posting the COMP-8 patch to the HADOOP-9534 JIRA which was filed specifically for that functionality.
>>>>> COMP-7 is a small set of classes to introduce JsonWebToken as the token format and a basic JsonWebTokenAuthority that can issue and verify these tokens.
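>>>>>
>>>>> To make COMP-7 a bit more concrete, here is a rough sketch of what the authority surface might look like. This is illustrative only - the class and method names are placeholders rather than the actual patch contents:
>>>>>
>>>>>     // Sketch only - names are placeholders, not the HADOOP-9781 patch contents.
>>>>>     public interface JsonWebTokenAuthority {
>>>>>       // Issue a signed token asserting the authenticated principal,
>>>>>       // scoped to an audience (the intended service) with an expiry time.
>>>>>       JsonWebToken issueToken(String principal, String audience, long expiry);
>>>>>
>>>>>       // Check the signature, expiry and audience of a presented token.
>>>>>       boolean verifyToken(JsonWebToken token, String expectedAudience);
>>>>>     }
>>>>>
>>>>>     // Minimal carrier for the signed claims - the signing/verification
>>>>>     // crypto underneath is what COMP-8 (HADOOP-9534) is meant to provide.
>>>>>     class JsonWebToken {
>>>>>       String claims;    // principal, audience, expiry, etc.
>>>>>       byte[] signature; // computed over the serialized claims
>>>>>     }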
>>>>>
>>>>> Since there is no JIRA for this yet, I will likely file a new JIRA for a SSO token implementation.
>>>>>
>>>>> Both of these patches are assumed to be modules within hadoop-common/hadoop-common-project.
>>>>> While they are relatively small, I think that they will be pulled in by other modules such as hadoop-auth which would likely not want a dependency on something larger like hadoop-common/hadoop-common-project/hadoop-common.
>>>>>
>>>>> This is certainly something that we should discuss within the community for this effort though - that being, exactly how to add these libraries so that they are most easily consumed by existing projects.
>>>>>
>>>>> Anyway, the following is the Iteration-1 document - it is also attached as a pdf:
>>>>>
>>>>> Iteration 1: Pluggable User Authentication and Federation
>>>>>
>>>>> Introduction
>>>>> The intent of this effort is to bootstrap the development of pluggable token-based authentication mechanisms to support certain goals of enterprise authentication integrations. By restricting the scope of this effort, we hope to provide immediate benefit to the community while keeping the initial contribution to a manageable size that can be easily reviewed, understood and extended with further development through follow-up JIRAs and related iterations.
>>>>>
>>>>> Iteration Endstate
>>>>> Once complete, this effort will have extended the authentication mechanisms - for all client types - from the existing Simple, Kerberos and Plain (for RPC) to include LDAP authentication and SAML-based federation. In addition, the ability to provide additional/custom authentication mechanisms will be enabled for users to plug in their preferred mechanisms.
>>>>>
>>>>> Project Scope
>>>>> The scope of this effort is a subset of the features covered by the overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on enabling Hadoop to issue and accept/validate SSO tokens of its own. The pluggable authentication mechanism within the SASL/RPC layer and the authentication filter pluggability for REST and UI components will be leveraged and extended to support the results of this effort.
>>>>>
>>>>> Out of Scope
>>>>> In order to scope the initial deliverable as the minimally viable product, a handful of things have been simplified or left out of scope for this effort. This is not meant to say that these aspects are not useful or not needed, but that they are not necessary for this iteration. We do however need to ensure that we don't do anything to preclude adding them in future iterations.
>>>>> 1. Additional Attributes - the result of authentication will continue to use the existing hadoop tokens and identity representations. Additional attributes used for finer grained authorization decisions will be added through follow-up efforts.
>>>>> 2. Token revocation - the ability to revoke issued identity tokens will be added later.
>>>>> 3. Multi-factor authentication - this will likely require additional attributes and is not necessary for this iteration.
>>>>> 4. Authorization changes - we will require additional attributes for the fine-grained access control plans. This is not needed for this iteration.
>>>>> 5. Domains - we assume a single flat domain for all users.
>>>>> 6. Kinit alternative - we can leverage existing REST clients such as cURL to retrieve tokens through authentication and federation for the time being.
>>>>> 7. A specific authentication framework isn't really necessary within the REST endpoints for this iteration. If one is available then we can use it; otherwise we can leverage existing things like Apache Shiro within a servlet filter.
>>>>>
>>>>> In Scope
>>>>> What is in scope for this effort is defined by the usecases described below. Components required for supporting the usecases are summarized for each client type. Each component is a candidate for a JIRA subtask - though multiple components are likely to be included in a JIRA to represent a set of functionality rather than individual JIRAs per component.
>>>>>
>>>>> Terminology and Naming
>>>>> The terms and names of components within this document are merely descriptive of the functionality that they represent. Any similarity or difference in names or terms from those that are found in other documents is not intended to make any statement about those other documents or the descriptions within. This document represents the pluggable authentication mechanisms and server functionality required to replace Kerberos.
>>>>>
>>>>> Ultimately, the naming of the implementation classes will be a product of the patches accepted by the community.
>>>>>
>>>>> Usecases:
>>>>> client types: REST, CLI, UI
>>>>> authentication types: Simple, Kerberos, authentication/LDAP, federation/SAML
>>>>>
>>>>> Simple and Kerberos
>>>>> Simple and Kerberos usecases continue to work as they do today. Authentication/LDAP and Federation/SAML are added through the existing pluggability points, either as they are or with required extension. Either way, continued support for Simple and Kerberos must not require changes to existing deployments in the field as a result of this effort.
>>>>>
>>>>> REST
>>>>> USECASE REST-1 Authentication/LDAP:
>>>>> For REST clients, we will provide the ability to (a client-side sketch follows this usecase):
>>>>> 1. use cURL to authenticate via LDAP through an IdP endpoint exposed by an AuthenticationServer instance via REST calls to:
>>>>>    a. authenticate - passing username/password, returning a hadoop id_token
>>>>>    b. get-access-token - from the TokenGrantingService by passing the hadoop id_token as an Authorization: Bearer token along with the desired service name (master service name), returning a hadoop access token
>>>>> 2. successfully invoke a hadoop service REST API passing the hadoop access token through an HTTP header as an Authorization: Bearer token
>>>>>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
>>>>> 3. successfully block access to a REST resource when presenting a hadoop access token intended for a different service
>>>>>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
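>>>>>
>>>>> Purely as an illustration of the REST-1 flow from the client side, something like the following Java sketch could drive it. Hostnames, ports, endpoint paths and form fields are all invented for the example - the real ones will come out of the patches:
>>>>>
>>>>>     // Illustrative sketch of the REST-1 client flow - all endpoints invented.
>>>>>     import java.io.BufferedReader;
>>>>>     import java.io.IOException;
>>>>>     import java.io.InputStreamReader;
>>>>>     import java.net.HttpURLConnection;
>>>>>     import java.net.URL;
>>>>>
>>>>>     public class Rest1ClientSketch {
>>>>>
>>>>>       static String exchange(String method, String endpoint, String form,
>>>>>           String bearerToken) throws IOException {
>>>>>         HttpURLConnection conn =
>>>>>             (HttpURLConnection) new URL(endpoint).openConnection();
>>>>>         conn.setRequestMethod(method);
>>>>>         if (bearerToken != null) {
>>>>>           conn.setRequestProperty("Authorization", "Bearer " + bearerToken);
>>>>>         }
>>>>>         if (form != null) {
>>>>>           conn.setDoOutput(true);
>>>>>           conn.getOutputStream().write(form.getBytes("UTF-8"));
>>>>>         }
>>>>>         StringBuilder body = new StringBuilder();
>>>>>         try (BufferedReader reader = new BufferedReader(
>>>>>             new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
>>>>>           String line;
>>>>>           while ((line = reader.readLine()) != null) {
>>>>>             body.append(line);
>>>>>           }
>>>>>         }
>>>>>         return body.toString();
>>>>>       }
>>>>>
>>>>>       public static void main(String[] args) throws IOException {
>>>>>         // 1a. authenticate - username/password for a hadoop id_token
>>>>>         String idToken = exchange("POST", "https://idp-host:8443/authenticate",
>>>>>             "username=guest&password=guest-pwd", null);
>>>>>         // 1b. get-access-token - id_token as Authorization: Bearer plus the
>>>>>         //     desired service name, returning a hadoop access token
>>>>>         String accessToken = exchange("POST",
>>>>>             "https://tgs-host:8443/get-access-token", "service=webhdfs", idToken);
>>>>>         // 2. invoke a hadoop service REST API with the access token as an
>>>>>         //    Authorization: Bearer header; the service side validates it
>>>>>         //    with an SSOAuthenticationHandler (steps 2a/3a above)
>>>>>         String response = exchange("GET",
>>>>>             "https://service-host:50070/some/rest/resource", null, accessToken);
>>>>>         System.out.println(response);
>>>>>       }
>>>>>     }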
>>>>>
>>>>> USECASE REST-2 Federation/SAML:
>>>>> We will also provide federation capabilities for REST clients such that they can:
>>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and persist it in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
>>>>> 2. use cURL to federate a token from a trusted IdP through an SP endpoint exposed by an AuthenticationServer (FederationServer?) instance via REST calls to:
>>>>>    a. federate - passing a SAML assertion as an Authorization: Bearer token, returning a hadoop id_token
>>>>>       - can copy and paste from the commandline or use cat to include the persisted token through "--Header Authorization: Bearer 'cat ~/.hadoop_tokens/.id_token'"
>>>>>    b. get-access-token - from the TokenGrantingService by passing the hadoop id_token as an Authorization: Bearer token along with the desired service name (master service name), returning a hadoop access token
>>>>> 3. successfully invoke a hadoop service REST API passing the hadoop access token through an HTTP header as an Authorization: Bearer token
>>>>>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
>>>>> 4. successfully block access to a REST resource when presenting a hadoop access token intended for a different service
>>>>>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
>>>>>
>>>>> REQUIRED COMPONENTS for REST USECASES:
>>>>> COMP-1. REST client - cURL or similar
>>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint example - returning a hadoop id_token
>>>>> COMP-3. REST endpoint for federation with a SAML Bearer token - shibboleth SP?|OpenSAML? - returning a hadoop id_token
>>>>> COMP-4. REST TokenGrantingService endpoint for acquiring hadoop access tokens from hadoop id_tokens
>>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access tokens
>>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
>>>>> COMP-7. hadoop token and authority implementations
>>>>> COMP-8. core services for crypto support for signing, verifying and PKI management
>>>>>
>>>>> CLI
>>>>> USECASE CLI-1 Authentication/LDAP:
>>>>> For CLI/RPC clients, we will provide the ability to (a sketch of persisting the id_token follows this usecase):
>>>>> 1. use cURL to authenticate via LDAP through an IdP endpoint exposed by an AuthenticationServer instance via REST calls to:
>>>>>    a. authenticate - passing username/password, returning a hadoop id_token
>>>>>       - for RPC clients we need to persist the returned hadoop identity token in a file protected by fs permissions so that it may be leveraged until expiry
>>>>>       - directing the returned response to a file may suffice for now; something like ">~/.hadoop_tokens/.id_token"
>>>>> 2. use the hadoop CLI to invoke an RPC API on a specific hadoop service
>>>>>    a. the RPC client negotiates a TokenAuth method through the SASL layer; the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer token to the get-access-token REST endpoint exposed by the TokenGrantingService, returning a hadoop access token
>>>>>    b. the RPC server side validates the presented hadoop access token and continues to serve the request
>>>>>    c. successfully invoke a hadoop service RPC API
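>>>>>
>>>>> As an illustration of the token persistence note in 1a, a minimal sketch - the helper name is hypothetical; only the ~/.hadoop_tokens/.id_token path comes from the usecases above:
>>>>>
>>>>>     // Sketch: persist the returned hadoop id_token so only the owner can read it.
>>>>>     import java.io.IOException;
>>>>>     import java.nio.charset.StandardCharsets;
>>>>>     import java.nio.file.Files;
>>>>>     import java.nio.file.Path;
>>>>>     import java.nio.file.Paths;
>>>>>     import java.nio.file.attribute.PosixFilePermissions;
>>>>>
>>>>>     public class TokenFileSketch {
>>>>>       public static Path persistIdToken(String idToken) throws IOException {
>>>>>         Path dir = Paths.get(System.getProperty("user.home"), ".hadoop_tokens");
>>>>>         Files.createDirectories(dir);
>>>>>         Path tokenFile = dir.resolve(".id_token");
>>>>>         Files.write(tokenFile, idToken.getBytes(StandardCharsets.UTF_8));
>>>>>         // rw------- : readable by the owner only, leveraged until expiry
>>>>>         Files.setPosixFilePermissions(tokenFile,
>>>>>             PosixFilePermissions.fromString("rw-------"));
>>>>>         return tokenFile;
>>>>>       }
>>>>>     }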
>>>>>
>>>>> USECASE CLI-2 Federation/SAML:
>>>>> For CLI/RPC clients, we will provide the ability to:
>>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and persist it in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
>>>>> 2. use cURL to federate a token from a trusted IdP through an SP endpoint exposed by an AuthenticationServer (FederationServer?) instance via REST calls to:
>>>>>    a. federate - passing a SAML assertion as an Authorization: Bearer token, returning a hadoop id_token
>>>>>       - can copy and paste from the commandline or use cat to include the previously persisted token through "--Header Authorization: Bearer 'cat ~/.hadoop_tokens/.id_token'"
>>>>> 3. use the hadoop CLI to invoke an RPC API on a specific hadoop service
>>>>>    a. the RPC client negotiates a TokenAuth method through the SASL layer; the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer token to the get-access-token REST endpoint exposed by the TokenGrantingService, returning a hadoop access token
>>>>>    b. the RPC server side validates the presented hadoop access token and continues to serve the request
>>>>>    c. successfully invoke a hadoop service RPC API
>>>>>
>>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for REST):
>>>>> COMP-9. TokenAuth method negotiation, etc.
>>>>> COMP-10. client-side implementation to leverage the REST endpoint for acquiring hadoop access tokens given a hadoop id_token
>>>>> COMP-11. server-side implementation to validate incoming hadoop access tokens
>>>>>
>>>>> UI
>>>>> Various Hadoop services have their own web UI consoles for administration and end user interactions. These consoles need to also benefit from the pluggability of authentication mechanisms to be on par with the access control of the cluster REST and RPC APIs.
>>>>> Web consoles are protected with a WebSSOAuthenticationHandler which will be configured for either authentication or federation. A rough sketch of the handler's intercept/validate logic follows USECASE UI-1 below.
>>>>>
>>>>> USECASE UI-1 Authentication/LDAP:
>>>>> For the authentication usecase:
>>>>> 1. the user's browser requests access to a UI console page
>>>>> 2. the WebSSOAuthenticationHandler intercepts the request and redirects the browser to an IdP web endpoint exposed by the AuthenticationServer, passing the requested url as the redirect_url
>>>>> 3. the IdP web endpoint presents the user with a FORM over https
>>>>>    a. the user provides username/password and submits the FORM
>>>>> 4. the AuthenticationServer authenticates the user with the provided credentials against the configured LDAP server and:
>>>>>    a. leverages a servlet filter or other authentication mechanism for the endpoint and authenticates the user with a simple LDAP bind with username and password
>>>>>    b. acquires a hadoop id_token and uses it to acquire the required hadoop access token, which is added as a cookie
>>>>>    c. redirects the browser to the original service UI resource via the provided redirect_url
>>>>> 5. the WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token and, upon finding one:
>>>>>    a. validates the incoming token
>>>>>    b. returns the AuthenticationToken as per the AuthenticationHandler contract
>>>>>    c. the AuthenticationFilter adds the hadoop auth cookie with the expected token
>>>>>    d. serves the requested resource for valid tokens
>>>>>    e. subsequent requests are handled by the AuthenticationFilter's recognition of the hadoop auth cookie
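>>>>>
>>>>> To illustrate steps 2 and 5, a rough sketch of the intercept/redirect logic as a plain servlet filter - the real WebSSOAuthenticationHandler would implement the hadoop-auth AuthenticationHandler contract instead, and the cookie/parameter names here are invented:
>>>>>
>>>>>     // Sketch only - names are invented, not the actual component.
>>>>>     import java.io.IOException;
>>>>>     import java.net.URLEncoder;
>>>>>     import javax.servlet.*;
>>>>>     import javax.servlet.http.Cookie;
>>>>>     import javax.servlet.http.HttpServletRequest;
>>>>>     import javax.servlet.http.HttpServletResponse;
>>>>>
>>>>>     public class WebSsoFilterSketch implements Filter {
>>>>>       private String loginUrl; // IdP (UI-1) or SP (UI-2) endpoint on the AuthenticationServer
>>>>>
>>>>>       public void init(FilterConfig config) {
>>>>>         loginUrl = config.getInitParameter("sso.authentication.provider.url");
>>>>>       }
>>>>>
>>>>>       public void doFilter(ServletRequest req, ServletResponse res,
>>>>>           FilterChain chain) throws IOException, ServletException {
>>>>>         HttpServletRequest request = (HttpServletRequest) req;
>>>>>         HttpServletResponse response = (HttpServletResponse) res;
>>>>>         String token = findAccessTokenCookie(request);
>>>>>         if (token != null && isValid(token)) {
>>>>>           chain.doFilter(req, res); // step 5: valid token - serve the resource
>>>>>         } else {
>>>>>           // step 2: redirect to the AuthenticationServer endpoint, passing
>>>>>           // the original request url back as the redirect_url
>>>>>           String original = request.getRequestURL().toString();
>>>>>           response.sendRedirect(loginUrl + "?redirect_url="
>>>>>               + URLEncoder.encode(original, "UTF-8"));
>>>>>         }
>>>>>       }
>>>>>
>>>>>       public void destroy() {
>>>>>       }
>>>>>
>>>>>       private String findAccessTokenCookie(HttpServletRequest request) {
>>>>>         Cookie[] cookies = request.getCookies();
>>>>>         if (cookies != null) {
>>>>>           for (Cookie c : cookies) {
>>>>>             if ("hadoop-access-token".equals(c.getName())) { // invented name
>>>>>               return c.getValue();
>>>>>             }
>>>>>           }
>>>>>         }
>>>>>         return null;
>>>>>       }
>>>>>
>>>>>       private boolean isValid(String token) {
>>>>>         return false; // placeholder - signature/expiry/audience checks (COMP-5/11) go here
>>>>>       }
>>>>>     }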
>>>>>
>>>>> USECASE UI-2 Federation/SAML:
>>>>> For the federation usecase:
>>>>> 1. the user's browser requests access to a UI console page
>>>>> 2. the WebSSOAuthenticationHandler intercepts the request and redirects the browser to an SP web endpoint exposed by the AuthenticationServer, passing the requested url as the redirect_url. This endpoint:
>>>>>    a. is dedicated to redirecting to the external IdP, passing the required parameters, which may include a redirect_url back to itself as well as an encoding of the original redirect_url so that it can determine it on the way back to the client
>>>>> 3. the IdP:
>>>>>    a. challenges the user for credentials and authenticates the user
>>>>>    b. creates an appropriate token/cookie and redirects back to the AuthenticationServer endpoint
>>>>> 4. the AuthenticationServer endpoint:
>>>>>    a. extracts the expected token/cookie from the incoming request and validates it
>>>>>    b. creates a hadoop id_token
>>>>>    c. acquires a hadoop access token for the id_token
>>>>>    d. creates an appropriate cookie and redirects back to the original redirect_url - being the requested resource
>>>>> 5. the WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token and, upon finding one:
>>>>>    a. validates the incoming token
>>>>>    b. returns the AuthenticationToken as per the AuthenticationHandler contract
>>>>>    c. the AuthenticationFilter adds the hadoop auth cookie with the expected token
>>>>>    d. serves the requested resource for valid tokens
>>>>>    e. subsequent requests are handled by the AuthenticationFilter's recognition of the hadoop auth cookie
>>>>>
>>>>> REQUIRED COMPONENTS for UI USECASES:
>>>>> COMP-12. WebSSOAuthenticationHandler
>>>>> COMP-13. IdP web endpoint within the AuthenticationServer for FORM-based login
>>>>> COMP-14. SP web endpoint within the AuthenticationServer for 3rd-party token federation
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <Brian.Swan@microsoft.com> wrote:
>>>>> Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: "If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds."
>>>>>
>>>>> -Brian
>>>>>
>>>>> -----Original Message-----
>>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>>>>> Sent: Wednesday, July 10, 2013 10:40 AM
>>>>> To: common-dev@hadoop.apache.org
>>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>>>>>
>>>>> It seems to me that we can have the best of both worlds here... it's all about the scoping.
>>>>>
>>>>> If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:
>>>>>
>>>>> 1. a very manageable scope to define and agree upon
>>>>> 2. a deliverable that should be useful in and of itself
>>>>> 3. a foundation for community collaboration that we can build on for higher level solutions on top of this lowest common denominator, and experience as a working community
>>>>>
>>>>> So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the "what" we are building instead of the "how" to build it.
>>>>> Including:
>>>>> a. project structure within hadoop-common-project/common-security or the like
>>>>> b. the usecases that would need to be enabled to make it a self-contained and useful contribution - without higher level solutions
>>>>> c. the JIRA/s for contributing patches
>>>>> d. what specific patches will be needed to accomplish the usecases in #b
>>>>>
>>>>> In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.
>>>>>
>>>>> I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.
>>>>>
>>>>> @Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise.
>>>>>
>>>>> thoughts?
>>>>>
>>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan wrote:
>>>>>
>>>>>> Hi Alejandro, all-
>>>>>>
>>>>>> There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> -Brian
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
>>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
>>>>>> To: Larry McCay
>>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
>>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>>>>>>
>>>>>> Larry, all,
>>>>>>
>>>>>> It is still not clear to me what end state we are aiming for, or that we even agree on that.
>>>>>>
>>>>>> IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what should be changed to get there, then see how we change things to get there.
>>>>>>
>>>>>> The different documents out there focus more on the how.
>>>>>>
>>>>>> We should not try to say how before we know what.
>>>>>>
>>>>>> Thx.
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lmccay@hortonworks.com> wrote:
>>>>>>
>>>>>>> All -
>>>>>>>
>>>>>>> After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:
>>>>>>>
>>>>>>> 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
>>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>>>>>>>
>>>>>>> I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort.
>>>>>>>
>>>>>>> @Daryn - is there a particular JIRA or set of JIRAs that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.
>>>>>>> @Kai - do you have existing code for the pluggable token authentication mechanism? If not, we can take a stab at representing it with interfaces and/or POC code.
>>>>>>> I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.
>>>>>>>
>>>>>>> These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision.
>>>>>>>
>>>>>>> In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect/s - I will leave that detail to Kai.
>>>>>>>
>>>>>>> @Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> --larry
>>>>>>>
>>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay wrote:
>>>>>>>
>>>>>>>> Hi Andy -
>>>>>>>>
>>>>>>>>> Happy Fourth of July to you and yours.
>>>>>>>>
>>>>>>>> Same to you and yours. :-)
>>>>>>>> We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.
>>>>>>>>
>>>>>>>>> My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate"
>>>>>>>>
>>>>>>>> Apparently so.
>>>>>>>> On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.
>>>>>>>>
>>>>>>>> You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
>>>>>>>> We seemed to do this at the summit session quite well.
>>>>>>>> It was my understanding that this community discussion would live beyond the summit and continue on this list.
>>>>>>>>
>>>>>>>> While closing the summit session we agreed to follow up on common-dev with first a summary and then a discussion of the moving parts.
>>>>>>>>
>>>>>>>> I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.
>>>>>>>>
>>>>>>>> If you would like to reframe what clean slate was supposed to mean, or describe what it means now, that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.
>>>>>>>>
>>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and such, which have been disappointing to see crop up, we should be collaboratively coding not planting flags.
>>>>>>>>
>>>>>>>> I don't know what you mean by self-appointed master JIRAs.
>>>>>>>> It has certainly not been anyone's intention to disappoint.
>>>>>>>> Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.
>>>>>>>>
>>>>>>>> Planting flags... I need to go back and read my discussion point about the JIRA and see how this is the impression that was made.
>>>>>>>> That is not how I define success. The only flags that count are code. What we are lacking is the roadmap on which to put the code.
>>>>>>>>
>>>>>>>>> I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration.
>>>>>>>>
>>>>>>>> I definitely read it as something that has evolved into something approaching what we have been talking about so far. There has not however been enough discussion anywhere near the level of detail in that document, and more details are needed for each component in the design.
>>>>>>>> Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand.
>>>>>>>>
>>>>>>>> My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed.
>>>>>>>> As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - I don't really care much which it is.
>>>>>>>>
>>>>>>>> I don't really want to continue with two separate JIRAs - as I stated long ago - but until we understand what the pieces are and how they relate, they can't be consolidated.
>>>>>>>> Even if 9533 ended up being repurposed as the server instance of the work - it should be a subtask of a larger one - and if that is to be 9392, so be it.
>>>>>>>> We still need to define all the pieces of the larger picture before that can be done.
>>>>>>>>
>>>>>>>> What I thought was the clean slate approach to the discussion seemed a very reasonable way to make all this happen.
>>>>>>>> If you would like to restate what you intended by it, or something else equally as reasonable as a way to move forward, that would be awesome.
>>>>>>>>
>>>>>>>> I will be happy to work toward the roadmap with everyone once it is articulated, understood and actionable.
>>>>>>>> In the meantime, I have work to do.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> --larry
>>>>>>>>
>>>>>>>> BTW - I meant to quote you in an earlier response and ended up saying it was Aaron instead. Not sure what happened there. :-)
>>>>>>>>
>>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell wrote:
>>>>>>>>
>>>>>>>>> Hi Larry (and all),
>>>>>>>>>
>>>>>>>>> Happy Fourth of July to you and yours.
>>>>>>>>>
>>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so I'd defer to them on the detailed points.
>>>>>>>>>
>>>>>>>>> My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate". Hopefully that can be quickly cleared up. Certainly we did not mean ignore all that came before. The idea was to reset discussions to find common ground and new direction where we are working together, not in conflict, on an agreed upon set of design points and tasks. There's been a lot of good discussion and design preceding that we should figure out how to port over. Nowhere in this picture are self appointed "master JIRAs" and such, which have been disappointing to see crop up; we should be collaboratively coding, not planting flags.
>>>>>>>>>
>>>>>>>>> I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>>>>>>>>>
>>>>>>>>>> Hey Andrew -
>>>>>>>>>>
>>>>>>>>>> I largely agree with that statement.
>>>>>>>>>> My intention was to let the differences be worked out within the individual components once they were identified and subtasks created.
>>>>>>>>>>
>>>>>>>>>> My reference to HSSO was really referring to a SSO *server* based design which was not clearly articulated in the earlier documents.
>>>>>>>>>> We aren't trying to compare and contrast one design over another anymore.
>>>>>>>>>>
>>>>>>>>>> Let's move this collaboration along as we've mapped out, and the differences in the details will reveal themselves and be addressed within their components.
>>>>>>>>>>
>>>>>>>>>> I've actually been looking forward to you weighing in on the actual discussion points in this thread.
>>>>>>>>>> Could you do that?
>>>>>>>>>>
>>>>>>>>>> At this point, I am most interested in your thoughts on a single jira to represent all of this work and whether we should start discussing the SSO Tokens.
>>>>>>>>>> If you think there are discussion points missing from that list, feel free to add to it.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> --larry
>>>>>>>>>>
>>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <apurtell@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Larry,
>>>>>>>>>>>
>>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me point out that, while the differences between the competing JIRAs have been reduced for sure, there were some key differences that didn't just disappear. Subsequent discussion will make that clear. I also disagree with your characterization that we have simply endorsed all of the design decisions of the so-called HSSO; this is taking a mile from an inch. We are here to engage in a collaborative process as peers. I've been encouraged by the spirit of the discussions up to this point and hope that can continue beyond one design summit.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Kai -
>>>>>>>>>>>>
>>>>>>>>>>>> I think that I need to clarify something...
>>>>>>>>>>>>
>>>>>>>>>>>> This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop.
>>>>>>>>>>>> We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO-layered-on-top-of-TAS approach or an HSSO vs TAS discussion.
>>>>>>>>>>>>
>>>>>>>>>>>> Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.
>>>>>>>>>>>>
>>>>>>>>>>>> What we need you to do at this point is to look at those high-level components described on this thread and comment on whether we need additional components, or whether any that are listed don't seem necessary to you, and why.
>>>>>>>>>>>> In other words, we need to define and agree on the work that has to be done.
>>>>>>>>>>>>
>>>>>>>>>>>> We also need to determine those components that need to be done before anything else can be started.
>>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.
>>>>>>>>>>>>
>>>>>>>>>>>> Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough difference between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.
>>>>>>>>>>>>
>>>>>>>>>>>> I am prepared to start a discussion around the shape of the two Hadoop SSO tokens: identity and access - if this is what others feel the next topic should be.
>>>>>>>>>>>> If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> --larry
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <kai.zheng@intel.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Larry,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the update. Good to see that with this update we are now aligned on most points.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions from related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
>>>>>>>>>>>>> 1. Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
>>>>>>>>>>>>> 2. Introduced the Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
>>>>>>>>>>>>> 3. Refined the proxy access token and the proxy/impersonation flow;
>>>>>>>>>>>>> 4. Refined the browser web SSO flow regarding access to Hadoop web services;
>>>>>>>>>>>>> 5. Added the Hadoop RPC access flow regard
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> - Andy
>>>>>>>>>
>>>>>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
>>>>>>
>>>>>> --
>>>>>> Alejandro
>>>
>>> --
>>> Alejandro
>
> --
> Alejandro