Date: Fri, 30 Sep 2011 19:19:46 -0700
Subject: Hadoop Security - TaskTracker and Active Directory
From: bigbibguy father
To: mapreduce-user@hadoop.apache.org

We are planning to enable secure Hadoop using Kerberos.

Our users reside in Active Directory. We have read that there are two options for using Kerberos to secure Hadoop:

1) Run a Kerberos KDC on a machine local to the cluster and create the service principals there.
2) Use Active Directory itself as the Kerberos KDC and create the service principals in Active Directory as well.

It seems that Cloudera, and the industry in general, recommend option 1: running a local KDC to authenticate service principals.
https://ccp.cloudera.com/display/CDHDOC/Integrating+Hadoop+Security+with+Active+Directory

I read that the TaskTrackers run tasks as the user who submitted the job. In that case, don't the TaskTracker nodes need to talk to Active Directory to get user details such as the gid?

So does this mean that every node (TaskTrackers, JobTracker and NameNode) will be interacting with Active Directory anyway?
If so, option 1 doesn't seem to be superior, since each node has to talk to two KDCs: the local Kerberos KDC for authenticating service principals, and Active Directory for user details and group information.
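
To make the gid question concrete, here is a minimal Python sketch of the kind of lookup I have in mind. It rests on my own assumption (not something stated on the Cloudera page) that a node resolves the submitting user through the operating system's name service, which would in turn have to be backed by Active Directory for AD users. The names `lookup_user` and `hypothetical_ad_user` are made up for illustration only.

```python
import grp
import os
import pwd


def lookup_user(username):
    """Resolve a user through the OS name service on this node.

    Returns None if the node cannot resolve the user at all, which is
    what I would expect for an AD-only user on a node that has no
    directory integration (assumption).
    """
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        return None

    groups = []
    for gid in os.getgrouplist(username, entry.pw_gid):
        try:
            groups.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # The gid exists for the user but has no group name on this node.
            groups.append(str(gid))
    return {"uid": entry.pw_uid, "gid": entry.pw_gid, "groups": groups}


if __name__ == "__main__":
    # "hypothetical_ad_user" is a placeholder, not a real account.
    print(lookup_user("hypothetical_ad_user"))
```

If a lookup like this has to succeed on every TaskTracker before a task can run as the submitting user, then each node depends on the directory regardless of where the service principals live.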

Please correct me if I am wrong in my assumptions.

Thanks and Regards,

BBG