Subject: Re: Fair scheduler.
From: Patai Sangbutsarakum <silvianhadoop@gmail.com>
To: user@hadoop.apache.org
Date: Wed, 17 Oct 2012 10:40:44 -0700

Harsh,

I am testing it again according to your last instructions.

>> 2. Define your required queues:
>>
>> mapred.job.queues set to "default,foo,bar" for example, for 3 queues:
>> default, foo and bar.

From http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u4/cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons
I couldn't find "mapred.job.queues" at that link, so I have been using
mapred.queue.names instead, which may be where I went wrong.
Please suggest.

On Wed, Oct 17, 2012 at 8:43 AM, Harsh J wrote:
> Hey Robin,
>
> Thanks for the detailed post.
>
> I just looked at your older thread, and you're right: the JT does write
> into its system dir for users' job info and token files when
> initializing the job. The bug you ran into and the exception and trace
> you got make sense now.
>
> I just didn't see it on the version Patai seems to be using. I think
> that if he specifies a proper staging directory, he'll get through,
> since his trace is different from that of MAPREDUCE-4398 (i.e.
> system dir vs. staging dir - you had system dir, unfortunately).
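
(For illustration, a minimal mapred-site.xml sketch of the "proper staging
directory" setup being referred to; the /user value matches what this thread
uses, but the snippet itself is an assumption, not a config anyone posted:

  <!-- Sketch only: HDFS parent dir under which each user's .staging
       directory is created; each /user/<name> must be owned by that user. -->
  <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>/user</value>
  </property>

Robin's message below spells the key mapred.jobtracker.staging.root.dir; the
mapreduce.* spelling above follows Harsh's usage later in the thread.)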
>
> On Wed, Oct 17, 2012 at 8:39 PM, Goldstone, Robin J. wrote:
>> Yes, you would think that users shouldn't need to write to
>> mapred.system.dir, yet that seems to be the case. I posted details about
>> my configuration, along with full stack traces, last week. I won't
>> re-post everything, but essentially I have mapred.system.dir defined as
>> a directory in HDFS owned by mapred:hadoop. I initially set the
>> permissions to 755, but when the job tracker started up it changed the
>> permissions to 700. Then, when I ran a job as a regular user, I got this
>> error:
>>
>> 12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job initialization failed:
>> org.apache.hadoop.security.AccessControlException:
>> org.apache.hadoop.security.AccessControlException: Permission denied:
>> user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
>>
>> I then manually changed the permissions back to 755, ran again, and got
>> this error:
>>
>> 12/10/09 16:31:30 INFO mapred.JobClient: Job Failed: Job initialization failed:
>> org.apache.hadoop.security.AccessControlException:
>> org.apache.hadoop.security.AccessControlException: Permission denied:
>> user=robing, access=WRITE, inode="mapred":mapred:hadoop:rwxr-xr-x
>>
>> I then changed the permissions to 777 and the job ran successfully. This
>> suggests that some process was trying to write to mapred.system.dir but
>> did not have sufficient permissions. The speculation is that this was
>> being attempted under my uid instead of mapred. Perhaps it is something
>> else; I welcome your suggestions.
>>
>> For completeness, I also have mapred.jobtracker.staging.root.dir set to
>> /user within HDFS. I can verify the staging files are going there, but
>> something else is still trying to access mapred.system.dir.
>>
>> Robin Goldstone, LLNL
>>
>> On 10/17/12 12:00 AM, "Harsh J" wrote:
>>
>>> Hi,
>>>
>>> Regular users never write into mapred.system.dir, AFAICT. That
>>> directory is just for the JT to use to mark its presence and to
>>> "expose" the distributed filesystem it will be relying on.
>>>
>>> Users write to their respective staging directories, which lie
>>> elsewhere and are per-user.
>>>
>>> Let me post my environment:
>>>
>>> - mapred.system.dir (an HDFS dir for a JT to register itself) is set
>>> to "/tmp/mapred/system". The /tmp/mapred and /tmp/mapred/system (or
>>> whatever you configure it to) are to be owned by mapred:hadoop so that
>>> the JT can feel free to reconfigure them.
>>>
>>> - mapreduce.jobtracker.staging.root.dir (an HDFS dir that is the
>>> parent directory for users to write their per-user job staging files
>>> (JARs, etc.)) is set to "/user". /user in turn contains each user's
>>> home directory, owned by that user. For example:
>>>
>>> drwxr-xr-x - harsh harsh 0 2012-09-27 15:51 /user/harsh
>>>
>>> All staging files from local user 'harsh' are hence written, as the
>>> proper user, under /user/harsh/.staging, since that user does have
>>> permission to write there. For any user to access HDFS, they'd need a
>>> home directory created on HDFS by the admin first - and after that,
>>> things users do under their own directory will work just fine. The JT
>>> would not have to try to create per-user directories.
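
(An admin might bootstrap the layout Harsh describes with something like the
following shell sketch; the mapred:hadoop ownership and the 'harsh' home dir
follow the thread, but the exact commands are illustrative:

  # Run as the HDFS superuser.
  # JT system dir, owned by the mapred user:
  hadoop fs -mkdir /tmp/mapred/system
  hadoop fs -chown -R mapred:hadoop /tmp/mapred

  # Per-user staging parent; each user owns their own home dir:
  hadoop fs -mkdir /user/harsh
  hadoop fs -chown harsh:harsh /user/harsh

The JT is then free to tighten permissions on its own system dir, while each
user stages jobs only under their own /user/<name>/.staging.)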
>>> On Wed, Oct 17, 2012 at 5:22 AM, Patai Sangbutsarakum wrote:
>>>> Thanks everyone. Seems like I've hit a dead end.
>>>> It's kind of funny when I read that JIRA: run it 4 times and
>>>> everything will work.. where does that magic number come from.. lol
>>>>
>>>> respects
>>>>
>>>> On Tue, Oct 16, 2012 at 4:12 PM, Arpit Gupta wrote:
>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-4398
>>>>>
>>>>> is the bug that Robin is referring to.
>>>>>
>>>>> --
>>>>> Arpit Gupta
>>>>> Hortonworks Inc.
>>>>> http://hortonworks.com/
>>>>>
>>>>> On Oct 16, 2012, at 3:51 PM, "Goldstone, Robin J." wrote:
>>>>>
>>>>> This is similar to issues I ran into with permissions/ownership of
>>>>> mapred.system.dir when using the fair scheduler. We are instructed
>>>>> to set the ownership of mapred.system.dir to mapred:hadoop, and then
>>>>> when the job tracker starts up (running as user mapred) it
>>>>> explicitly sets the permissions on this directory to 700. Meanwhile,
>>>>> when I go to run a job as a regular user, it tries to write stuff
>>>>> into mapred.system.dir but can't, due to the ownership/permissions
>>>>> that have been established.
>>>>>
>>>>> Per discussion with Arpit Gupta, this is a bug with the fair
>>>>> scheduler, and it appears from your experience that there are
>>>>> similar issues with hadoop.tmp.dir. The whole idea of the fair
>>>>> scheduler is to run jobs under the user's identity rather than as
>>>>> user mapred. This is good from a security perspective, yet it seems
>>>>> no one bothered to account for this in terms of the permissions that
>>>>> need to be set in the various directories to enable it.
>>>>>
>>>>> Until this is sorted out by the Hadoop developers, I've put my
>>>>> attempts to use the fair scheduler on hold...
>>>>>
>>>>> Regards,
>>>>> Robin Goldstone, LLNL
>>>>>
>>>>> On 10/16/12 3:32 PM, "Patai Sangbutsarakum" wrote:
>>>>>
>>>>> Hi Harsh,
>>>>> Thanks for breaking it down clearly. I would say I am 98% successful
>>>>> with your instructions. The 2% is about hadoop.tmp.dir.
>>>>>
>>>>> Let's say I have 2 users:
>>>>> userA is the user that starts hdfs and mapred
>>>>> userB is a regular user
>>>>>
>>>>> If I use the default value of hadoop.tmp.dir,
>>>>> /tmp/hadoop-${user.name}, I can submit jobs as userA but not as
>>>>> userB:
>>>>>
>>>>> user=userB, access=WRITE, inode="/tmp/hadoop-userA/mapred/staging"
>>>>> :userA:supergroup:drwxr-xr-x
>>>>>
>>>>> I googled around; someone recommended changing hadoop.tmp.dir to
>>>>> /tmp/hadoop. That way it almost works; the thing is,
>>>>>
>>>>> if I submit as userA, it creates /tmp/hadoop on the local machine
>>>>> with ownership userA:userA, and once I try to submit a job from the
>>>>> same machine as userB, I get "Error creating temp dir in
>>>>> hadoop.tmp.dir /tmp/hadoop due to Permission denied" (because
>>>>> /tmp/hadoop is owned by userA:userA). Vice versa: if I delete
>>>>> /tmp/hadoop and let the directory be created by userB, then userA
>>>>> will not be able to submit jobs.
>>>>>
>>>>> Which is the right approach I should work with?
>>>>> Please suggest
>>>>>
>>>>> Patai
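
(A note on why that inode path looks the way it does: in this Hadoop line the
staging root appears to default to a location under the JobTracker owner's
hadoop.tmp.dir - an assumption worth checking against your exact build:

  mapreduce.jobtracker.staging.root.dir
      default: ${hadoop.tmp.dir}/mapred/staging
      ->       /tmp/hadoop-userA/mapred/staging   when the daemons run as userA

If so, tweaking each client's local hadoop.tmp.dir cannot make that shared
HDFS path writable by userB; pointing the staging root at a per-user
location, as Harsh describes above, sidesteps the clash.)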
>>>>>
>>>>> On Mon, Oct 15, 2012 at 3:18 PM, Harsh J wrote:
>>>>>
>>>>> Hi Patai,
>>>>>
>>>>> Reply inline.
>>>>>
>>>>> On Tue, Oct 16, 2012 at 2:57 AM, Patai Sangbutsarakum wrote:
>>>>>
>>>>> Thanks for the input.
>>>>>
>>>>> I am reading the document; I forgot to mention that I am on cdh3u4.
>>>>>
>>>>> That version should have the support for all of this.
>>>>>
>>>>> If you point your poolname property to mapred.job.queue.name, then
>>>>> you can leverage the per-queue ACLs.
>>>>>
>>>>> Does that mean that if I plan on 3 fair scheduler pools, I have to
>>>>> configure 3 capacity scheduler queues, so that each pool can
>>>>> leverage the per-queue ACL of its queue?
>>>>>
>>>>> Queues are not hard-tied to the CapacityScheduler. You can have
>>>>> generic queues in MR, and the FairScheduler can bind its Pool
>>>>> concept to the Queue configuration.
>>>>>
>>>>> All you need to do is the following (sketched as a single config at
>>>>> the end of this thread):
>>>>>
>>>>> 1. Map the FairScheduler pool name onto the queue names themselves:
>>>>>
>>>>> mapred.fairscheduler.poolnameproperty set to 'mapred.job.queue.name'
>>>>>
>>>>> 2. Define your required queues:
>>>>>
>>>>> mapred.job.queues set to "default,foo,bar", for example, for 3
>>>>> queues: default, foo and bar.
>>>>>
>>>>> 3. Define submit ACLs for each queue:
>>>>>
>>>>> mapred.queue.default.acl-submit-job set to "patai,foobar users,adm"
>>>>> (usernames groupnames)
>>>>>
>>>>> mapred.queue.foo.acl-submit-job set to "spam eggs"
>>>>>
>>>>> Likewise for the remaining queues, as you need them...
>>>>>
>>>>> 4. Enable ACLs and restart the JT:
>>>>>
>>>>> mapred.acls.enabled set to "true"
>>>>>
>>>>> 5. Users then use the right API to set queue names before submitting
>>>>> jobs, or pass -Dmapred.job.queue.name=value via the CLI (if using
>>>>> Tool):
>>>>>
>>>>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobConf.html#setQueueName(java.lang.String)
>>>>>
>>>>> 6. Done.
>>>>>
>>>>> Let us know if this works!
>>>>>
>>>>> --
>>>>> Harsh J
>>>
>>> --
>>> Harsh J
>
> --
> Harsh J
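
(Pulling steps 1-4 together: a minimal, illustrative mapred-site.xml sketch.
One caveat flagged at the top of this thread: the CDH3u4 docs list the
queue-list key as mapred.queue.names rather than mapred.job.queues, so that
spelling is used below; everything else follows the examples above.

  <property>
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>mapred.job.queue.name</value>  <!-- step 1: pool name = queue name -->
  </property>
  <property>
    <name>mapred.queue.names</name>       <!-- step 2: the three queues -->
    <value>default,foo,bar</value>
  </property>
  <property>
    <name>mapred.queue.default.acl-submit-job</name>
    <value>patai,foobar users,adm</value> <!-- step 3: "usernames groupnames" -->
  </property>
  <property>
    <name>mapred.queue.foo.acl-submit-job</name>
    <value>spam eggs</value>
  </property>
  <property>
    <name>mapred.acls.enabled</name>      <!-- step 4: then restart the JT -->
    <value>true</value>
  </property>

A job would then be submitted into a specific pool/queue per step 5 with,
e.g., hadoop jar job.jar MyJob -Dmapred.job.queue.name=foo ... where MyJob
is a hypothetical Tool-based job class.)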