Date: Thu, 18 Oct 2012 10:16:05 +0000 (UTC)
From: "Luke Lu (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <103827783.63324.1350555365505.JavaMail.jiratomcat@arcas>
In-Reply-To: <532273529.7757.1341483335533.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (MAPREDUCE-4398) Fix mapred.system.dir permission error with FairScheduler

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478887#comment-13478887 ]

Luke Lu commented on MAPREDUCE-4398:
------------------------------------

The "magic" number of 4 is the default number of job init threads (mapred.jobinit.threads).
You have to submit 4 (or, more precisely, mapred.jobinit.threads) or more jobs as the jobtracker user at the same time to make sure the job init threads are initialized as the system user, so that they can access mapred.system.dir (which, for security reasons, must have permission 700). Otherwise, some of the job init threads will be initialized as whichever user first submits a job. This can lead to seemingly bizarre behavior: sometimes it works (the job is initialized by one of the system threads) and sometimes it doesn't (the job is initialized by one of the user threads). Once you know the root cause, it's pretty trivial to come up with a patch.

The default fifo scheduler and capacity scheduler do not have this bug.

> Fix mapred.system.dir permission error with FairScheduler
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4398
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4398
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/fair-share
>    Affects Versions: 1.0.3
>            Reporter: Luke Lu
>            Assignee: Yu Gao
>
> Incorrect job initialization logic in FairScheduler causes mysterious intermittent mapred.system.dir permission errors.
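
For readers who want to see the mechanism from the comment in isolation, here is a minimal Java sketch. It is not FairScheduler or JobTracker code. It assumes only that a pool thread keeps the identity of whoever caused it to be created, and it models that identity with a plain ThreadLocal rather than Hadoop's real security classes. The class name JobInitIdentityDemo, the method initJobs, and the user names "alice" and "mapred" are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class JobInitIdentityDemo {
    // Hypothetical stand-in for "the user this thread runs as".
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Submits one job per entry in submitters and returns, for each job,
    // the identity of the pool thread that ended up initializing it.
    static List<String> initJobs(int poolSize, List<String> submitters) throws Exception {
        // A fixed pool creates its threads lazily, inside the submitting thread,
        // so each worker adopts the identity of the submitter that triggered it.
        ThreadFactory factory = task -> {
            String creator = CURRENT_USER.get(); // evaluated in the submitting thread
            return new Thread(() -> {
                CURRENT_USER.set(creator);       // worker keeps this identity for life
                task.run();
            });
        };
        ExecutorService pool = Executors.newFixedThreadPool(poolSize, factory);
        try {
            Callable<String> job = CURRENT_USER::get; // reports who the worker runs as
            List<Future<String>> pending = new ArrayList<>();
            for (String user : submitters) {
                CURRENT_USER.set(user);          // the caller acts as this user
                pending.add(pool.submit(job));
            }
            List<String> owners = new ArrayList<>();
            for (Future<String> f : pending) {
                owners.add(f.get());
            }
            return owners;
        } finally {
            CURRENT_USER.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // A regular user submits first: one of the four workers is theirs.
        System.out.println(initJobs(4, List.of("alice", "mapred", "mapred", "mapred")));
        // The system user fills the pool first: later jobs always land on system workers.
        System.out.println(initJobs(4, List.of("mapred", "mapred", "mapred", "mapred", "alice")));
    }
}
```

In this sketch the first call prints [alice, mapred, mapred, mapred] and the second prints [mapred, mapred, mapred, mapred, mapred]. Once the pool is full, a queued job is picked up by whichever worker is free, which is the toy equivalent of the intermittent behavior described above.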