From: Harsh J
Date: Sun, 26 Aug 2012 23:51:59 +0530
Subject: Re: Is there a way to turn off MAPREDUCE-2415?
To: user@hadoop.apache.org

Yes, that is true: it maintains N events in memory and then flushes them
to disk upon closure. With a reasonable size (say, 2 MB of logs) I don't
see that causing any memory fill-up issues at all, since it does cap the
buffer (and discards at the tail).

The other alternative is to lower the log level on the tasks, by setting
mapred.map.child.log.level and/or mapred.reduce.child.log.level to WARN
or ERROR.

On Sun, Aug 26, 2012 at 11:37 PM, Koert Kuipers wrote:
> Looks like mapred.userlog.limit.kb is implemented by keeping a list in
> memory, and the logs are not written to disk until the job finishes or
> is killed. That doesn't sound acceptable to me.
>
> Well, I am not the only one with this problem. See MAPREDUCE-1100.
>
> On Sun, Aug 26, 2012 at 1:58 PM, Harsh J wrote:
>>
>> Hi Koert,
>>
>> On Sun, Aug 26, 2012 at 11:20 PM, Koert Kuipers wrote:
>> > Hey Harsh,
>> > Thanks for responding!
>> > Would limiting the logging for each task via mapred.userlog.limit.kb
>> > be strictly enforced (while the job is running)? That would solve my
>> > issue of runaway logging on a job filling up the datanode disks. I
>> > would set the limit high, since in general I do want to retain logs,
>> > just not in the case where a single rogue job starts producing many
>> > gigabytes of them.
>> > Thanks!
>>
>> It is not strictly enforced the way counter limits are.
>> Exceeding it
>> wouldn't fail the task; it would only cause the extra logged events to
>> not appear at all (thereby limiting the size).
>>
>> > On Sun, Aug 26, 2012 at 1:44 PM, Harsh J wrote:
>> >>
>> >> Hi Koert,
>> >>
>> >> To answer on point: there is no turning off this feature.
>> >>
>> >> Since you don't seem to care much for logs from tasks persisting,
>> >> perhaps consider lowering mapred.userlog.retain.hours to a value
>> >> below its default of 24 hours (such as 1 hour)? Or you may even
>> >> limit the logging from each task to a certain number of KB via
>> >> mapred.userlog.limit.kb, which is unlimited by default.
>> >>
>> >> Would either of these work for you?
>> >>
>> >> On Sun, Aug 26, 2012 at 11:02 PM, Koert Kuipers wrote:
>> >> > We have smaller nodes (4 to 6 disks), and we used to write logs
>> >> > to the same disk the OS is on. So if that disk goes, I don't
>> >> > really care about tasktrackers failing. Also, the fact that logs
>> >> > were written to a single partition meant I could make sure they
>> >> > would not grow too large if someone had too-verbose logging on a
>> >> > large job. With MAPREDUCE-2415, a job that does a massive amount
>> >> > of logging can fill up all the mapred.local.dir directories,
>> >> > which in our case are on the same partition as the HDFS data
>> >> > dirs. So now faulty logging can fill up HDFS storage, which I
>> >> > really don't like. Any ideas?
>> >>
>> >> --
>> >> Harsh J
>>
>> --
>> Harsh J

--
Harsh J
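For readers landing here from the archives: the knobs discussed in this thread all live in mapred-site.xml (or can be passed per job). A sketch of what the suggested settings might look like together; the values below are illustrative examples, not recommendations from the thread:

```xml
<!-- Illustrative mapred-site.xml fragment; values are examples only. -->

<!-- Quiet down task logging (suggested as WARN or ERROR above). -->
<property>
  <name>mapred.map.child.log.level</name>
  <value>WARN</value>
</property>
<property>
  <name>mapred.reduce.child.log.level</name>
  <value>WARN</value>
</property>

<!-- Cap each task's log; example caps at ~10 MB (10240 KB). -->
<property>
  <name>mapred.userlog.limit.kb</name>
  <value>10240</value>
</property>

<!-- Retain task logs for 1 hour instead of the default 24. -->
<property>
  <name>mapred.userlog.retain.hours</name>
  <value>1</value>
</property>
```

The log-level and limit properties are job-level, so they can also be overridden for a single verbose job at submission time with `-D`, e.g. `-Dmapred.map.child.log.level=ERROR`, without touching the cluster-wide defaults.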