hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Is there a way to turn off MAPREDUCE-2415?
Date Sun, 26 Aug 2012 19:03:23 GMT
Hi,

On Mon, Aug 27, 2012 at 12:09 AM, Koert Kuipers <koert@tresata.com> wrote:
> Harsh,
>
> I see the problem as follows: Usually we want to have people log what they
> want, as long as they don't threaten the stability of the system.
>
> However every once in a while somebody will submit a job that is overly
> verbose and will generate many gigabytes of logs in minutes. This is
> typically an honest mistake, and the person doesn't realize what is going on
> (why is my job so slow?). Limiting the general logging levels for everyone
> to deal with these mistakes seems ineffective. Telling the person to change
> the logging level for his job will not work either since he/she doesn't
> realize what is going on and certainly didn't know in advance.

I had meant to say you could enforce the logging level on the child
tasks via finalized job options, but yeah, that'd be way too
restrictive.
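
For reference, enforcing that cluster-wide would look roughly like the
following in mapred-site.xml (WARN is only an example value; marking
the properties final is what blocks per-job overrides):

  <!-- Force the task child JVM log level for every job; final="true"
       prevents individual jobs from overriding it. -->
  <property>
    <name>mapred.map.child.log.level</name>
    <value>WARN</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.reduce.child.log.level</name>
    <value>WARN</value>
    <final>true</final>
  </property>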

> So all i really want is a very high and hard limit on the log size per job,
> to protect the system. Say many hundreds of megabytes or even gigabytes. But
> when this limit is reached i want the logging to stop from that point on, or
> even the job to be killed. mapred.userlog.limit.kb seems the wrong tool for
> the job.

Hundreds of MB of logs seems too much for a single task to emit. I
believe a good limit is < 10 MB. But yeah, makes sense that one could
want more for different forms of jobs and purposes. For such a
requirement, I agree the limit.kb isn't the right solution. Perhaps
just the retain hours value then.
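
If you go that route, a minimal mapred-site.xml sketch would be along
these lines (the 1-hour value is purely an example):

  <!-- Keep task userlogs for 1 hour after job completion instead of
       the default 24 hours. -->
  <property>
    <name>mapred.userlog.retain.hours</name>
    <value>1</value>
  </property>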

> Before the logging got moved to the mapred.local.dir i had a limit simply by
> limiting the size of the partition that logging went to.
>
> Anyhow, looks like i will have to wait for MAPREDUCE-1100

I agree.

> Have a good day! Koert
>
> On Sun, Aug 26, 2012 at 2:21 PM, Harsh J <harsh@cloudera.com> wrote:
>>
>> Yes that is true, it does maintain N events in memory and then flushes
>> them down to disk upon closure. With a reasonable size (2 MB of logs
>> say) I don't see that causing any memory fill-up issues at all, since
>> it does cap (and discard at tail).
>>
>> The other alternative may be to switch down the log level on the task,
>> via mapred.map.child.log.level and/or mapred.reduce.child.log.level
>> set to WARN or ERROR.
>>
>> On Sun, Aug 26, 2012 at 11:37 PM, Koert Kuipers <koert@tresata.com> wrote:
>> > Looks like mapred.userlog.limit.kb is implemented by keeping some list
>> > in memory, and the logs are not written to disk until the job finishes
>> > or is killed. That doesn't sound acceptable to me.
>> >
>> > Well i am not the only one with this problem. See MAPREDUCE-1100
>> >
>> >
>> > On Sun, Aug 26, 2012 at 1:58 PM, Harsh J <harsh@cloudera.com> wrote:
>> >>
>> >> Hi Koert,
>> >>
>> >> On Sun, Aug 26, 2012 at 11:20 PM, Koert Kuipers <koert@tresata.com>
>> >> wrote:
>> >> > Hey Harsh,
>> >> > Thanks for responding!
>> >> > Would limiting the logging for each task via mapred.userlog.limit.kb
>> >> > be strictly enforced (while the job is running)? That would solve my
>> >> > issue of runaway logging on a job filling up the datanode disks. I
>> >> > would set the limit high since in general i do want to retain logs,
>> >> > just not in case a single rogue job starts producing many gigabytes
>> >> > of logs.
>> >> > Thanks!
>> >>
>> >> It is not strictly enforced the way counter limits are. Exceeding it
>> >> wouldn't fail the task, only cause the extra logged events to not
>> >> appear at all (thereby limiting the size).
>> >>
>> >> > On Sun, Aug 26, 2012 at 1:44 PM, Harsh J <harsh@cloudera.com> wrote:
>> >> >>
>> >> >> Hi Koert,
>> >> >>
>> >> >> To answer on point, there is no turning off this feature.
>> >> >>
>> >> >> Since you don't seem to care much for logs from tasks persisting,
>> >> >> perhaps consider lowering the mapred.userlog.retain.hours to a lower
>> >> >> value than 24 hours (such as 1h)? Or you may even limit the logging
>> >> >> from each task to a certain amount of KB via mapred.userlog.limit.kb,
>> >> >> which is unlimited by default.
>> >> >>
>> >> >> Would either of these work for you?
>> >> >>
>> >> >> On Sun, Aug 26, 2012 at 11:02 PM, Koert Kuipers <koert@tresata.com>
>> >> >> wrote:
>> >> >> > We have smaller nodes (4 to 6 disks), and we used to write logs
>> >> >> > to the same disk as where the OS is. So if that disk goes then i
>> >> >> > don't really care about tasktrackers failing. Also, the fact that
>> >> >> > logs were written to a single partition meant that i could make
>> >> >> > sure they would not grow too large in case someone had too verbose
>> >> >> > logging on a large job. With MAPREDUCE-2415 a job that does a
>> >> >> > massive amount of logging can fill up all the mapred.local.dir
>> >> >> > directories, which in our case are on the same partition as the
>> >> >> > hdfs data dirs, so now faulty logging can fill up hdfs storage,
>> >> >> > which i really don't like. Any ideas?
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Harsh J
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J
