Thanks a lot, Alexander!

What is mapreduce.jobtracker.retiredjobs.cache.size for?
Is the cron approach safe for Hadoop? Is that the only way at the moment?
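
To be concrete, the cron approach I have in mind would look roughly like the sketch below. The function name clean_jobcache, the 7-day threshold and the dry-run default are my own assumptions for illustration, not something Hadoop provides. It also assumes that no job runs longer than the threshold, since otherwise a live job's directory could be removed.

```shell
#!/bin/sh
# Hypothetical sketch: list (or delete) per-job directories under a jobcache
# directory that have not been modified for more than MAX_AGE_DAYS days.
# Assumption: no job runs longer than MAX_AGE_DAYS, so old directories are stale.
# Requires a find that supports -mindepth/-maxdepth (GNU or BSD find).

# Usage: clean_jobcache DIR [MAX_AGE_DAYS] [--delete]
clean_jobcache() {
    dir=$1
    max_age_days=${2:-7}
    mode=${3:-}

    # Nothing to do if the directory does not exist on this node.
    [ -d "$dir" ] || return 0

    if [ "$mode" = "--delete" ]; then
        # Remove only the top-level per-job directories that are old enough.
        find "$dir" -mindepth 1 -maxdepth 1 -type d \
            -mtime +"$max_age_days" -exec rm -rf {} +
    else
        # Dry run (default): only print what would be removed.
        find "$dir" -mindepth 1 -maxdepth 1 -type d \
            -mtime +"$max_age_days" -print
    fi
}

# Example call for the paths on our cluster (dry run):
#   for d in /data?/mapred/local/taskTracker/user/jobcache; do
#       clean_jobcache "$d" 7
#   done
```

The dry run is the default so that the list of candidate directories can be reviewed before anything is deleted by a cron entry.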


On Wed, Jan 9, 2013 at 6:50 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
Hi,

By default (and this is not configurable), the logs are kept for 30 days. This will become configurable in the future (https://issues.apache.org/jira/browse/MAPREDUCE-4643).

- Alex

On Jan 9, 2013, at 3:41 PM, Ivan Tretyakov <itretyakov@griddynamics.com> wrote:

> Hello!
>
> I've found that the jobcache directory has become very large on our cluster, e.g.:
>
> # du -sh /data?/mapred/local/taskTracker/user/jobcache
> 465G    /data1/mapred/local/taskTracker/user/jobcache
> 464G    /data2/mapred/local/taskTracker/user/jobcache
> 454G    /data3/mapred/local/taskTracker/user/jobcache
>
> And it stores information for about 100 jobs:
>
> # ls -1 /data?/mapred/local/taskTracker/persona/jobcache/ | sort | uniq | wc -l
>
> I've found the following parameter:
>
> <property>
>  <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>  <value>1000</value>
>  <description>The number of retired job status to keep in the cache.
>  </description>
> </property>
>
> So, if I understand it correctly, it is intended to control the job cache
> size by limiting the number of jobs whose cache is kept.
>
> Also, I've seen that some Hadoop users use a cron job to clean up the
> jobcache:
> http://grokbase.com/t/hadoop/common-user/102ax9bze1/cleaning-jobcache-manually
> (http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201002.mbox/%3C99484d561002100143s4404df98qead8f2cf687a76d0@mail.gmail.com%3E)
>
> Are there other approaches to control jobcache size?
> What is the more correct way to do it?
>
> Thanks in advance!
>
> P.S. We are using CDH 4.1.1.
>
> --
> Best Regards
> Ivan Tretyakov
>
> Deployment Engineer
> Grid Dynamics
> +7 812 640 38 76
> Skype: ivan.tretyakov
> www.griddynamics.com
> itretyakov@griddynamics.com

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF



