hadoop-hdfs-user mailing list archives

From "Artem Ervits" <are9...@nyp.org>
Subject Re: JobCache directory cleanup
Date Thu, 10 Jan 2013 00:13:50 GMT
Just ran into a similar problem. If you compress intermediate data, it will keep the jobcache folder
manageable.


Artem Ervits
Data Analyst
New York Presbyterian Hospital

From: Ivan Tretyakov [mailto:itretyakov@griddynamics.com]
Sent: Wednesday, January 09, 2013 10:22 AM
To: user@hadoop.apache.org <user@hadoop.apache.org>
Subject: Re: JobCache directory cleanup

Thanks a lot Alexander!

What is mapreduce.jobtracker.retiredjobs.cache.size for?
Is the cron approach safe for Hadoop? Is that the only way at the moment?
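
For reference, the cron approach asked about here could be sketched roughly as follows. This is only an illustration, not something posted in the thread. It assumes that job directories under jobcache have names starting with `job_` and that a directory whose modification time is older than a chosen threshold is no longer needed. It does not check whether a job is still running, and a directory's modification time only changes when entries directly inside it are added or removed, so a long-running job could look stale. The function names, the 7-day default, and the dry-run flag are all hypothetical choices.

```python
import os
import shutil
import time


def find_stale_job_dirs(jobcache_dir, max_age_days, now=None):
    """Return job_* directories in jobcache_dir older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for name in sorted(os.listdir(jobcache_dir)):
        path = os.path.join(jobcache_dir, name)
        # Assumption: per-job directories are named job_<...>; skip the rest.
        if not (name.startswith("job_") and os.path.isdir(path)):
            continue
        # Assumption: an old mtime means the job is finished. This is a
        # heuristic only; it does not ask the JobTracker about job state.
        if os.path.getmtime(path) < cutoff:
            stale.append(path)
    return stale


def clean_jobcache(jobcache_dir, max_age_days=7, dry_run=True, now=None):
    """List stale job directories and delete them unless dry_run is set."""
    stale = find_stale_job_dirs(jobcache_dir, max_age_days, now)
    if not dry_run:
        for path in stale:
            shutil.rmtree(path)
    return stale
```

With the default `dry_run=True`, a call such as `clean_jobcache("/data1/mapred/local/taskTracker/user/jobcache")` only returns the candidate directories, so the list can be reviewed before anything is removed by a scheduled run.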


On Wed, Jan 9, 2013 at 6:50 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:
Hi,

By default (and this is not configurable), the logs are kept for 30 days. This will be configurable
in the future (https://issues.apache.org/jira/browse/MAPREDUCE-4643).

- Alex

On Jan 9, 2013, at 3:41 PM, Ivan Tretyakov <itretyakov@griddynamics.com> wrote:

> Hello!
>
> I've found that the jobcache directory has become very large on our cluster, e.g.:
>
> # du -sh /data?/mapred/local/taskTracker/user/jobcache
> 465G    /data1/mapred/local/taskTracker/user/jobcache
> 464G    /data2/mapred/local/taskTracker/user/jobcache
> 454G    /data3/mapred/local/taskTracker/user/jobcache
>
> And it stores information for about 100 jobs:
>
> # ls -1 /data?/mapred/local/taskTracker/persona/jobcache/ | sort | uniq | wc -l
>
> I've found the following parameter:
>
> <property>
>  <name>mapreduce.jobtracker.retiredjobs.cache.size</name>
>  <value>1000</value>
>  <description>The number of retired job status to keep in the cache.
>  </description>
> </property>
>
> So, if I understand it correctly, it is intended to control the job cache size
> by limiting the number of jobs to keep cache for.
>
> Also, I've seen that some Hadoop users use a cron job to clean up the
> jobcache:
> http://grokbase.com/t/hadoop/common-user/102ax9bze1/cleaning-jobcache-manually
> (
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201002.mbox/%3C99484d561002100143s4404df98qead8f2cf687a76d0@mail.gmail.com%3E
> )
>
> Are there other approaches to controlling the jobcache size?
> Which is the more correct way to do it?
>
> Thanks in advance!
>
> P.S. We are using CDH 4.1.1.
>
> --
> Best Regards
> Ivan Tretyakov
>
> Deployment Engineer
> Grid Dynamics
> +7 812 640 38 76
> Skype: ivan.tretyakov
> www.griddynamics.com
> itretyakov@griddynamics.com

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF




--
Best Regards
Ivan Tretyakov

Deployment Engineer
Grid Dynamics
+7 812 640 38 76
Skype: ivan.tretyakov
www.griddynamics.com
itretyakov@griddynamics.com


--------------------

This electronic message is intended to be for the use only of the named recipient, and may
contain information that is confidential or privileged.  If you are not the intended recipient,
you are hereby notified that any disclosure, copying, distribution or use of the contents
of this message is strictly prohibited.  If you have received this message in error or are
not the named recipient, please notify us immediately by contacting the sender at the electronic
mail address noted above, and delete and destroy all copies of this message.  Thank you.







