hadoop-mapreduce-user mailing list archives

From Ivan Tretyakov <itretya...@griddynamics.com>
Subject Re: JobCache directory cleanup
Date Thu, 17 Jan 2013 09:52:41 GMT
Thanks a lot!

That was it. There was the following line in our code:
  jobConf.setKeepTaskFilesPattern(".*");
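
For anyone checking many jobs for the same problem, here is a minimal sketch that reads a job.xml and reports whether keep.task.files.pattern would retain the files of a given task. It uses only the JDK, not the Hadoop API. The class name KeepFilesCheck, its methods, and the sample task attempt ID are hypothetical, and it assumes the pattern is matched against the full task attempt ID.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: inspects a job.xml for the
// two "keep files" settings discussed in this thread.
public class KeepFilesCheck {

    // Parse <property><name>..</name><value>..</value></property> entries
    // from a Hadoop-style configuration XML stream into a map.
    static Map<String, String> readProperties(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> out = new HashMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = childText(p, "name");
            if (name != null) {
                out.put(name, childText(p, "value"));
            }
        }
        return out;
    }

    // Return the trimmed text of the first child element with the given tag,
    // or null if there is none.
    private static String childText(Element parent, String tag) {
        NodeList nl = parent.getElementsByTagName(tag);
        return nl.getLength() == 0 ? null : nl.item(0).getTextContent().trim();
    }

    // True if keep.task.files.pattern is set and matches the whole ID.
    // Assumption: the pattern is applied as a full match on the attempt ID.
    static boolean patternKeeps(Map<String, String> conf, String taskAttemptId) {
        String pattern = conf.get("keep.task.files.pattern");
        return pattern != null && Pattern.matches(pattern, taskAttemptId);
    }

    // True if keep.failed.task.files is set to "true".
    static boolean keepsFailedTaskFiles(Map<String, String> conf) {
        return "true".equalsIgnoreCase(conf.get("keep.failed.task.files"));
    }

    public static void main(String[] args) throws Exception {
        // Inline stand-in for a real job.xml, mirroring the values in this thread.
        String xml = "<configuration>"
                + "<property><name>keep.failed.task.files</name><value>false</value></property>"
                + "<property><name>keep.task.files.pattern</name><value>.*</value></property>"
                + "</configuration>";
        Map<String, String> conf = readProperties(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        String attempt = "attempt_201301090000_0001_m_000000_0"; // made-up ID
        System.out.println("keep.failed.task.files is true: " + keepsFailedTaskFiles(conf));
        System.out.println("pattern keeps " + attempt + ": " + patternKeeps(conf, attempt));
    }
}
```

Since ".*" matches any ID, a job configured this way would keep the files of every task, which is consistent with the jobcache growth described below.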


On Fri, Jan 11, 2013 at 2:20 PM, Hemanth Yamijala <yhemanth@thoughtworks.com> wrote:

> Hmm. Unfortunately, there is another config variable that may be affecting
> this: keep.task.files.pattern
>
> This is set to .* in the job.xml file you sent. I suspect this may be
> causing a problem. Can you please remove this, assuming you have not set it
> intentionally?
>
> Thanks
> Hemanth
>
>
>
> On Fri, Jan 11, 2013 at 3:28 PM, Ivan Tretyakov <
> itretyakov@griddynamics.com> wrote:
>
>> Thanks for the replies!
>>
>> keep.failed.task.files is set to false.
>> The config of one of the jobs is attached.
>>
>>
>> On Fri, Jan 11, 2013 at 5:44 AM, Hemanth Yamijala <
>> yhemanth@thoughtworks.com> wrote:
>>
>>> Good point. Forgot that one :-)
>>>
>>>
>>> On Thu, Jan 10, 2013 at 10:53 PM, Vinod Kumar Vavilapalli <
>>> vinodkv@hortonworks.com> wrote:
>>>
>>>>
>>>>
>>>> Can you check the job configuration for these ~100 jobs? Do they have
>>>> keep.failed.task.files set to true? If so, these files won't be deleted.
>>>> If they don't, it could be a bug.
>>>>
>>>> Sharing your configs for these jobs will definitely help.
>>>>
>>>> Thanks,
>>>> +Vinod
>>>>
>>>>
>>>> On Wed, Jan 9, 2013 at 6:41 AM, Ivan Tretyakov <
>>>> itretyakov@griddynamics.com> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> I've found that the jobcache directory has become very large on our
>>>>> cluster, e.g.:
>>>>>
>>>>> # du -sh /data?/mapred/local/taskTracker/user/jobcache
>>>>> 465G    /data1/mapred/local/taskTracker/user/jobcache
>>>>> 464G    /data2/mapred/local/taskTracker/user/jobcache
>>>>> 454G    /data3/mapred/local/taskTracker/user/jobcache
>>>>>
>>>>> And it stores information for about 100 jobs:
>>>>>
>>>>> # ls -1 /data?/mapred/local/taskTracker/persona/jobcache/ | sort | uniq | wc -l
>>>>>
>>>>
>>>
>>
>>
>> --
>> Best Regards
>> Ivan Tretyakov
>>
>> Deployment Engineer
>> Grid Dynamics
>> +7 812 640 38 76
>> Skype: ivan.tretyakov
>> www.griddynamics.com
>> itretyakov@griddynamics.com
>>
>
>


-- 
Best Regards
Ivan Tretyakov

Deployment Engineer
Grid Dynamics
+7 812 640 38 76
Skype: ivan.tretyakov
www.griddynamics.com
itretyakov@griddynamics.com
