From: Koji Noguchi
To: user@hadoop.apache.org
Date: Wed, 27 Mar 2013 06:21:49 -0700
Subject: Re: Auto clean DistCache?

> Else, I will go for a custom script to delete all directories (and content) older than 2 or 3 days…

The TaskTracker (or NodeManager in 2.*) keeps the list of dist cache entries in memory. So if an external process (like your script) starts deleting dist cache files, there will be an inconsistency and you'll start seeing task initialization failures due to file-not-found errors.

Koji

On Mar 26, 2013, at 9:00 PM, Jean-Marc Spaggiari wrote:

> For the situation I faced, it was really a disk space issue, not related
> to the number of files. It was writing on a small partition.
>
> I will try with local.cache.size or
> mapreduce.tasktracker.cache.local.size to see if I can keep the final
> total size under 5GB... Else, I will go for a custom script to
> delete all directories (and content) older than 2 or 3 days...
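[Editor's note: JM's plan above, capping the cache size through configuration rather than deleting files externally, could be sketched as a mapred-site.xml fragment like the following. The 5 GB value is illustrative; the actual limit should be sized to the partition holding mapred.local.dir.]

```xml
<!-- mapred-site.xml: cap the combined size of the distributed cache
     kept on each TaskTracker. The value is in bytes (5 GB here).
     The property is named local.cache.size in Hadoop 1.x; in later
     releases it is mapreduce.tasktracker.cache.local.size. -->
<property>
  <name>local.cache.size</name>
  <value>5368709120</value>
</property>
```

With this in place the TaskTracker itself evicts old cache entries once the limit is exceeded, so its in-memory bookkeeping stays consistent, avoiding the file-not-found task failures Koji describes for external cleanup scripts.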
> Thanks,
>
> JM
>
> 2013/3/26 Abdelrahman Shettia :
>> Let me clarify: if there are lots of files or directories, up to 32K
>> (depending on the OS limits configured for the user), in those
>> distributed cache dirs, the OS will not be able to create any more
>> files/dirs, and M-R jobs won't get initiated on those tasktracker
>> machines. Hope this helps.
>>
>> Thanks
>>
>> On Tue, Mar 26, 2013 at 1:44 PM, Vinod Kumar Vavilapalli wrote:
>>>
>>> All the files are not opened at the same time ever, so you shouldn't see
>>> any "# of open files exceeds" error.
>>>
>>> Thanks,
>>> +Vinod Kumar Vavilapalli
>>> Hortonworks Inc.
>>> http://hortonworks.com/
>>>
>>> On Mar 26, 2013, at 12:53 PM, Abdelrhman Shettia wrote:
>>>
>>> Hi JM,
>>>
>>> Actually these dirs need to be purged by a script that keeps the last 2
>>> days' worth of files; otherwise you may run into a "# of open files
>>> exceeds" error.
>>>
>>> Thanks
>>>
>>> On Mar 25, 2013, at 5:16 PM, Jean-Marc Spaggiari wrote:
>>>
>>> Hi,
>>>
>>> Each time my MR job is run, a directory is created on the TaskTracker
>>> under mapred/local/taskTracker/hadoop/distcache (based on my
>>> configuration).
>>>
>>> I looked at the directory today, and it's hosting thousands of
>>> directories and more than 8GB of data there.
>>>
>>> Is there a way to automatically delete this directory when the job is
>>> done?
>>>
>>> Thanks,
>>>
>>> JM