hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: CleanerChore exception
Date Sun, 30 Dec 2012 19:50:32 GMT
The Javadoc says:

"@return <tt>true</tt> if the directory was deleted, <tt>false</tt>
otherwise"

So I think the line "return canDeleteThis ? fs.delete(toCheck, false)
: false;" is still correct. It's returning false if the directory has
not been deleted.

There is no exception here. If the TTL for a file has not expired, the
file can't be deleted and false is returned. I think that's correct
behaviour.

The idea of not passing "true" for recursion is explained in the
comments:

    // if all the children have been deleted, then we should try to
    // delete this directory. However, don't do so recursively so we
    // don't delete files that have been added since we checked.

And I think that's right. So the issue is really when the directory is
empty and listStatus returns null. Then "if (children == null) return
true;" simply returns true without deleting the current directory.

This should be changed to something like:

    if (children == null) return fs.delete(toCheck, false);

which will try to delete the current directory, return true or false
depending on whether it succeeded, and throw an exception if there is
any issue with the FS...
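
To make it concrete, here is roughly how checkAndDeleteDirectory would
read with that fix. This is only a sketch reconstructed from the
snippets quoted in this thread, not the exact 0.94 source, so helper
names like checkAndDelete may differ slightly:

  private boolean checkAndDeleteDirectory(Path toCheck) throws IOException {
    // FSUtils.listStatus returns null for an empty (or missing)
    // directory, not an empty array
    FileStatus[] children = FSUtils.listStatus(fs, toCheck, null);

    // proposed fix: instead of "return true;", actually try to remove
    // the empty directory (non-recursively)
    if (children == null) return fs.delete(toCheck, false);

    boolean canDeleteThis = true;
    for (FileStatus child : children) {
      if (child.isDir()) {
        // recurse into subdirectories first
        if (!checkAndDeleteDirectory(child.getPath())) canDeleteThis = false;
      } else if (!checkAndDelete(child.getPath())) {
        // a file whose TTL has not expired blocks deletion of its parent
        canDeleteThis = false;
      }
    }

    // if all the children have been deleted, then we should try to
    // delete this directory. However, don't do so recursively so we
    // don't delete files that have been added since we checked.
    return canDeleteThis ? fs.delete(toCheck, false) : false;
  }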

I have made some modifications. I'm compiling and will deploy the
updated version on my local cluster soon. I will keep you posted on
the results.

JM

2012/12/30, Jean-Marc Spaggiari <jean-marc@spaggiari.org>:
> Thanks for the confirmation.
>
> Also, it seems there is no test class covering checkAndDeleteDirectory.
> It might be good to add one too.
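> Something like this could do it (just a sketch, assuming the usual
> HBaseTestingUtility mini-cluster setup and that the method's
> visibility is relaxed for testing; the names are illustrative):
>
>   @Test
>   public void testEmptyDirectoryIsDeleted() throws Exception {
>     FileSystem fs = UTIL.getTestFileSystem();
>     Path empty = new Path(UTIL.getDataTestDir(), "emptyDir");
>     fs.mkdirs(empty);
>     // an empty directory should be removed, not silently skipped
>     assertTrue(cleaner.checkAndDeleteDirectory(empty));
>     assertFalse(fs.exists(empty));
>   }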
>
> I have extracted 0.94.3, 0.94.4RC0, and trunk, and they are all
> identical for this method.
>
> I will try to do some modifications and see the results...
>
> So far there are two options. One is to change the handling of the
> null children list so the current empty directory gets deleted, and
> the other is to call fs.delete() directly from checkAndDeleteDirectory
> instead of the existing code.
>
> Will wait for Jesse's feedback.
>
> JM
>
> 2012/12/30, Ted Yu <yuzhihong@gmail.com>:
>> Thanks for the digging. This concurs with my initial suspicion.
>>
>> I am copying Jesse who wrote the code. He should have more insight on
>> this.
>>
>> After his confirmation, you can log a JIRA.
>>
>> Cheers
>>
>> On Sun, Dec 30, 2012 at 10:59 AM, Jean-Marc Spaggiari <
>> jean-marc@spaggiari.org> wrote:
>>
>>> So, looking deeper, I found a few things.
>>>
>>> First, why is checkAndDeleteDirectory not "simply" calling
>>> FSUtils.delete(fs, toCheck, true)? I guess it's doing the same thing?
>>>
>>> Also, FSUtils.listStatus(fs, toCheck, null) will return null if there
>>> is no status, not just an empty array. And when it returns null, we
>>> exit without calling the delete method.
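>>>
>>> In other words, the current code does something like this (sketching
>>> from memory, worth double-checking against the real source):
>>>
>>>   FileStatus[] children = FSUtils.listStatus(fs, toCheck, null);
>>>   // null covers both "directory is empty" and "directory is gone",
>>>   // unlike FileSystem.listStatus, which returns an empty array for
>>>   // an empty directory
>>>   if (children == null) return true;  // empty dir never gets deleted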
>>>
>>> I tried to manually create a file in one of those directories. The
>>> exception disappears for 300 seconds because of the TTL on the newly
>>> created file. After 300 seconds, the file I pushed AND the directory
>>> got removed. So the issue is really with empty directories.
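>>>
>>> The 300 seconds matches what I believe is the hfile cleaner's
>>> default TTL of 300000 ms; if I'm reading the cleaner code correctly,
>>> it can be tuned with something like:
>>>
>>>   <property>
>>>     <name>hbase.master.hfilecleaner.ttl</name>
>>>     <value>300000</value> <!-- milliseconds -->
>>>   </property>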
>>>
>>> I will take a look at what is in trunk and in 0.94.4 to see if it's
>>> the same issue. But I think we could simply replace all this code
>>> with a call to FSUtils.delete.
>>>
>>> I can open a JIRA and submit a patch for that. Just let me know.
>>>
>>> JM
>>>
>>> 2012/12/30, Jean-Marc Spaggiari <jean-marc@spaggiari.org>:
>>> > Regarding the logcleaner settings, I have not changed anything. It's
>>> > what came with the initial install, so I don't have anything set up
>>> > for this plugin in my configuration files.
>>> >
>>> > For the files on the FS, here is what I have:
>>> > hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls
>>> > /hbase/.archive/entry_duplicate
>>> > Found 30 items
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/00c185bc44b6dcf85a90b83bdda4ec2e
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/0ddf0d1802c6afd97d032fd09ea9e37d
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/18cf7c5c946ddf33e49b227feedfb688
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/2353f10e79dacc5cf201be6a1eb63607
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/243f4007cf05415062010a5650598bff
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/287682333698e36cea1670f5479fbf18
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/3742da9bd798342e638e1ce341f27537
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/435c9c08bc08ed7248a013b6ffaa163b
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/45346b4b4248d77d45e031ea71a1fb63
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/4afe48fe6d8defe569f8632dd2514b07
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/68a4e364fe791a0d1f47febbb41e8112
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/7673d718962535c7b54cef51830f22a5
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/7df6845ae9d052f4eae4a01e39313d61
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/8c5a263167d1b09f645af8efb4545554
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/8c98d9c635ba30d467d127a2ec1c69f8
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/8dfa96393e18ecca826fd9200e6bf68b
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/8e8f532e91a7197cd53b7626130be698
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/8eca1a325fe442a8546e43ac2f00cfef
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/9ad4c0551b90ea7717d7e3aaec76dc26
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/a135ccbc6f61ce544dbd537dc12489e9
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/a3d0332a6d51a8b15b99d1caca3f355a
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/bd2b8c942af27e541e20e430d506d2c0
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/c10c3a66948bde75fc41349108d86cf9
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/cbf2f178691bfca8a7e9825115629b8e
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/d14a2546eaceede73b282e444ad1bb40
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:38
>>> > /hbase/.archive/entry_duplicate/d570a21a39e04ba2ec896bbe7166423c
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/e943bda56acd6beb35bdd56f0560f87f
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/ef5692ba83aba48d9e7a6b9c2cd0661e
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/fd85dd319c289959a790faed32ef1530
>>> > drwxr-xr-x   - hbase supergroup          0 2012-12-10 14:39
>>> > /hbase/.archive/entry_duplicate/ffcdf6554accda1800e74838b67d3004
>>> > hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -ls
>>> > /hbase/.archive/entry_duplicate/00c185bc44b6dcf85a90b83bdda4ec2e
>>> > hadoop@node3:~/hadoop-1.0.3$
>>> >
>>> > I have not looked into ALL the subdirectories, but the first 10 are
>>> > empty.
>>> >
>>> > I see that there are some trace statements in
>>> > checkAndDeleteDirectory... I will try to activate them and see if
>>> > there are more details.
>>> >
>>> >
>>> > JM
>>> >
>>> > 2012/12/30, Ted Yu <yuzhihong@gmail.com>:
>>> >> The exception came from this line:
>>> >>           if (file.isDir()) checkAndDeleteDirectory(file.getPath());
>>> >> Looking at checkAndDeleteDirectory(), it recursively deletes files
>>> >> and
>>> >> directories under the specified path.
>>> >>
>>> >> Does /hbase/.archive/entry_duplicate only contain empty directories
>>> >> underneath it?
>>> >>
>>> >> You didn't modify the logcleaner plugin setting, right?
>>> >>   <property>
>>> >>     <name>hbase.master.logcleaner.plugins</name>
>>> >>     <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner</value>
>>> >>   </property>
>>> >>
>>> >> Cheers
>>> >>
>>> >> On Sun, Dec 30, 2012 at 9:53 AM, Jean-Marc Spaggiari <
>>> >> jean-marc@spaggiari.org> wrote:
>>> >>
>>> >>> I was going to move to 0.94.4 today ;) And yes, I'm using 0.94.3.
>>> >>> I might wait a bit in case some testing is required with my
>>> >>> version.
>>> >>>
>>> >>> Is this what you are looking for? http://pastebin.com/N8Q0FMba
>>> >>>
>>> >>> I will keep the files for now since it seems it's not causing any
>>> >>> major issue. That will allow some more testing if required.
>>> >>>
>>> >>> JM
>>> >>>
>>> >>>
>>> >>> 2012/12/30, Ted Yu <yuzhihong@gmail.com>:
>>> >>> > Looks like you're using 0.94.3
>>> >>> >
>>> >>> > The archiver is backport of:
>>> >>> > HBASE-5547, Don't delete HFiles in backup mode
>>> >>> >
>>> >>> > Can you provide more of the log where the IOE was reported,
>>> >>> > using pastebin?
>>> >>> >
>>> >>> > Thanks
>>> >>> >
>>> >>> > On Sun, Dec 30, 2012 at 9:08 AM, Jean-Marc Spaggiari <
>>> >>> > jean-marc@spaggiari.org> wrote:
>>> >>> >
>>> >>> >> Hi,
>>> >>> >>
>>> >>> >> I have an "IOException: /hbase/.archive/table_name is non
>>> >>> >> empty" exception every minute in my logs.
>>> >>> >>
>>> >>> >> There are 30 directories under this directory. The main
>>> >>> >> directory is from yesterday, but all the subdirectories are
>>> >>> >> from December 10th, all at the same time.
>>> >>> >>
>>> >>> >> What is this .archive directory used for, and what should I do?
>>> >>> >>
>>> >>> >> Thanks,
>>> >>> >>
>>> >>> >> JM
>>> >>> >>
>>> >>> >
>>> >>>
>>> >>
>>> >
>>>
>>
>
