kudu-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Number of data files and opened file descriptors are not decreasing after DROP TABLE.
Date Thu, 27 Apr 2017 18:28:47 GMT
On Mon, Apr 24, 2017 at 8:06 PM, Jason Heo <jason.heo.sde@gmail.com> wrote:

> Thanks David
>
> Hi Mike. I'm using Kudu 1.3.0 bundled in "Cloudera Express 5.10.0 (#85
> built by jenkins on 20170120-1037 git: aa0b5cd5eceaefe2f971c13ab65702
> 0d96bb842a)"
>
> My concern is that something is not being freed cleanly and is wasting
> my resources. E.g., I dropped a 30TB table, but tablet_data still contains
> 3TB of files, and the output of "lsof" shows that the tserver has 50M files
> open. So I emailed to ask how to remove the unnecessary files.
>

The leftover space usage could come from a couple of different root causes.
For 1.4 we're working on tools (including the below-mentioned fs-check) to
detect and repair the "orphaned" data usage.
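
Until then, one way to tell the two symptoms in this thread apart (space still allocated on disk versus files merely still open) is to measure them directly. The following is a minimal sketch, not a Kudu tool: the function names are made up for illustration, it assumes Linux, and it only reads file metadata. Because hole punching leaves a file's apparent size unchanged while releasing its blocks, the allocated size is the number to watch after a DROP TABLE. Note also that lsof may list the same descriptor more than once (for example once per thread on some versions), so counting the entries in /proc/<pid>/fd can give a more realistic figure than counting lsof output lines.

```python
import os


def disk_usage(root):
    """Return (apparent_bytes, allocated_bytes, file_count) for files under root."""
    apparent = allocated = count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file disappeared while we were walking the tree
            apparent += st.st_size
            # st_blocks is the number of 512-byte units actually allocated,
            # so punched holes do not count towards this total.
            allocated += st.st_blocks * 512
            count += 1
    return apparent, allocated, count


def count_open_fds(pid):
    """Count the open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir("/proc/%d/fd" % pid))
```

For example, calling disk_usage() on the tablet_data directory before and after dropping a table should show the allocated total shrinking even if the file count and apparent size stay the same. count_open_fds() needs to run as a user allowed to read the tserver's /proc entry.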


>
> It seems I can't use "kudu fs check" though.
>
> $ kudu fs check
> Invalid argument: unknown command 'check'
> Usage: /path/to/cloudera/parcels/KUDU-1.3.0-1.cdh5.11.0.p0.12/bin/../lib/kudu/bin/kudu
> fs <command> [<args>]
>
> <command> can be one of the following:
>     dump   Dump a Kudu filesystem
>   format   Format a new Kudu filesystem
>
> Then I'll try "kudu fs check" when it becomes available in Cloudera Manager.
>

Sorry, 'fs check' is coming in 1.4. You can build the 'kudu' tool from
source, though, and run it against a 1.3 cluster.

-Todd


>
> Thanks
>
> 2017-04-25 3:54 GMT+09:00 Mike Percy <mpercy@apache.org>:
>
>> Hi Jason,
>> I would strongly recommend upgrading to Kudu 1.3.1 as 1.3.0 has a serious
>> data-loss bug related to re-replication. Please see
>> https://kudu.apache.org/releases/1.3.1/docs/release_notes.html (if you
>> are using the Cloudera version of 1.3.0, no need to worry because it
>> includes the fix for that bug).
>>
>> In 1.3.0 and 1.3.1 you should be able to use the "kudu fs check" tool to
>> see if you have orphaned blocks. If you do, you could use the --repair
>> argument to that tool to repair it if you bring your tablet server offline.
>>
>> That said, Kudu uses hole punching to delete data and the same container
>> files may remain open even after removing data. After dropping tables, you
>> should see disk usage at the file system level drop.
>>
>> I'm not sure that I've answered all your questions. If you have specific
>> concerns, please let us know what you are worried about.
>>
>> Mike
>>
>> On Sun, Apr 23, 2017 at 11:43 PM, Jason Heo <jason.heo.sde@gmail.com>
>> wrote:
>>
>>> Hi.
>>>
>>> Before dropping, there were about 30 tables and 27,000 files in the
>>> tablet_data directory.
>>> I dropped most of the tables, and there is now ONLY one table, with 400
>>> tablets, in my test Kudu cluster.
>>> After dropping, there are still 27,000 files in the tablet_data directory,
>>> and the output of /sbin/lsof is the same as before dropping. (The kudu
>>> tserver has almost 50M files open.)
>>>
>>> I'm curious whether this can be resolved using "kudu fs check", which is
>>> available in Kudu 1.4.
>>>
>>> I was using Kudu 1.2 when I executed `DROP TABLE` and am currently using
>>> Kudu 1.3.0.
>>>
>>> Regards,
>>>
>>> Jason
>>>
>>>
>>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
