falcon-dev mailing list archives

From "Satish Mittal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-60) Feed retention doesn't delete empty parent dirs
Date Thu, 06 Feb 2014 15:04:10 GMT

    [ https://issues.apache.org/jira/browse/FALCON-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893415#comment-13893415 ]

Satish Mittal commented on FALCON-60:

While testing feed retention with an HCatalog-based feed, I observed that empty parent dirs
(everything except the leaf node) are left behind. E.g. if the table is partitioned on
<date, country>, then an empty date dir is left behind.

In FeedEvictor.dropPartition(), we have:

        boolean deleted = true;
        if (isTableExternal) { // nuke the dirs if an external table
            final String location = partitionToDrop.getLocation();
            final Path path = new Path(location);
            // recursive delete of the partition location only; parent dirs are not touched
            deleted = path.getFileSystem(new Configuration()).delete(path, true);
        }

For an HCat external table, the following cases arise during the partition registration step:
1) If no external location is specified, we can safely assume that the HDFS dirs for the
partition are created by HCat in its 'native' format, key1=value1/key2=value2/..., and delete
bottom-up by that many levels.
2) Else if it is a static partition and an external location is specified, there is no guarantee
that the user-specified HDFS location will always have that many levels or cater to any particular
3) Else if it is a dynamic partition and a custom dynamic path pattern is specified, then
we can go through the pattern and figure out the appropriate level upwards to delete the partition
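
Case 1 could be sketched roughly as below. This is only an illustration under stated
assumptions: it uses java.nio.file on a local filesystem as a stand-in for the Hadoop
FileSystem API, and the class EmptyParentCleaner, the method deletePartition and the
partitionKeyCount parameter are hypothetical names, not existing FeedEvictor code. The idea
is to delete the leaf partition dir recursively, then walk upwards by at most
(number of partition keys - 1) levels, removing a parent only if it is empty and lies
strictly below the table root.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; local-filesystem stand-in for the HDFS logic.
final class EmptyParentCleaner {

    /**
     * Deletes partitionDir recursively, then removes empty ancestors bottom-up.
     * The leaf counts as one level, so at most (partitionKeyCount - 1) parents
     * are considered. Returns the number of parent dirs removed.
     */
    static int deletePartition(Path tableRoot, Path partitionDir, int partitionKeyCount)
            throws IOException {
        // Collect the leaf dir and its contents, children before parents.
        List<Path> toDelete;
        try (Stream<Path> walk = Files.walk(partitionDir)) {
            toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }

        int removed = 0;
        Path parent = partitionDir.getParent();
        for (int level = 1; level < partitionKeyCount; level++) {
            // Stop at the table root, outside it, or at a dir that still has siblings.
            if (parent == null || parent.equals(tableRoot)
                    || !parent.startsWith(tableRoot) || !isEmpty(parent)) {
                break;
            }
            Files.delete(parent);
            removed++;
            parent = parent.getParent();
        }
        return removed;
    }

    static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path table = Files.createTempDirectory("feed-table");
        Path date = table.resolve("date=2014-02-06");
        Path in = Files.createDirectories(date.resolve("country=in"));
        Path us = Files.createDirectories(date.resolve("country=us"));
        Files.createFile(in.resolve("part-00000"));

        // A sibling partition still exists, so the date dir survives.
        System.out.println(deletePartition(table, in, 2)); // prints 0
        // Last partition under this date: the empty date dir is removed too.
        System.out.println(deletePartition(table, us, 2)); // prints 1
        System.out.println(Files.exists(table));           // prints true
    }
}
```

The emptiness check before each parent delete is what keeps this safe when sibling
partitions (e.g. another country under the same date) are still present. Cases 2 and 3 are
not covered by this sketch, since the number of levels cannot be derived from the partition
key count alone.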

> Feed retention doesn't delete empty parent dirs
> -----------------------------------------------
>                 Key: FALCON-60
>                 URL: https://issues.apache.org/jira/browse/FALCON-60
>             Project: Falcon
>          Issue Type: Bug
>    Affects Versions: 0.4
>            Reporter: Shwetha G S
>            Assignee: Shaik Idris Ali
>             Fix For: 0.4
>         Attachments: FALCON-60-v2.patch, FALCON-60-v3.patch, FALCON-60.patch

This message was sent by Atlassian JIRA