hive-dev mailing list archives

From "Mithun Radhakrishnan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
Date Thu, 04 Sep 2014 22:56:24 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122135#comment-14122135 ]

Mithun Radhakrishnan commented on HIVE-7100:
--------------------------------------------

Hey, David, et al. I've taken a look at the patch you have so far. (You should see some comments
on RB.) Thanks for working on solving this.

In its current form, patch(.5) attempts to solve the problem for dropTable(), and leaves a
TODO for dropPartitions(). I'd like very much to see the solution extended to dropPartitions().
Did you run into something hard in the partitions-case? (The {{HCatClient}} API would need
to expose PURGE as an option. That won't be difficult.)

Question: Would it be possible to introduce a PURGE.default parameter in a table's TBLPROPERTIES?
I have users who face the same problem as the one you're solving, but in the context of dropPartitions().
While I approve of the ability to call dropPartitions(purge=true) on a per-call basis, I'd also
like the ability to choose the default drop action (used when ifPurge isn't set) at the table level.
This way:
# Table-owners can decide whether to spam their ~/.Trash on drop.
# The user wouldn't need to change their Hive script (or Oozie action, or HCatClient call),
to be able to skipTrash.
# AFAICT, it'll not conflict with Sushanth's work on HIVE-6465, which might just store the
new table-semantics in TBLPROPERTIES.

I don't think the protocol needs to be complicated:
|| Use-case || {{dropTable(purge=<unset>)}} || {{dropTable(purge=true)}} ||
| Default (e.g. pre-existing tables) | Dropped data goes to ~/.Trash | Trash skipped |
| Tables with PURGE.default=true | Trash skipped | Trash skipped |

When HiveQL language support is added, {{DROP TABLE my_table PURGE}} will call {{dropTable(purge=true)}},
and behave identically.
{{dropPartitions()}} would work in similar fashion.
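
The table above can be read as a small decision rule. The following is a minimal sketch of that rule, not code from the patch: the class {{PurgePolicy}} and the method {{shouldSkipTrash}} are hypothetical names, and {{PURGE.default}} is only the property proposed in this comment. The table does not cover an explicit {{purge=false}}; the sketch assumes that an explicit value always overrides the table default.

```java
import java.util.Map;

/** Illustrative only: hypothetical helper, not part of Hive or the HIVE-7100 patch. */
final class PurgePolicy {
    // Property name as proposed in this comment; it is not an existing Hive table property.
    static final String PURGE_DEFAULT_KEY = "PURGE.default";

    private PurgePolicy() {}

    /**
     * Decides whether dropped data should bypass ~/.Trash.
     *
     * @param tblProperties the table's TBLPROPERTIES
     * @param purge         the per-call flag; null means the caller left it unset
     */
    static boolean shouldSkipTrash(Map<String, String> tblProperties, Boolean purge) {
        if (purge != null) {
            // Assumption: an explicit per-call value wins over the table default.
            return purge;
        }
        // Unset flag: fall back to the table-level default (absent means "use Trash").
        return Boolean.parseBoolean(tblProperties.get(PURGE_DEFAULT_KEY));
    }
}
```

With this rule, a pre-existing table without the property keeps today's behaviour, while a table that sets {{PURGE.default=true}} skips the trash even when the caller does not pass a purge flag.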

> Users of hive should be able to specify skipTrash when dropping tables.
> -----------------------------------------------------------------------
>
>                 Key: HIVE-7100
>                 URL: https://issues.apache.org/jira/browse/HIVE-7100
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.13.0
>            Reporter: Ravi Prakash
>            Assignee: Jayesh
>         Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.patch
>
>
> Users of our clusters are often running up against their quota limits because of Hive
> tables. When they drop tables, they have to then manually delete the files from HDFS using
> skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly
> when dropping tables.
> We should also be able to provide this functionality without polluting SQL syntax.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
