hive-dev mailing list archives

From "Ravi Prakash (JIRA)" <>
Subject [jira] [Commented] (HIVE-6469) skipTrash option in hive command line
Date Sat, 19 Apr 2014 00:35:14 GMT


Ravi Prakash commented on HIVE-6469:

There may be multiple users in each of those environments. So unless we have an isolated "environment"
for each user (which is really unmanageable), the global settings for one user will affect
other users of the same environment. What you are suggesting is a much coarser granularity
and would be an ops nightmare, if I am understanding your solution correctly.

The use case being targeted here is that a user may, in *one* instance, choose to drop
a (possibly big) table without sending it to Trash, to avoid filling up her/his quota. We believe
that the default Hive behavior of sending to Trash should be maintained (to prevent accidental
data loss).

It might be worthwhile to be consistent with the underlying Hadoop philosophy, where users wanting
to get rid of data (via 'hdfs dfs -rm') can choose whether or not to permanently remove that
data (with the '-skipTrash' option). You could make all the same arguments about individual
users caring/not caring about controlling this behavior for that case too.
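
As a minimal sketch of that per-command choice, the snippet below only assembles the 'hdfs dfs -rm' command line with or without '-skipTrash'; it does not run anything. The helper name build_rm_command and the path /user/alice/big_table are hypothetical, chosen purely for illustration.

```python
import shlex


def build_rm_command(path, skip_trash=False, recursive=False):
    """Return the argv list for an 'hdfs dfs -rm' invocation.

    build_rm_command is a hypothetical helper used only for illustration;
    it builds the command line and never executes it.
    """
    argv = ["hdfs", "dfs", "-rm"]
    if recursive:
        argv.append("-r")          # remove directories recursively
    if skip_trash:
        argv.append("-skipTrash")  # bypass Trash and delete immediately
    argv.append(path)
    return argv


if __name__ == "__main__":
    # Default: the data is moved to Trash (when trash is enabled on the cluster).
    print(shlex.join(build_rm_command("/user/alice/big_table", recursive=True)))
    # On-demand: this one invocation removes the data permanently.
    print(shlex.join(build_rm_command("/user/alice/big_table",
                                      skip_trash=True, recursive=True)))
```

The point of the sketch is that the decision is made per invocation by the person issuing the command, rather than through a setting shared by everyone in the environment.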

Do you see any of your customers asking for the global config rather than an on-demand flag?
Perhaps that can be a separate JIRA?

> skipTrash option in hive command line
> -------------------------------------
>                 Key: HIVE-6469
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: CLI
>    Affects Versions: 0.12.0
>            Reporter: Jayesh
>             Fix For: 0.12.1
>         Attachments: HIVE-6469.patch
> The hive drop table command deletes the data from the HDFS warehouse and puts it into Trash.
> Currently there is no way to pass a flag telling the warehouse to skip Trash while deleting
> table data.
> This ticket is to add a skipTrash feature to the hive command line, which would look like the
> following:
>
> hive -e "drop table skipTrash testTable"
>
> This would be a good feature to add, so that a user can specify when not to put data into the
> Trash directory and thus avoid filling up HDFS space, instead of relying on the trash interval
> and policy configuration to take care of the disk-filling issue.

