hive-dev mailing list archives

From Adam Szita <sz...@cloudera.com>
Subject Re: Review Request 56118: DROP TABLE in hive doesn't Throw Error
Date Mon, 13 Feb 2017 15:06:47 GMT


> On Feb. 10, 2017, 7:32 p.m., Vihang Karajgaonkar wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 1679
> > <https://reviews.apache.org/r/56118/diff/2/?file=1623348#file1623348line1679>
> >
> >     I agree with Aihua here. As long as the table metadata is dropped, the table does
> >     not exist from the client's point of view. The filesystem will hold stale data because
> >     it could not be deleted successfully, but that stale data is unusable anyway without
> >     the metadata. If we want to notify the client of such cases, I think it should be a
> >     warning at best, not an error.

Let's consider the following situation:
- The user creates a table, fills it with some data, then drops it. The drop fails silently,
  leaving data behind on disk.
- The user then decides to recreate the table with a different serde, e.g. Avro. (Another
  user could also create a table with the same name.)
- A simple _select * from table_ will fail with the following: _"Error: java.io.IOException:
  java.io.IOException: Not a data file. (state=,code=0)"_
- The user will be quite confused, since they don't know that an earlier drop table failure
  caused this.

So as I see it, we should either:
- Remove the table from HMS and return a very simple exception, e.g. "Table definition is deleted,
  but some data files remained on disk, please clean up manually". (The call either succeeds or
  throws an exception; signalling a warning would require amending the thrift contract, which is
  overkill for this issue :) )
- Leave this functionality as is, but add a feature to one of the existing tools (e.g. schematool,
  metatool) that detects orphaned data left on disk by comparing HDFS content with HMS.
  (Obviously this should go into a separate jira.)
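
To make the second option concrete, here is a minimal sketch of the comparison such a tool
might perform. It is not schematool or metatool code: the class and method names are
hypothetical, and plain string sets stand in for the HDFS directory listing and for the table
locations recorded in HMS.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hive: reports warehouse directories
// that no table in the metastore points to.
class OrphanFinderSketch {

    // warehouseDirs: directories found on disk (assumed to come from an HDFS listing).
    // tableLocations: locations of the tables known to HMS.
    static Set<String> findOrphans(Set<String> warehouseDirs, Set<String> tableLocations) {
        // Copy first so the caller's set is not modified.
        Set<String> orphans = new TreeSet<>(warehouseDirs);
        // Whatever remains after removing known locations has no metadata.
        orphans.removeAll(tableLocations);
        return orphans;
    }
}
```

A real tool would also have to account for partitions, external tables, and locations outside
the warehouse root, which this sketch ignores.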

I would choose the first option since it's really simple (just a notification to the user),
and we could even make it configurable, e.g. hive.metastore.droptable.verbose=true/false
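
A rough sketch of what the first option could look like, for discussion only. This is not the
code in the diff: the class, the exception type, and the DataDeleter interface are hypothetical
stand-ins, an in-memory set plays the role of HMS, and the verbose flag represents the proposed
hive.metastore.droptable.verbose setting.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the proposed behaviour; none of these types exist in Hive.
class DropTableSketch {

    // Hypothetical exception used to tell the client that data was left behind.
    static class DataLeftBehindException extends Exception {
        DataLeftBehindException(String message) {
            super(message);
        }
    }

    // Stand-in for the filesystem call that removes the table's data files.
    interface DataDeleter {
        boolean delete(String tableName);
    }

    private final Set<String> metadata = new HashSet<>(); // stand-in for HMS
    private final DataDeleter deleter;
    private final boolean verbose; // models hive.metastore.droptable.verbose

    DropTableSketch(DataDeleter deleter, boolean verbose) {
        this.deleter = deleter;
        this.verbose = verbose;
    }

    void createTable(String name) {
        metadata.add(name);
    }

    boolean tableExists(String name) {
        return metadata.contains(name);
    }

    void dropTable(String name) throws DataLeftBehindException {
        // The table definition is always removed first.
        metadata.remove(name);
        // Then the data is deleted; a failure is reported only when verbose is on.
        boolean deleted = deleter.delete(name);
        if (!deleted && verbose) {
            throw new DataLeftBehindException(
                "Table definition is deleted, but some data files remained on disk, "
                + "please clean up manually");
        }
    }
}
```

With verbose=false the sketch keeps today's silent behaviour, so existing clients would see no
change unless the setting is enabled.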


- Adam


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56118/#review165150
-----------------------------------------------------------


On Feb. 3, 2017, 2:22 p.m., Adam Szita wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56118/
> -----------------------------------------------------------
> 
> (Updated Feb. 3, 2017, 2:22 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Peter Vary, and Sergio Pena.
> 
> 
> Bugs: HIVE-14181
>     https://issues.apache.org/jira/browse/HIVE-14181
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Failure during table drop doesn't throw errors and reports success. Sometimes data
> remains in the warehouse while the table (metadata) is removed from the metastore,
> resulting in inconsistency.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 53b9b0c6962c9b1cd2eef1cb71687ec0245cfac3
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java af125c38236582ba532f5e3de3d2ba724f38b101
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java f8c3c4e48db0df9d6c18801bcd61f9e5dc6eb7c2
> 
> Diff: https://reviews.apache.org/r/56118/diff/
> 
> 
> Testing
> -------
> 
> -Added test case
> -Tested on cluster
> 
> 
> Thanks,
> 
> Adam Szita
> 
>

