hive-dev mailing list archives

From "Venki Korukanti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes
Date Sat, 15 Nov 2014 00:09:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213120#comment-14213120 ]

Venki Korukanti commented on HIVE-5664:
---------------------------------------

bq. Proposed in current form, a db containing 1000 tables, we will make 1000 trips to metastore
from client while doing cascade drop. That will be expensive, it will be good to avoid if
we can.
In current Hive (without the fix attached to this JIRA), {{HiveMetaStoreClient}} first gets all
table names and then, for each table name, fetches the {{Table}} object inside the {{dropTable()}}
method. So this fix does not make the operation any more expensive, but I agree that making
n+1 metastore calls is very expensive.
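
To make the n+1 count concrete, here is a minimal sketch using an in-memory stand-in. {{CountingClient}}, {{RoundTripSketch}} and {{lookupsForCascadeDrop}} are hypothetical names made up for illustration, not Hive classes, and only the listing call and the per-table lookups are counted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a metastore client; this is NOT Hive's real API.
class CountingClient {
    private final List<String> tables = new ArrayList<>();
    int lookups = 0; // simulated read round trips to the metastore

    CountingClient(int tableCount) {
        for (int i = 0; i < tableCount; i++) {
            tables.add("t" + i);
        }
    }

    // One round trip: list every table name in the database.
    List<String> getAllTables() {
        lookups++;
        return new ArrayList<>(tables);
    }

    // One lookup per table: the table object is fetched before the drop.
    void dropTable(String name) {
        lookups++;
        tables.remove(name);
    }
}

class RoundTripSketch {
    // Counts the lookups made by a cascade drop over tableCount tables.
    static int lookupsForCascadeDrop(int tableCount) {
        CountingClient client = new CountingClient(tableCount);
        for (String name : client.getAllTables()) {
            client.dropTable(name);
        }
        return client.lookups;
    }

    public static void main(String[] args) {
        // 1 listing call + 1000 per-table lookups
        System.out.println(lookupsForCascadeDrop(1000)); // prints 1001
    }
}
```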

bq. I wonder if instead of getAllTables() , if we can use getTables() or listTableNamesByFilter()
to only retrieve base tables and no index tables. Since index tables follow specific naming
pattern, we can construct and pass that pattern to one of the above methods to get only base
tables. This will avoid multiple round trips to metastore from client.
That will not work in general, because index tables can have custom names, as in {{CREATE INDEX temp_tbl3_idx ON TABLE temp_tbl3(id)
AS 'COMPACT' with DEFERRED REBUILD IN TABLE temp_tbl3_idx_tbl;}}

A couple of other alternatives:
1. The failure occurs because no {{Table}} object is found in the {{dropTable}} method for index
tables. A simple fix is to ignore {{NoSuchObjectException}}.
2. Add a method {{getTables(TableType)}} to the MetaStore interface to retrieve {{Table}} objects
based on table type. We may need to fetch the tables in batches to avoid memory issues.
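
A rough sketch of both alternatives against an in-memory model, only to show the control flow. Everything here is hypothetical and written for illustration: {{FakeMetaStore}}, {{FakeNoSuchObjectException}}, the local {{TableType}} enum and the helper methods are not Hive's API. The model assumes that index tables show up in the table listing but cannot be fetched as {{Table}} objects, and that dropping a base table also removes its index tables. Batching for alternative 2 is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of this is Hive's real API.
class FakeNoSuchObjectException extends Exception {
    FakeNoSuchObjectException(String message) { super(message); }
}

enum TableType { MANAGED_TABLE, INDEX_TABLE }

class FakeMetaStore {
    private final Map<String, TableType> tables = new LinkedHashMap<>();  // name -> type
    private final Map<String, String> indexOwner = new LinkedHashMap<>(); // index table -> base table

    void createTable(String name) { tables.put(name, TableType.MANAGED_TABLE); }

    void createIndexTable(String indexTable, String baseTable) {
        tables.put(indexTable, TableType.INDEX_TABLE);
        indexOwner.put(indexTable, baseTable);
    }

    // Lists every name, including index tables (assumed source of the failure).
    List<String> getAllTables() { return new ArrayList<>(tables.keySet()); }

    // Alternative 2 (hypothetical method): list only tables of one type.
    List<String> getTables(TableType type) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, TableType> e : tables.entrySet()) {
            if (e.getValue() == type) names.add(e.getKey());
        }
        return names;
    }

    void dropTable(String name) throws FakeNoSuchObjectException {
        // Assumption: the lookup fails for index tables and for missing names.
        if (tables.get(name) != TableType.MANAGED_TABLE) {
            throw new FakeNoSuchObjectException(name + " table not found");
        }
        tables.remove(name);
        // Assumption: dropping a base table also removes its index tables.
        List<String> owned = new ArrayList<>();
        for (Map.Entry<String, String> e : indexOwner.entrySet()) {
            if (e.getValue().equals(name)) owned.add(e.getKey());
        }
        for (String idx : owned) {
            tables.remove(idx);
            indexOwner.remove(idx);
        }
    }

    boolean isEmpty() { return tables.isEmpty(); }
}

class CascadeDropSketch {
    // Alternative 1: skip names whose lookup fails; returns how many were skipped.
    static int dropAllIgnoringMissing(FakeMetaStore ms) {
        int skipped = 0;
        for (String name : ms.getAllTables()) {
            try {
                ms.dropTable(name);
            } catch (FakeNoSuchObjectException e) {
                skipped++;
            }
        }
        return skipped;
    }

    // Alternative 2: ask only for base tables; returns how many were dropped.
    static int dropBaseTablesOnly(FakeMetaStore ms) {
        int dropped = 0;
        for (String name : ms.getTables(TableType.MANAGED_TABLE)) {
            try {
                ms.dropTable(name);
                dropped++;
            } catch (FakeNoSuchObjectException e) {
                throw new IllegalStateException("base table vanished: " + name, e);
            }
        }
        return dropped;
    }

    // Mirrors the repro in the issue: one base table with one index table.
    static FakeMetaStore sampleDb() {
        FakeMetaStore ms = new FakeMetaStore();
        ms.createTable("tab1");
        ms.createIndexTable("tab1_indx", "tab1");
        return ms;
    }

    public static void main(String[] args) {
        FakeMetaStore a = sampleDb();
        System.out.println("skipped=" + dropAllIgnoringMissing(a) + " empty=" + a.isEmpty());
        FakeMetaStore b = sampleDb();
        System.out.println("dropped=" + dropBaseTablesOnly(b) + " empty=" + b.isEmpty());
    }
}
```

In this model alternative 1 still iterates over every listed name, while alternative 2 never sees the index table names at all.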

> Drop cascade database fails when the db has any tables with indexes
> -------------------------------------------------------------------
>
>                 Key: HIVE-5664
>                 URL: https://issues.apache.org/jira/browse/HIVE-5664
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing, Metastore
>    Affects Versions: 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>         Attachments: HIVE-5664.1.patch.txt, HIVE-5664.2.patch.txt
>
>
> {code}
> CREATE DATABASE db2; 
> USE db2; 
> CREATE TABLE tab1 (id int, name string); 
> CREATE INDEX idx1 ON TABLE tab1(id) as 'COMPACT' with DEFERRED REBUILD IN TABLE tab1_indx;
> DROP DATABASE db2 CASCADE;
> {code}
> Last DDL fails with the following error:
> {code}
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database does not exist: db2
> Hive.log has following exception
> 2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2
>         at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
> Caused by: NoSuchObjectException(message:db2.tab1_indx table not found)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>         at com.sun.proxy.$Proxy7.get_table(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:546)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>         at com.sun.proxy.$Proxy8.dropDatabase(Unknown Source)
>         at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:284)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3470)
>         ... 18 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
