hive-user mailing list archives

From Alan Gates <alanfga...@gmail.com>
Subject Re: Delete hive partition while executing query.
Date Mon, 06 Jun 2016 17:30:42 GMT
Do you have the system configured to use the DbTxnManager?  See https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration
for details on how to set this up.  The transaction manager is what handles locking and makes
sure that your queries don’t stomp on each other.
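For reference, a minimal sketch of the hive-site.xml settings that wiki page describes (property names as documented there; values are illustrative, so verify defaults and requirements against your Hive version):

```xml
<!-- Sketch of the transaction-related settings from the Hive Transactions
     wiki page; verify names and values against your Hive version. -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>  <!-- required on Hive 1.x; dropped in Hive 2.0 -->
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<!-- Metastore side, to enable compaction of transactional tables: -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

With the DbTxnManager in place, the drop-partition DDL should block on (or be blocked by) the shared locks taken by concurrent reads of that partition instead of deleting files out from under them.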

Alan.

> On Jun 6, 2016, at 06:01, Igor Kuzmenko <f1sherox@gmail.com> wrote:
> 
> Hello, I'm trying to find a safe way to delete a partition along with all the data it contains.
> 
> I'm using Hive 1.2.1 and Hive JDBC driver 1.2.1, performing a simple test on a transactional table:
> 
> asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections where dt=20151124 and msisdn_last_digit=2", 1);
> Thread.sleep(3000);
> asyncExecute("alter table mobile_connections drop if exists partition (dt=20151124, msisdn_last_digit=2) purge", 2);
> Thread.sleep(3000);
> asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections where dt=20151124 and msisdn_last_digit=2", 3);
> Thread.sleep(3000);
> asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections where dt=20151124 and msisdn_last_digit=2", 4);
> (full code here)
> 
> I create several threads, each executing a query asynchronously. The first queries the partition; the second drops the partition; the others are the same as the first. The first query takes about 10-15 seconds to complete, so the "alter table" statement starts before the first query finishes.
> As a result I get:
> 	• First query - successfully completes 
> 	• Second query - successfully completes
> 	• Third query - successfully completes
> 	• Fourth query - throws an exception:
> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1461923723503_0189_1_00, diagnostics=[Vertex vertex_1461923723503_0189_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: mobile_connections initializer failed, vertex=vertex_1461923723503_0189_1_00 [Map 1], java.lang.RuntimeException: serious problem
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1059)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1086)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
> 	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:255)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:248)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:248)
> 	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:235)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://jupiter.bss:8020/apps/hive/warehouse/mobile_connections/dt=20151124/msisdn_last_digit=2 does not exist.
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1036)
> 	... 15 more
> Caused by: java.io.FileNotFoundException: File hdfs://jupiter.bss:8020/apps/hive/warehouse/mobile_connections/dt=20151124/msisdn_last_digit=2 does not exist.
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:958)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:937)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:882)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:878)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:878)
> 	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1694)
> 	at org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:690)
> 	at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:366)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:648)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
> 	... 4 more
> ]Vertex killed, vertexName=Reducer 3, vertexId=vertex_1461923723503_0189_1_02, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1461923723503_0189_1_02 [Reducer 3] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1461923723503_0189_1_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1461923723503_0189_1_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:2
> 	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
> 	at Test$MyRunnable.run(Test.java:54)
> 	at java.lang.Thread.run(Thread.java:745)
> 
> Since I'm using a transactional table, I expected all queries executed after the partition drop to complete successfully with an empty result. Am I doing something wrong? Is there another way to drop a partition together with its data?

