hive-dev mailing list archives

From "Furcy Pin (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-12258) read/write into same partitioned table + concurrency = deadlock
Date Sat, 24 Oct 2015 10:07:27 GMT
Furcy Pin created HIVE-12258:
--------------------------------

             Summary: read/write into same partitioned table + concurrency = deadlock
                 Key: HIVE-12258
                 URL: https://issues.apache.org/jira/browse/HIVE-12258
             Project: Hive
          Issue Type: Bug
            Reporter: Furcy Pin


When hive.support.concurrency is enabled, launching a query that reads data from one partition
and writes data into another partition of the same table creates a deadlock.
The worst part is that once the deadlock is active, you can't query the table until it times
out.

## How to reproduce

```sql
CREATE TABLE test_table (id INT) 
PARTITIONED BY (part STRING)
;

INSERT INTO TABLE test_table PARTITION (part="test")
VALUES (1), (2), (3), (4) 
;

INSERT OVERWRITE TABLE test_table PARTITION (part="test2")
SELECT id FROM test_table WHERE part="test1";
```

The query hangs, and running SHOW LOCKS in another terminal gives:

```
SHOW LOCKS ;
+----------+-----------+------------+------------+-------------+--------------+-----------------+-----------------+----------------+
| lockid   | database  | table      | partition  | lock_state  | lock_type    | transaction_id  | last_heartbeat  |  acquired_at   |
+----------+-----------+------------+------------+-------------+--------------+-----------------+-----------------+----------------+
| 3765     | default   | test_table | NULL       | WAITING     | SHARED_READ  | NULL            | 1440603633148   | NULL           |
| 3765     | default   | test_table | part=test2 | WAITING     | EXCLUSIVE    | NULL            | 1440603633148   | NULL           |
+----------+-----------+------------+------------+-------------+--------------+-----------------+-----------------+----------------+
```

This was tested on Hive 1.1.0-cdh5.4.2, but I believe the bug is still present in 1.2.1.
I could not reproduce it easily locally because it requires a pseudo-distributed setup with
ZooKeeper to have concurrency enabled.

From looking at the code, I believe the problem comes from the EmbeddedLockManager method

`public List<HiveLock> lock(List<HiveLockObj> objs, int numRetriesForLock, long sleepTime)`

that keeps trying to acquire two incompatible locks, and ends up failing after
hive.lock.numretries * hive.lock.sleep.between.retries, which by default is 100 * 60s = 100 minutes.
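
To make the arithmetic concrete: the worst-case wait is the product of the two settings named above. The snippet below is only a sketch. It assumes the stock defaults, and it assumes that both properties can be overridden per session, which may not hold on every deployment. The lower values are hypothetical and purely illustrative. They would only shorten how long the table stays blocked and would not fix the underlying lock conflict.

```sql
-- Defaults (as stated above): 100 retries x 60 s between retries = 6000 s = 100 minutes.
-- Hypothetical lower values, for illustration only: 5 retries x 10 s = 50 s worst case.
SET hive.lock.numretries=5;
SET hive.lock.sleep.between.retries=10;
```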




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)