hawq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kuien Liu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1567) Unknown process holds the lock causes DROP TABLE hangs forever
Date Wed, 06 Dec 2017 03:44:00 GMT
Kuien Liu created HAWQ-1567:
-------------------------------

             Summary: Unknown process holds the lock causes DROP TABLE hangs forever
                 Key: HAWQ-1567
                 URL: https://issues.apache.org/jira/browse/HAWQ-1567
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Core
            Reporter: Kuien Liu
            Assignee: Radar Lei


On Hawq 2.2.0.0-incubating (Jun 2017), we meet several times that query is hanging for long
time:

# 1. DROP TABLE hangs for tens of minutes, because it waits for AccessExclusiveLock.
# 2. BUT the lock is held by a ghost process ( not alive, and little message  in log file
is availabe to know what's up)

A detailed context is pasted:

postgres=# select procpid, sess_id, usesysid, xact_start, waiting, current_query from pg_stat_activity
where current_query <> '<IDLE>';
 procpid | sess_id | usesysid |          xact_start           | waiting |                
                                                                                         
                                      current_query
---------+---------+----------+-------------------------------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   91321 |  120242 |   328199 | 2017-11-28 14:45:52.631739+08 | t       |  drop table if exists
ads_is_svc_rcv_approval_detail_df

postgres=# select * from pg_locks where pid = 91321;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid
| transaction |  pid  |        mode         | granted | mppsessionid | mppiswriter | gp_segment_id
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+---------------------+---------+--------------+-------------+---------------
 transactionid |          |          |      |       |      21867785 |         |       |  
       |    21867785 | 91321 | ExclusiveLock       | t       |       120242 | f          
|        -10000
 relation      |    16510 |     2608 |      |       |               |         |       |  
       |    21867785 | 91321 | RowExclusiveLock    | t       |       120242 | f          
|        -10000
 relation      |    16510 |     1259 |      |       |               |         |       |  
       |    21867785 | 91321 | RowExclusiveLock    | t       |       120242 | f          
|        -10000
 relation      |    16510 |  3212612 |      |       |               |         |       |  
       |    21867785 | 91321 | AccessExclusiveLock | f       |       120242 | f          
|        -10000
 relation      |    16510 |     1247 |      |       |               |         |       |  
       |    21867785 | 91321 | RowExclusiveLock    | t       |       120242 | f          
|        -10000
(5 rows)

postgres=# select * from pg_locks where relation = 3212612;
 locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid
| transaction |  pid   |        mode         | granted | mppsessionid | mppiswriter | gp_segment_id
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+--------+---------------------+---------+--------------+-------------+---------------
 relation |    16510 |  3212612 |      |       |               |         |       |       
  |    21867785 |  91321 | AccessExclusiveLock | f       |       120242 | f           |  
     -10000
 relation |    16510 |  3212612 |      |       |               |         |       |       
  |           0 | 107940 | AccessShareLock     | t       |       120553 | f           |  
     -10000
(2 rows)

postgres=# select * from pg_stat_activity where procpid = 107940;
 datid | datname | procpid | sess_id | usesysid | usename | current_query | waiting | query_start
| backend_start | client_addr | client_port | application_name | xact_start | waiting_resource
-------+---------+---------+---------+----------+---------+---------------+---------+-------------+---------------+-------------+-------------+------------------+------------+------------------
(0 rows)

postgres=# select * from pg_locks  where pid = 107940 or mppsessionid = 120553;
 locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid
| transaction | pid | mode | granted | mppsessionid | mppiswriter | gp_segment_id
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-----+------+---------+--------------+-------------+---------------
(0 rows)

postgres=# select logtime, logpid, logmessage from hawq_toolkit.__hawq_log_master_latest where
logsession = 'con120553';
            logtime            | logpid  |                   logmessage
-------------------------------+---------+-------------------------------------------------
 2017-11-28 15:14:38.277254+08 | p107940 | clean up communication to resource manager now.
 2017-11-28 15:14:38.322206+08 | p107940 | generateResourceRefreshHeartBeat exits.
 2017-11-28 15:14:38.388077+08 | p107940 | APSARA uninit:DONE
(3 rows)

In the master log file, it is not easy to know who is accessing the relation.

To [~yjin]  and all: any suggestion on it? thanks a lot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message