hawq-dev mailing list archives

From Paul Guo <paul...@gmail.com>
Subject An deadlock issue (between entrydb and QD)
Date Fri, 28 Apr 2017 10:08:55 GMT
While looking at a CTAS issue that involves a join of some catalog tables, we
found that entrydb is not using the catalog tables of the query database, so
the query result is wrong. Looking further, this appears to be a regression
introduced by the fix for a deadlock bug, HAWQ-512 (Query hang due to
deadlock in entrydb catalog access). The SQL below reproduces that deadlock
and is simpler to read than the original case.

DROP TABLE IF EXISTS t;
CREATE TABLE t (key INT, value INT) DISTRIBUTED RANDOMLY;
INSERT INTO t VALUES (1, 0);

BEGIN;
ALTER TABLE t SET DISTRIBUTED BY (key);
SELECT t1.key FROM t AS t1, (SELECT generate_series(1, 2)::INT AS key,
0::INT AS value) AS t2 WHERE t1.value = t2.value;
COMMIT;

It deadlocks because:
Inside the transaction block, the lock taken by "alter table" on table t is
still held by the QD process ("alter table" executes on the QD) when the
subsequent join runs. The join involves an entrydb QE, which then hangs
waiting for the lock on table t, while the QD is waiting for that QE.
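
One way to confirm this could be to look at the lock table from a second
session while the SELECT above is hanging. This is only a sketch: it assumes
HAWQ exposes the PostgreSQL pg_locks view with the relation, pid, mode and
granted columns, and that the entrydb QE's pending request is visible from
the master. I have not verified the exact output on HAWQ.

```sql
-- Run in a second session, connected to the same database,
-- while the SELECT in the first session is hanging.
-- Assumption: pg_locks has the standard PostgreSQL columns used here.
SELECT l.pid, l.locktype, l.mode, l.granted
FROM pg_locks AS l
WHERE l.relation = 't'::regclass
ORDER BY l.granted DESC, l.pid;
```

Under those assumptions one would expect a granted AccessExclusiveLock held
by the QD pid and an ungranted AccessShareLock requested by the entrydb QE
pid.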

So this is a dilemma. If we want entrydb to use the query database, there
could be a deadlock; but if it does not, some queries that involve catalog
tables could return wrong results.

In theory, entrydb should really use the query database; otherwise there
seems to be no easy way to run catalog-related queries, since dispatching
the related system tables to QEs (or adding a motion node?) does not look
like a feasible solution.

So maybe we could find another solution. For example, if we detect that we
are inside a transaction block, we could let the planner dispatch all
queries to QEs (e.g. when we run an "alter table" kind of DDL on entrydb).
Alternatively, I'm not sure whether modifying the lock mechanism (e.g. no
locking between the QD and the entrydb QE) is feasible.

Any suggestions are welcome.

Thanks.
