hawq-dev mailing list archives

From "Paul Guo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1334) QD thread should set error code if failing so that the main process for the query could exit soon
Date Wed, 15 Feb 2017 08:42:41 GMT
Paul Guo created HAWQ-1334:
------------------------------

             Summary: QD thread should set error code if failing so that the main process for the query could exit soon
                 Key: HAWQ-1334
                 URL: https://issues.apache.org/jira/browse/HAWQ-1334
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Dispatcher
            Reporter: Paul Guo
            Assignee: Ed Espino


In the QD thread function dispmgt_thread_func_run(), if a failure occurs, whether caused by a
QE or by the QD itself, the thread cancels the query and then cleans up. The main process for
the query needs the error code in meleeResults to be set so that it can promptly proceed to
cancel the query; otherwise it has to wait for a timeout. Normally dispmgt_thread_func_run()
sets the error code, but I found some cases that do not, e.g. when poll() fails with ENOMEM.
One symptom of this issue is that a query sometimes hangs when it is canceled for some
reason.

A potential solution:

1) Each branch that jumps to the cleanup code ("goto error_cleanup") should set a proper error code itself.
2) We add a "guard" function in the error_cleanup code that sets an error code if none has been set.
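
The two steps above could be sketched roughly as follows. This is only a minimal illustration, not the actual dispatcher code: the SketchResults struct, SKETCH_ERR_UNKNOWN, sketch_guard_errcode() and sketch_thread_run() are all hypothetical names, and the poll() failure is simulated by passing in an errno value rather than calling poll().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared result state read by the main process. */
typedef struct SketchResults {
    int errcode;               /* 0 means "no error recorded yet" */
} SketchResults;

#define SKETCH_ERR_UNKNOWN (-1) /* hypothetical fallback code */

/* Guard (step 2): record a fallback code only if no branch set one. */
static void sketch_guard_errcode(SketchResults *results, int fallback)
{
    assert(results != NULL);
    if (results->errcode == 0)
        results->errcode = fallback;
}

/*
 * Simulated thread body. 'poll_errno' stands in for errno after a failed
 * poll(); 'branch_sets_code' says whether the failing branch follows step 1.
 * Returns 0 on success and -1 after going through error_cleanup.
 */
static int sketch_thread_run(SketchResults *results, int poll_errno,
                             int branch_sets_code)
{
    if (poll_errno != 0) {
        if (branch_sets_code)
            results->errcode = poll_errno;   /* step 1: branch sets its own code */
        goto error_cleanup;
    }
    return 0;

error_cleanup:
    /* Step 2: make sure the main process always sees a nonzero code. */
    sketch_guard_errcode(results, SKETCH_ERR_UNKNOWN);
    /* Query cancellation and resource cleanup would follow here. */
    return -1;
}
```

With this shape, a branch that forgets to set a code (for example the ENOMEM case) still leaves a nonzero errcode behind, so the main process does not need to wait for the timeout, while a code set by the branch itself is never overwritten by the guard.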

In general, the cleanup code in the QD is obscure and inelegant. Maybe we should file another
JIRA to refactor its error handling logic.



