hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-15645) Tez session pool may restart sessions in a wrong queue
Date Tue, 17 Jan 2017 19:11:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826628#comment-15826628 ]

Sergey Shelukhin edited comment on HIVE-15645 at 1/17/17 7:11 PM:
------------------------------------------------------------------

We reproduced this on a cluster, and the repro indicates that the patch will fix the problem.
The cause is the config getting out of sync with the property. The first session gets both the
config and the property right, but something (I am pretty sure it's the unset in the open path)
then resets the config. The 2nd session (after expiration) gets the property right but the config
is not set, so it logs as if it is going to the correct queue but actually goes to the wrong
(default) queue, which is what we observed for a specific session in the cluster. The field is
also reset to null from conf (in the place where I added the warn log), after the log statement
about the queue. The 3rd session (after the 2nd expiration) therefore logs a null queue (because
the field is now null too) and goes to the wrong queue, as does every session after that. So, for
pool sessions we now set the session into conf every time. I also added a warn log for the future,
and a null check, because we never expect a null queue for pool sessions. Fixing this properly
requires completing the separation of pool and non-pool sessions that was started at some point,
but that's a major refactoring, not a bugfix.
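
To make the sequence above easier to follow, here is a minimal, self-contained sketch of the drift
between the field and the conf. It is not the Hive code and not the patch: the class
QueueDriftSketch, the method simulate, the queue name "etl" and the exact ordering of steps are
all hypothetical simplifications, and the conf is modeled as a plain map keyed by
"tez.queue.name" purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the behavior described in the comment; not Hive code.
class QueueDriftSketch {
    static final String QUEUE_KEY = "tez.queue.name"; // illustrative conf key
    static final String DEFAULT_QUEUE = "default";

    // Simulates three session opens (initial, then two reopen-after-expiration).
    // Returns one entry per open: "logged=<queue> actual=<queue>".
    static List<String> simulate(boolean applyFix) {
        Map<String, String> conf = new HashMap<>();
        String queueField = "etl";            // assumed pool queue name
        conf.put(QUEUE_KEY, queueField);      // first session: conf and field agree

        List<String> opens = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            if (applyFix) {
                // Sketch of the fix: never expect a null queue for a pool session,
                // and push the queue back into conf on every open.
                if (queueField == null) {
                    throw new IllegalStateException("pool session has no queue");
                }
                conf.put(QUEUE_KEY, queueField);
            }
            String logged = "logged=" + queueField;                      // log uses the field
            String actual = conf.getOrDefault(QUEUE_KEY, DEFAULT_QUEUE); // submit uses conf
            queueField = conf.get(QUEUE_KEY); // field is re-read from conf after the log
            conf.remove(QUEUE_KEY);           // the unset in the open path
            opens.add(logged + " actual=" + actual);
        }
        return opens;
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + simulate(false));
        System.out.println("with fix:    " + simulate(true));
    }
}
```

Under these assumptions, the run without the fix shows the second open logging the right queue
while landing in the default one, and the third open logging a null queue; with the fix every
open logs and uses the same queue.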


was (Author: sershe):
We had a repro on some cluster that indicates that the patch will fix the problem.
It has to do with config being out of sync with the property. First session gets config and
property correct, but something (I am pretty sure it's the unset in open path) resets the
config. Then the 2nd session (after expiration) gets the property correct but the config is
not set, so it logs as if it is going to correct queue but goes to wrong queue, which is what
we have observed for a specific session. The field is also reset to null from conf (in a place
where I added the warn log), after the log statement about the queue. The 3rd session (after
the 2nd expiration) logs null queue (because the field is also null now), and goes to the
wrong queue, as does every one after that. So, for pool sessions we set the session into conf
every time now. I also added a warn log for the future, and a null check cause we never expect
null queue for pool sessions. To fix this properly the separation of pool and non-pool sessions
that was started at some point needs to be completed, but that's a major refactoring, not
a bugfix.

> Tez session pool may restart sessions in a wrong queue
> ------------------------------------------------------
>
>                 Key: HIVE-15645
>                 URL: https://issues.apache.org/jira/browse/HIVE-15645
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Carter Shanklin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-15645.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)