hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez
Date Tue, 12 Sep 2017 00:44:02 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162299#comment-16162299 ]

Sergey Shelukhin commented on HIVE-17502:
-----------------------------------------

There may be a bigger problem here.
I think the logic behind this exception is related to the giant try-finally in TezTask that follows
the call to SessionState.get().getTezSession.
At the end of the finally block, returnSession is called. It ignores a non-pool session,
but for a pool session it puts the session back in the pool and resets SessionState's Tez session
to null. This is not necessarily the best logic, because pool session reuse could perhaps also
be allowed, but given the limited size of the pool, holding a session that others could use is
undesirable.
However, that is beside the point because, unless I'm missing something, the only way for getTezSession
before the block to get an existing pool Tez session is if one SessionState is used for two Hive
queries in parallel (i.e. SessionState.get().getTezSession is called from one thread while another
thread is inside the TezTask try-finally block, running some query). Let me know if it's a different
scenario.
This pattern would probably break all of this session setting and unsetting, which would need
to be fixed with additional checks; however, I think it also has other, similar state-related
issues.
[~thejas] do you know if we allow parallel queries with the same SessionState?
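
To make the scenario above concrete, here is a minimal sketch of the get / try-finally / return pattern. All class and method names are hypothetical stand-ins, not Hive's real SessionState, TezTask or TezSessionPoolManager code, and the "parallel" second query is simulated by nesting it on one thread so the outcome is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-ins for illustration only; none of these are Hive classes.
class SketchSession {
    final String id;
    final boolean fromPool;
    SketchSession(String id, boolean fromPool) { this.id = id; this.fromPool = fromPool; }
}

class SketchSessionState {
    // Plays the role of the Tez session reference kept on SessionState.
    SketchSession tezSession;
}

class SketchPool {
    private final Deque<SketchSession> idle = new ArrayDeque<>();

    SketchPool(int size) {
        for (int i = 0; i < size; i++) {
            idle.push(new SketchSession("pool-" + i, true));
        }
    }

    // Assumed check: a pool session still attached to the state was never returned.
    SketchSession getSession(SketchSessionState state) {
        SketchSession existing = state.tezSession;
        if (existing != null && existing.fromPool) {
            throw new IllegalStateException(
                "The pool session " + existing.id + " should have been returned to the pool");
        }
        if (existing != null) {
            return existing; // a non-pool session may simply be reused
        }
        SketchSession session = idle.pop();
        state.tezSession = session;
        return session;
    }

    void returnSession(SketchSessionState state, SketchSession session) {
        if (!session.fromPool) {
            return; // non-pool sessions are ignored
        }
        idle.push(session);      // put it back in the pool
        state.tezSession = null; // and reset the state's session
    }

    int idleCount() { return idle.size(); }
}

class TryFinallySketch {
    // One "query": the Runnable is the work done while the session is held.
    static void runQuery(SketchPool pool, SketchSessionState state, Runnable whileHeld) {
        SketchSession session = pool.getSession(state);
        try {
            whileHeld.run();
        } finally {
            pool.returnSession(state, session);
        }
    }

    // True if a second query on the same state, started while the first is
    // still inside its try block, trips the "should have been returned" check.
    static boolean overlappingQueryFails() {
        SketchPool pool = new SketchPool(2);
        SketchSessionState shared = new SketchSessionState();
        boolean[] failed = {false};
        runQuery(pool, shared, () -> {
            try {
                runQuery(pool, shared, () -> { });
            } catch (IllegalStateException e) {
                failed[0] = true;
            }
        });
        return failed[0];
    }
}
```

In this sketch, two queries run one after the other on the same state never hit the check, because the finally block always clears the state's session; only the overlapping case does.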

> Reuse of default session should not throw an exception in LLAP w/ Tez
> ---------------------------------------------------------------------
>
>                 Key: HIVE-17502
>                 URL: https://issues.apache.org/jira/browse/HIVE-17502
>             Project: Hive
>          Issue Type: Bug
>          Components: llap, Tez
>    Affects Versions: 2.1.1, 2.2.0
>         Environment: HDP 2.6.1.0-129, Hue 4
>            Reporter: Thai Bui
>            Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a default session that is currently in use to be skipped, mostly
because of this line: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients, such as Hue 4, allow multiple sessions to be used per user. Under
this configuration, a Thrift client will send a request to either reuse a session or open a new one.
The reuse request could include the session id of a snippet that is currently being executed
in Hue, which causes HS2 to throw an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO  [Thread-89]: tez.TezSessionPoolManager (TezSessionPoolManager.java:canWorkWithSameSession(512))
- The current user: hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task (TezTask.java:execute(232)) - Failed
to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session sessionId=5b61a578-6336-41c5-860d-9838166f97fe,
queueName=llap, user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591015330ms
should have been returned to the pool
> 	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> 	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> 	at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as a single 'hive' user to share the LLAP daemon pool, and
a predetermined number of AMs is initialized at setup time. Thus, HS2 should allow
new sessions from a Thrift client to be served out of the pool, or an in-use session to be
skipped and an unused session from the pool to be returned instead. The logic that throws an exception
in `canWorkWithSameSession` doesn't make sense to me.
> I have a solution to fix this issue in my local branch at https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
With the fix applied, the log looks like this:
> {noformat}
> 2017-09-10T09:15:33,578 INFO  [Thread-239]: tez.TezSessionPoolManager (TezSessionPoolManager.java:canWorkWithSameSession(533))
- Skipping default session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap,
user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms since it is being
used.
> {noformat}
> A test case is provided in my branch to demonstrate how it works. If possible, I would
like this patch to be applied to versions 2.1 and 2.2 and to master. Since we are using 2.1 LLAP
in production with Hue 4, this patch is critical to our success.
> Alternatively, if this patch is too broad in scope, I propose adding an option to allow
"skipping of currently used default sessions". With this new option defaulting to "false", existing
behavior won't change unless the option is turned on.
> I will prepare an official patch if this change to master and/or the other branches is
acceptable. I'm not a contributor or committer; this will be my first time contributing
to Hive and the Apache foundation. Any early review is greatly appreciated, thanks!
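
The behavior proposed in the description (skip a default session that is in use and hand out an idle one, guarded by an option that defaults to "false") could look roughly like the sketch below. This is only an illustration under assumed names: the classes, the `skipInUseDefaultSessions` flag and the `acquire`/`release` methods are hypothetical and are not taken from the linked commit or from TezSessionPoolManager. Unlike a real pool, the sketch throws instead of waiting when no idle session is left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a default (pool) session; not a Hive class.
class DefaultSessionSketch {
    final String id;
    boolean inUse;
    DefaultSessionSketch(String id) { this.id = id; }
}

class SkippingPoolSketch {
    private final List<DefaultSessionSketch> sessions = new ArrayList<>();
    // Hypothetical option; "false" keeps the existing throw-on-reuse behavior.
    private final boolean skipInUseDefaultSessions;

    SkippingPoolSketch(int size, boolean skipInUseDefaultSessions) {
        this.skipInUseDefaultSessions = skipInUseDefaultSessions;
        for (int i = 0; i < size; i++) {
            sessions.add(new DefaultSessionSketch("default-" + i));
        }
    }

    // 'requested' is the session the client asked to reuse, or null for "any".
    DefaultSessionSketch acquire(DefaultSessionSketch requested) {
        if (requested != null) {
            if (!requested.inUse) {
                requested.inUse = true;
                return requested; // idle, so plain reuse is fine
            }
            if (!skipInUseDefaultSessions) {
                throw new IllegalStateException(
                    "The pool session " + requested.id + " should have been returned to the pool");
            }
            // Option on: skip the busy session and fall through to an idle one.
        }
        for (DefaultSessionSketch candidate : sessions) {
            if (!candidate.inUse) {
                candidate.inUse = true;
                return candidate;
            }
        }
        throw new IllegalStateException("no idle session left in the sketch pool");
    }

    void release(DefaultSessionSketch session) {
        session.inUse = false;
    }
}
```

With the flag off, asking to reuse a busy session fails as in the stack trace above; with it on, the caller gets a different, idle session from the pool.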



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)