hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Siddharth Seth" <ss...@apache.org>
Subject Re: Review Request 41377: HIVE-12528 don't start HS2 Tez sessions in a single thread
Date Sat, 19 Dec 2015 02:00:28 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41377/#review111342
-----------------------------------------------------------


TezSessionState - does that need to be synchronized on various methods like openSession, getSession,
closeSession. 
To me, it seems like access to these variables is not thread safe.

Even something like TezSessionState.queueName. That's setup in the main thread in TezSessionPoolManager.setupSessionPool.
Then it is accessed in separate threads - without being a final field, or protected by synchronzied
blocks.


ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java (line 974)
<https://reviews.apache.org/r/41377/#comment171479>

    General: Some docs mentioning that this method is meant to work across threads and processes,
and how that complicates the notifiers / sleep.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java (line 987)
<https://reviews.apache.org/r/41377/#comment171478>

    Maybe a higher interval or #of attempts, so that most files in the same JVM don't go into
the IOException path.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java (line 1025)
<https://reviews.apache.org/r/41377/#comment171480>

    Seems like this is chceked far too often. Required here. Wonder if it can be removed further
up.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java (line 1027)
<https://reviews.apache.org/r/41377/#comment171481>

    Name of the file in the log will be useful



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java (line 1032)
<https://reviews.apache.org/r/41377/#comment171484>

    waitAttempts isn't actually used. Drop from the api, or use instead of the sleep for loop
in the exception block.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java (line 88)
<https://reviews.apache.org/r/41377/#comment171485>

    Why is this required ?
    
    and why not in the single thread case.



ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java (line 118)
<https://reviews.apache.org/r/41377/#comment171486>

    Is there any part of the test which verifies that threads are being used / sessions are
being setup in parallel. Not sure if there's a good way to verify that though - without setting
up explicit properties for tests only.


- Siddharth Seth


On Dec. 18, 2015, 10:10 p.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41377/
> -----------------------------------------------------------
> 
> (Updated Dec. 18, 2015, 10:10 p.m.)
> 
> 
> Review request for hive, Gunther Hagleitner and Siddharth Seth.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see JIRA
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 36e281a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 6e196e6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java 0d84340 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java e1a8041 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java 11c0325 
> 
> Diff: https://reviews.apache.org/r/41377/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message