hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mohit Sabharwal <mo...@cloudera.com>
Subject Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2
Date Thu, 06 Oct 2016 20:56:57 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review151693
-----------------------------------------------------------



Thanks for the changes! Couple more questions.

For readability, please add a blank line before every new block.

Hive follows Sun's convention (except uses 100char line limit instead of 80):
  http://web.archive.org/web/20140228225807/http://www.oracle.com/technetwork/java/codeconventions-150003.pdf


common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java (line 25)
<https://reviews.apache.org/r/52283/#comment220191>

    Is this actually mocked anywhere in the tests ? I see the tests mock the env via a HashMap.
    
    If the use of this interface is just to mock, I'm not sure if there is a good reason to
use in the non-test code. Could we just directly use System.getenv there ? i.e. we can get
rid of this file altogether.



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 2524)
<https://reviews.apache.org/r/52283/#comment220140>

    run using MR or Spark execution engines. This functionality has not been tested with Tez.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 140)
<https://reviews.apache.org/r/52283/#comment220142>

    "needs to be " -> "is" ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 141)
<https://reviews.apache.org/r/52283/#comment220144>

    nit: "therefore not supported"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 144)
<https://reviews.apache.org/r/52283/#comment220145>

    nit: "job" -> "MR Job"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 146 - 151)
<https://reviews.apache.org/r/52283/#comment220148>

    Could you make the following clear:
    
    Either:
    (1) HIVE_SERVER2_JOB_CREDSTORE_LOCATION should be configured AND HIVE_JOB_CREDSTORE_PASSWORD
environment should be set.
    OR
    (2) HADOOP_CREDSTORE_PASSWORD environment should be set.
    
    IOW It's important to state that HIVE_SERVER2_JOB_CREDSTORE_LOCATION and HADOOP_CREDSTORE_PASSWORD
do not work together.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 148)
<https://reviews.apache.org/r/52283/#comment220146>

    "this adds" -> "we use"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 164)
<https://reviews.apache.org/r/52283/#comment220163>

    Please add single blank line before a new block. Same in other places.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 173 - 175)
<https://reviews.apache.org/r/52283/#comment220192>

    any reason we can use System.getenv directly ? Same in other places



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 179)
<https://reviews.apache.org/r/52283/#comment220151>

    If both HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, what happens
? Should we throw an exception ?
    
    What if HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, but HIVE_SERVER2_JOB_CREDSTORE_LOCATION
is set ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 188)
<https://reviews.apache.org/r/52283/#comment220150>

    Is this ok if this is null or empty ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 203 - 204)
<https://reviews.apache.org/r/52283/#comment220155>

    Wondering if this should really be a RuntimeException, since query is pretty much hosed
at this point, correct ? i.e. rollback the stack and abandon this query with an exception.



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java (line 91)
<https://reviews.apache.org/r/52283/#comment220156>

    needed ?



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java (line 98)
<https://reviews.apache.org/r/52283/#comment220158>

    needed ?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java (lines 422 - 426)
<https://reviews.apache.org/r/52283/#comment220165>

    Easier: job.set(Constants.HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG, origKeystoreLocation
== null: "", origKeystoreLocation)
    
    Wait, shouldn't we set the config before submitting the job ?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
<https://reviews.apache.org/r/52283/#comment220166>

    Blank line ok before new blocks or comments



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java (lines 790 - 791)
<https://reviews.apache.org/r/52283/#comment220167>

    are you fixing something that was broken ? JOBCONF_FILENAME was earlier local but you
to want to allow it on any fs ?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java (lines 201 -
202)
<https://reviews.apache.org/r/52283/#comment220179>

    Needs space after //
    
    What happens to HIVE_SERVER2_JOB_CREDSTORE_LOCATION ? Does that need to be set or checked
if it is set ? I mean, do both HIVE_SERVER2_JOB_CREDSTORE_PASSWORD_ENVVAR and  HIVE_SERVER2_JOB_CREDSTORE_LOCATION
need to be set for former to be used ?
    
    It's worth clarifying here how Spark implementation differs from MR. I am assuming this
step is in addition to the updateJobCredentialProviders invocation in Spark.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java (line 117)
<https://reviews.apache.org/r/52283/#comment220180>

    //u -> // U...
    
    same in other places



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 20)
<https://reviews.apache.org/r/52283/#comment220181>

    expand all imports



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 84 - 85)
<https://reviews.apache.org/r/52283/#comment220182>

    nit: inline oldCredStoreLocation here and in other places



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 87)
<https://reviews.apache.org/r/52283/#comment220164>

    Please limit lines to max 100 characters. 
    
    Same in other places.



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 90 - 91)
<https://reviews.apache.org/r/52283/#comment220183>

    again, inlined value easier to read



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 103)
<https://reviews.apache.org/r/52283/#comment220184>

    Are we testing the case where both HADOOP_CREDENTIAL_PASSWORD_ENVVAR and HIVE_JOB_CREDSTORE_PASSWORD_ENVVAR_VAL
are not set ?



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java (line 447)
<https://reviews.apache.org/r/52283/#comment220185>

    // -> // A...



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java (lines 450 -
454)
<https://reviews.apache.org/r/52283/#comment220190>

    This pattern of choosing the correct password is repeated couple times in this patch.
Should we write a single utility method for this ?


- Mohit Sabharwal


On Oct. 5, 2016, 9:58 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2016, 9:58 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2.
It also adds support for job-specific credential provider and password which is passed to
the jobs via the job configs when they are launched from HS2. The hive specific credential
provider location is specified by a new configuration property specific to hiveserver2 and
the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08

>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc

>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b

>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b

>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed

>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e

>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086

>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64

>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd

>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION

>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1

> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message