hive-dev mailing list archives

From "Eugene Koifman" <ekoif...@hortonworks.com>
Subject Re: Review Request 22329: HIVE-7190. WebHCat launcher task failure can cause two concurent user jobs to run
Date Sat, 07 Jun 2014 01:51:03 GMT


> On June 7, 2014, 1:05 a.m., Eugene Koifman wrote:
> > 1. I think webhcat-default.xml should be modified to include the jars that are now required in templeton.libjars to minimize out-of-the-box config for end users.
> > 2. Is there any test (e2e) that can be added for this? (with reasonable amount of effort)
> > 3. When you tested that Pig/Hive jobs get properly tagged, you mean you tested that MR jobs that are generated by Pig/Hive are tagged, correct?

4. Actually, instead of doing 1, could WebHCat dynamically figure out which Hadoop version
it's talking to and add only the necessary shim jar, rather than shipping all of them? That
would reduce the amount of config needed, and we would ship only the minimal set of jars.
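
To make point 4 concrete, here is a rough sketch of what picking the shim by version might look like. It is only an illustration: the class, the method names and the jar file names are hypothetical, and the version-to-shim mapping (1.x to 0.20S, 2.x to 0.23) is an assumption based on the shim directories touched by this patch. In WebHCat the version string could presumably come from Hadoop's VersionInfo.getVersion(); the sketch takes it as a parameter so it runs without Hadoop on the classpath.

```java
// Hypothetical helper; not part of WebHCat or of the patch under review.
final class ShimJarSketch {

    /**
     * Maps a Hadoop version string (for example "2.4.0") to a shim name.
     * Assumed mapping: 1.x uses the 0.20S shim, 2.x uses the 0.23 shim.
     */
    static String shimFor(String hadoopVersion) {
        String[] parts = hadoopVersion.split("\\.");
        int major = Integer.parseInt(parts[0]);
        if (major == 1) {
            return "0.20S";
        }
        if (major == 2) {
            return "0.23";
        }
        throw new IllegalArgumentException("No shim known for Hadoop " + hadoopVersion);
    }

    /**
     * Builds a comma-separated jar list containing the common shim jar plus
     * the one version-specific shim jar. The jar names are placeholders.
     */
    static String libJars(String hadoopVersion) {
        return "hive-shims-common.jar,hive-shims-" + shimFor(hadoopVersion) + ".jar";
    }
}
```

With this, libJars("2.4.0") would list the common jar and the 0.23 shim jar only, and an unrecognized version fails loudly instead of silently shipping nothing.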


- Eugene


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22329/#review44992
-----------------------------------------------------------


On June 6, 2014, 10:02 p.m., Ivan Mitic wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22329/
> -----------------------------------------------------------
> 
> (Updated June 6, 2014, 10:02 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The approach in the patch is similar to what Oozie does to handle this situation. Specifically, all child map jobs get tagged with the launcher MR job id. On launcher task restart, the launcher queries the RM for the list of jobs that have the tag and kills them. After that it moves on to start the same child job again. Again, similarly to what Oozie does, a new templeton.job.launch.time property is introduced that captures the launcher job submit timestamp and is later used to reduce the search window when the RM is queried.
> 
> To validate the patch, you will need to add webhcat shim jars to templeton.libjars as now webhcat launcher also has a dependency on hadoop shims.
> 
> I have noticed that in the case of the SqoopDelegator, webhcat currently does not set the MR delegation token when the optionsFile flag is used. This also causes a problem in this scenario. It looks like something that should be handled via a separate Jira.
> 
> 
> Diffs
> -----
> 
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/HiveDelegator.java 23b1c4f
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/JarDelegator.java 41b1dc5
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/LauncherDelegator.java 04a5c6f
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/PigDelegator.java 04e061d
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SqoopDelegator.java adcd917
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/JobSubmissionConstants.java a6355a6
>   hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/tool/LaunchMapper.java 556ee62
>   shims/0.20S/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim20S.java d3552c1
>   shims/0.23/src/main/java/org/apache/hadoop/mapred/WebHCatJTShim23.java 5a728b2
>   shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 299e918
> 
> Diff: https://reviews.apache.org/r/22329/diff/
> 
> 
> Testing
> -------
> 
> I have validated that MR, Pig and Hive jobs do get tagged appropriately. I have also validated that previous child jobs do get killed on RM failover/task failure.
> 
> 
> Thanks,
> 
> Ivan Mitic
> 
>
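
For readers following the description quoted above, the restart handling can be sketched roughly as follows. This is not code from the patch: the Job, Cluster and InMemoryCluster types are hypothetical stand-ins for whatever the shims expose, and the in-memory cluster exists only so the logic can be run without a real RM. The launchTime argument plays the role of templeton.job.launch.time, narrowing the set of jobs that is scanned for the launcher's tag.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types exist in WebHCat or Hadoop.
final class LauncherRestartSketch {

    /** Minimal stand-in for a job as the RM might report it. */
    static final class Job {
        final String id;
        final String tag;       // launcher MR job id the child was tagged with
        final long startTime;   // submit time in milliseconds
        boolean killed;

        Job(String id, String tag, long startTime) {
            this.id = id;
            this.tag = tag;
            this.startTime = startTime;
        }
    }

    /** Hypothetical, simplified view of the RM. */
    interface Cluster {
        List<Job> jobsStartedSince(long timestampMillis);
        void kill(Job job);
    }

    /** In-memory Cluster used to exercise the logic without a real RM. */
    static final class InMemoryCluster implements Cluster {
        private final List<Job> jobs = new ArrayList<>();

        void add(Job job) {
            jobs.add(job);
        }

        @Override
        public List<Job> jobsStartedSince(long timestampMillis) {
            List<Job> result = new ArrayList<>();
            for (Job job : jobs) {
                if (!job.killed && job.startTime >= timestampMillis) {
                    result.add(job);
                }
            }
            return result;
        }

        @Override
        public void kill(Job job) {
            job.killed = true;
        }
    }

    /**
     * Kills every live job that carries the launcher's tag and was started at
     * or after launchTime. Returns the ids of the jobs that were killed, so the
     * caller can log them before resubmitting the child job.
     */
    static List<String> killTaggedChildren(Cluster cluster, String launcherJobId, long launchTime) {
        List<String> killedIds = new ArrayList<>();
        for (Job job : cluster.jobsStartedSince(launchTime)) {
            if (launcherJobId.equals(job.tag)) {
                cluster.kill(job);
                killedIds.add(job.id);
            }
        }
        return killedIds;
    }
}
```

A restarted launcher task would call killTaggedChildren with its own job id and the recorded launch time, and only then start the child job again. Calling it a second time finds nothing left to kill, which is the property that keeps two copies of the user job from running at once.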

