hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yongzhi Chen <yc...@cloudera.com>
Subject Re: Review Request 58456: Query cancel: improve the way to handle files
Date Thu, 20 Apr 2017 12:56:42 GMT


> On April 19, 2017, 5:50 p.m., Chaoyu Tang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
> > Lines 46 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692688#file1692688line46>
> >
> >     To be honest, I am not very comfortable to import the Driver here. I thought
the CombineHiveInputFormat in io package is at a lower architecture layer than Driver ql.

> >     Is there any other way which we can detect if the thread has been interrupted
(e.g. Thread.getCurrentThread().isInterrupted() etc?
> >     Also as I recall (if I am right), there might be a class which handles this
interrupt signal globally, I could not find it at this moment.

The check in CombineHiveInputFormat is just to check the threadlocal object, it is following
the same pattern to check hive's own cancel related status. I think we are trying to avoid
use the 
In the CombineHiveInputFormat, it can include:
import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.exec.Utilities;

I do not think it is a problem to import
org.apache.hadoop.hive.ql.Driver


> On April 19, 2017, 5:50 p.m., Chaoyu Tang wrote:
> > service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
> > Lines 399 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692689#file1692689line399>
> >
> >     As I understand, basically the cleanup is called with the parameter state value
CANCELED, TIMEOUT and CLOSED, and here you are trying to address the race issue in the normal
CLOSE case where the thread should not be interrupted and further clean the tmp file. Is it
right?
> >     Another thought, could moving the code
> >     {code}
> >           ss.deleteTmpOutputFile();
> >           ss.deleteTmpErrOutputFile();
> >     {code}
> >     from sqlOperation to driver close() or destroy() will be help to solve the problem?

The exception error "Failed to clean-up tmp directories." is from 
Utilities.clearWork(job); from execute, the clearWork cleans the folders used for map and
reduce plan path.
ss.deleteTmpOutputFile(); ss.deleteTmpErrOutputFile(); is to clean the output data tmp folder,
so they are different. 

  /**
   * Temporary file name used to store results of non-Hive commands (e.g., set, dfs)
   * and HiveServer.fetch*() function will read results from this file
   */
  protected File tmpOutputFile;

  /**
   * Temporary file name used to store error output of executing non-Hive commands (e.g.,
set, dfs)
   */
  protected File tmpErrOutputFile;


- Yongzhi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172374
-----------------------------------------------------------


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being
passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption.
By what I observed that the method seems not very effective for scheduled operation, for example,
the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b

>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6

>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b

>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca

> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message