hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vaibhav Gumashta" <vgumas...@hortonworks.com>
Subject Re: Review Request 14950: Make JDBC use the new HiveServer2 async execution API by default
Date Sat, 01 Mar 2014 00:41:31 GMT


> On Feb. 18, 2014, 6:41 p.m., Prasad Mujumdar wrote:
> > @Vaibhav, thanks for the patch and following up on the overall async execution line
items!
> > 
> > The patch itself looks fine on the first pass. I do have a high level question on
the approach. The patch enables async execution by defaults on client side and adds synchronous
wait top of that. This IHMO defeats the purpose of the aysnc execution. From the application
point of view, there's no difference in the behavior with or without this patch. The execution
will block till the query execution is complete.
> > The rationale of async execution is to return control back to client immediately
so that the client has an option to perform alternate foreground work while the query is being
processed. Have you considered  blocking in fetch for queries with resultset ? That in would
give more time for server to process the query in parallel while client examines the resultset
description and possible interact with end user etc.
> 
> Vaibhav Gumashta wrote:
>     @Prasad: Thanks a lot for taking a look! I was basically going for a use case when
an error can be detected sooner. For example consider the scenario when a stmt.cancel is called
from a separate thread:
>     Blocking calls:
>     When stmt.execute is called, internally HiveStatement sets a variable stmtHandle,
to refer to the Operation object that it created for the query execution. However, for a blocking
call, the stmtHandle is set only when the call to Operation.run returns. If we call a stmt.cancel
(in a different thread) during this, the internal code will check for the value of stmtHandle
and return immediately if it is null, essentially doing a no-op. 
>     Async calls:
>     Since an async call immediately returns from Operation.run, the stmtHandle is set
in HiveStatement while the query is still executing. This means, that a stmt.cancel can actually
cancel the underlying operation and the original stmt.execute will throw an error when we
poll for the operation status (since the underlying operation handle is gone). This can return
a useful response to the client sooner.
>     
>     However, I agree blocking on fetch would be more efficient. Do you think I can take
that up in a follow up jira since I may not have a lot of time to work on it for Hive 13 release?
>     
>     Thanks again!
> 
> Thejas Nair wrote:
>     Yes, I think it is ok to address that in a follow up jira. This is an optimization,
and the changes are useful even without that. What do you think Prasad ?
>
> 
> Prasad Mujumdar wrote:
>     Sounds good. 
>     @Viabhav, are you planning to log a follow up ticket or I can do that. This will
be especially useful for beeline and normal single thread client. Thanks!

@Prasad: I can do that, but feel free to create one if you're planning to work sooner. Thanks
a lot for reviewing the patch. 


- Vaibhav


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14950/#review34728
-----------------------------------------------------------


On Feb. 20, 2014, 9:18 a.m., Vaibhav Gumashta wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14950/
> -----------------------------------------------------------
> 
> (Updated Feb. 20, 2014, 9:18 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Should be applied on top of:
> HIVE-5217 [Add long polling to asynchronous execution in HiveServer2]
> HIVE-5229 [Better thread management for HiveServer2 async threads]
> HIVE-5230 [Better error reporting by async threads in HiveServer2]
> HIVE-5441 [Async query execution doesn't return resultset status] 
> 
> 
> Diffs
> -----
> 
>   jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java f0d0c77 
> 
> Diff: https://reviews.apache.org/r/14950/diff/
> 
> 
> Testing
> -------
> 
> TestJdbcDriver2
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message