hive-dev mailing list archives

From "Ferdinand Xu (JIRA)" <>
Subject [jira] [Commented] (HIVE-7553) avoid the scheduling maintenance window for every jar change
Date Thu, 07 Aug 2014 08:46:11 GMT


Ferdinand Xu commented on HIVE-7553:

I am now working on this issue, but before putting up a patch I want to present the approach
so that I can get some feedback.
To my understanding, this issue is about hot-swapping the jar files under HIVE_AUX_JARS_PATH.
I did some POCs locally.
The main workflow of the jar loader is as follows:
1. Read the env variable HIVE_AUX_JARS_PATH and parse it, adding the jar files under
the directory one by one.
2. The system classloader loads the jar files from step 1.
3. When creating a UDF based on the added aux jars, FunctionTask tries to get the
class from the loaded classes by calling the getUdfClass method.
In my view, the key to solving this problem is the classloader.
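
Step 1 of the workflow above could be sketched roughly as follows. This is only an illustration, not Hive's actual code: the class name AuxJarScanner and the method listJars are hypothetical, and it assumes HIVE_AUX_JARS_PATH points at a single directory.

```java
import java.io.IOException;
import java.net.URL;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive.
public class AuxJarScanner {

    /** Collects every *.jar directly under the given directory as a URL. */
    static List<URL> listJars(Path dir) throws IOException {
        List<URL> jars = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toUri().toURL());
            }
        }
        return jars;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the variable holds one directory path.
        String auxPath = System.getenv("HIVE_AUX_JARS_PATH");
        if (auxPath == null) {
            System.out.println("HIVE_AUX_JARS_PATH is not set");
            return;
        }
        for (URL jar : listJars(Paths.get(auxPath))) {
            System.out.println(jar);
        }
    }
}
```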
The class loader has some "limitations", which are really by design:
a) When looking up a class, it checks the parent classloader first to see whether the class
is already loaded, and only then the current classloader.
b) A classloader has no mechanism for reloading a cached class.
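
Both points can be observed with a small standalone program. The sketch below uses only the JDK; the class name DelegationDemo and its method are made up for illustration. A child URLClassLoader asked for a class its parent already knows returns the parent's Class object, and asking the same loader again returns the same cached object.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class, not part of Hive.
public class DelegationDemo {

    /**
     * Returns true if a child loader hands back the parent's Class object
     * (point a) and a repeated lookup returns the cached object (point b).
     */
    static boolean delegatesAndCaches(String className) throws Exception {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            Class<?> viaParent = parent.loadClass(className);
            Class<?> viaChild = child.loadClass(className);   // delegates to parent first
            Class<?> viaChildAgain = child.loadClass(className);
            return viaChild == viaParent && viaChildAgain == viaChild;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(delegatesAndCaches(DelegationDemo.class.getName()));
    }
}
```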
Based on this, I have come up with the following solutions, in three categories.
1. Change the order of loading classes
As mentioned in step 1 above, the auxiliary class path is parsed and loaded when HiveServer2 boots
up. Can we postpone the loading phase until it is needed, i.e. load the jars on the fly when
creating a UDF? In addition, reloading cached jars is a tricky issue. To resolve it, we should
create a new classloader each time and then call Thread.currentThread().setContextClassLoader(refreshedCL)
and HiveConf.setClassLoader(refreshedCL).
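
A minimal sketch of the "new classloader each time" idea, using only the JDK. The class RefreshableLoader and the method refresh are hypothetical names; the HiveConf call is left as a comment because this sketch does not depend on Hive.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hypothetical helper, not part of Hive.
public class RefreshableLoader {

    /**
     * Builds a brand-new loader over the current jar list and installs it as
     * the calling thread's context classloader. The same base parent should be
     * passed on every refresh; chaining onto the previous refreshed loader
     * would let the old cached classes win through parent-first delegation.
     */
    static ClassLoader refresh(List<URL> jars, ClassLoader baseParent) {
        URLClassLoader refreshedCL =
                new URLClassLoader(jars.toArray(new URL[0]), baseParent);
        Thread.currentThread().setContextClassLoader(refreshedCL);
        // In HiveServer2 the configuration object would presumably also be
        // updated here, e.g. conf.setClassLoader(refreshedCL) (assumption).
        return refreshedCL;
    }
}
```

Note that this sketch never closes the previous loader; a real patch would have to decide when the old loader and its open jar files can be released.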
2. Override the standard classloader
Keep the current loading order, but make the classloader consult the child classloader
first and then the parent (a new classloader still needs to be created on the fly).
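
One common way to get child-first lookup is to subclass URLClassLoader and override loadClass. The following is a sketch under that assumption, not a proposed patch; the class name ChildFirstClassLoader is illustrative.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative child-first loader, not part of Hive.
public class ChildFirstClassLoader extends URLClassLoader {

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Look in this loader's own jars first.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Fall back to the usual parent-first lookup.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

A class that exists both in the aux jars and on the parent's class path would then be taken from the aux jars, which is what a refreshed jar needs. A class loaded this way is a different Class object from the parent's copy, so instances of the two cannot be cast to each other.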
3. Others
Use OSGi? JRebel?
If I have gotten anything wrong, please feel free to point it out. Thanks!

> avoid the scheduling maintenance window for every jar change
> ------------------------------------------------------------
>                 Key: HIVE-7553
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Ferdinand Xu
>            Assignee: Ferdinand Xu
> When a user needs to refresh an existing jar or add a new jar to HS2, HS2 needs to be restarted.
> As HS2 is a service exposed to clients, this requires scheduling a maintenance window for every
> jar change. It would be great if we could avoid that.
