hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Navis Ryu" <navis....@nexr.com>
Subject Re: Review Request 26854: HIVE-2573 Create per-session function registry
Date Fri, 24 Oct 2014 05:41:34 GMT


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java, line 42
> > <https://reviews.apache.org/r/26854/diff/1-3/?file=723906#file723906line42>
> >
> >     Can we replace isNative/isPersistent with an enum that has BUILTIN, PERMANENT,
TEMPORARY (or equivalent terms)?

Sure.


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java, line 128
> > <https://reviews.apache.org/r/26854/diff/1-3/?file=723908#file723908line128>
> >
> >     Why was this check removed? If you are permanent UDFs with Hive CLI, you would
have to make sure the UDF resources are available from the cluster as opposed to just on the
local filesystem of the client that created the UDF.

Seemed removed by mistake. I'll revert that.


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java, line 433
> > <https://reviews.apache.org/r/26854/diff/1-3/?file=723909#file723909line433>
> >
> >     I thought builtin functions aren't allowed to be removed?
> >     Does this mean that we could create a function using the same class as a built-in
function (create a synonym), and deleting this new function will cause this class to be removed
from the builtin set?

This should be removed. Thanks.


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java, line 465
> > <https://reviews.apache.org/r/26854/diff/1-3/?file=723909#file723909line465>
> >
> >     There is no longer a way to query the metastore for UDFs apart from the static
initialization. So if one CLI user creates a permanent UDF, another user on CLI, or HS2, will
not be able to use that new UDF if the 2nd CLI or HS2 was initialized before this UDF was
created.

Permanent functions (persistent function seemed better name, imho) are registered to system
registry, which is shared to all clients. So if one user creates new permanent function, it's
shared to all clients. The time a user accesses the function, the class is loaded with required
resources and registered to session registry as a temporary function.


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java, line 511
> > <https://reviews.apache.org/r/26854/diff/1-3/?file=723909#file723909line511>
> >
> >     I think I see what you're trying do here, trying to add a mechanism so that
if a function is deleted in one session, the other sessions will also see it as discarded
if they try to look it up. But I don't actually see discarded being set to true.

My mistake. Fixed.


> On Oct. 23, 2014, 9:50 p.m., Jason Dere wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java, line 152
> > <https://reviews.apache.org/r/26854/diff/3/?file=729496#file729496line152>
> >
> >     I don't think it's necessary to pre-emptively query the metastore for permanent
UDFs during initialization. If we have user on Hive CLI, we will automatically lookup metastore/download
UDF resources, when they may not even be using any of these UDFs during their session. How
about we keep the existing behavior that we only look them up when they are used during a
query?
> >     
> >     Also, if we are doing this during static initialization, is Hive be in a state
that it can query the metastore? Not sure if there is any other initialization that may need
to take place beforehand.

bq. we only look them up when they are used during a query?
This method just stores meta information (classname, resources, etc.) for the function without
loading any resources/classes, lessening redundant metastore accesses from all clients. When
user accesses the function it's registered to session registry with the information as described
above. 

bq. Not sure if there is any other initialization that may need to take place beforehand.
Afaik, metastore has static lock for initialization which is checked before client accesses
internal of it. If it does not act like that, we should fix metastore.


- Navis


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26854/#review57952
-----------------------------------------------------------


On Oct. 23, 2014, 12:20 a.m., Jason Dere wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26854/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2014, 12:20 a.m.)
> 
> 
> Review request for hive, Navis Ryu and Thejas Nair.
> 
> 
> Bugs: HIVE-2573
>     https://issues.apache.org/jira/browse/HIVE-2573
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Small updates to Navis' changes:
> - session registry doesn't lookup metastore for UDFs
> - my feedback from Navis' original patch
> - metastore udfs should not be considered native. This allows them to be added/removed
from registry
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 9ac540e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonFunctionInfo.java 93c15c0 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java 074255b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 08e1136 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 569c125 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/WindowFunctionInfo.java efecb05 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b900627 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java
31f906a 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java e43d39f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 22e5b47 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java af633cb 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 46f8052

>   ql/src/test/queries/clientnegative/drop_native_udf.q ae047bb 
>   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out c7405ed

>   ql/src/test/results/clientnegative/create_function_nonudf_class.q.out d0dd50a 
>   ql/src/test/results/clientnegative/drop_native_udf.q.out 9f0eaa5 
>   service/src/test/org/apache/hadoop/hive/service/TestHiveServerSessions.java fd38907

> 
> Diff: https://reviews.apache.org/r/26854/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jason Dere
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message