hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-20441) NPE in GenericUDF when hive.allow.udf.load.on.demand is set to true
Date Mon, 20 Jul 2020 10:11:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20441?focusedWorklogId=461005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-461005 ]

ASF GitHub Bot logged work on HIVE-20441:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jul/20 10:10
            Start Date: 20/Jul/20 10:10
    Worklog Time Spent: 10m 
      Work Description: dengzhhu653 commented on a change in pull request #1242:
URL: https://github.com/apache/hive/pull/1242#discussion_r457248607



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java
##########
@@ -293,9 +293,10 @@ public FunctionInfo registerPermanentFunction(String functionName,
     if (registerToSession) {
       String qualifiedName = FunctionUtils.qualifyFunctionName(
           functionName, SessionState.get().getCurrentDatabase().toLowerCase());
-      if (registerToSessionRegistry(qualifiedName, function) != null) {
+      FunctionInfo newFunction = registerToSessionRegistry(qualifiedName, function);

Review comment:
       The call ```FunctionRegistry.getFunctionInfo(String functionName)``` makes HS2 look up the function in the MetaStore when it is not found in the session or system registry and hive.allow.udf.load.on.demand is enabled. If the function is found, a FunctionInfo created by ```new FunctionInfo(functionName, className, resources)``` is returned, but the genericUDF field of that FunctionInfo is null, see https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java#L67-L74. So when TypeCheckProcFactory.DefaultExprProcessor builds the function expression from the ASTNode, https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java#L935-L948, the genericUDF obtained from ```GenericUDF genericUDF = fi.getGenericUDF();``` is null, and when that genericUDF is later used to create the function expression desc, an NPE is thrown.
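
       For illustration, a minimal, hedged sketch of that failure path (assumes a hive-exec dependency and a hypothetical `children` list; the surrounding parser context is omitted, so this is not the actual Hive code):

```java
import java.util.List;

import org.apache.hadoop.hive.ql.exec.FunctionInfo;
import org.apache.hadoop.hive.ql.exec.FunctionRegistry;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;

class OnDemandUdfNpeSketch {
  // Sketch only: mirrors the call sequence described in this comment.
  static ExprNodeDesc buildFuncDesc(String functionName, List<ExprNodeDesc> children)
      throws Exception {
    // With hive.allow.udf.load.on.demand enabled, the registry falls through to the
    // metastore and returns a FunctionInfo built from (name, className, resources).
    FunctionInfo fi = FunctionRegistry.getFunctionInfo(functionName);
    // That FunctionInfo never had its genericUDF field initialized, so this is null.
    GenericUDF genericUDF = fi.getGenericUDF();
    // The null genericUDF is passed on and the NPE surfaces here
    // (ExprNodeGenericFuncDesc.newInstance, line 236 in the stack trace on the issue).
    return ExprNodeGenericFuncDesc.newInstance(genericUDF, children);
  }
}
```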




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 461005)
    Time Spent: 1.5h  (was: 1h 20m)

> NPE in GenericUDF  when hive.allow.udf.load.on.demand is set to true
> --------------------------------------------------------------------
>
>                 Key: HIVE-20441
>                 URL: https://issues.apache.org/jira/browse/HIVE-20441
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI, HiveServer2
>    Affects Versions: 1.2.1, 2.3.3
>            Reporter: Hui Huang
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20441.1.patch, HIVE-20441.2.patch, HIVE-20441.3.patch, HIVE-20441.4.patch, HIVE-20441.patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When hive.allow.udf.load.on.demand is set to true and HiveServer2 has already been started, a function newly created from another client or HiveServer2 instance is loaded from the metastore the first time it is used.
> When such a UDF is then used in a WHERE clause, we get an NPE like the following (a hedged repro sketch follows the stack trace):
> {code:java}
> Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
>         at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
>         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> {code}
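>  
> For reference, a hedged repro sketch via JDBC (the connection URL, UDF name my_udf and table src are hypothetical; it assumes the UDF was created as a permanent function by another client after this HiveServer2 instance started):
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.Statement;
>
> public class UdfOnDemandNpeRepro {
>   public static void main(String[] args) throws Exception {
>     // hive.allow.udf.load.on.demand=true on the server; my_udf is loaded from
>     // the metastore on first use and the query below then fails with the NPE above.
>     try (Connection conn =
>              DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
>          Statement stmt = conn.createStatement()) {
>       stmt.execute("SELECT * FROM src WHERE my_udf(key) = 1");
>     }
>   }
> }
> {code}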
>  
> The code that loads the UDF from the metastore is:
> {code:java}
> private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName, HiveConf conf) {
>     try {
>       String[] parts = FunctionUtils.getQualifiedFunctionNameParts(functionName);
>       Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(), parts[1]);
>       if (func == null) {
>         return null;
>       }
>       // Found UDF in metastore - now add it to the function registry.
>       FunctionInfo fi = registerPermanentFunction(functionName, func.getClassName(), true,
>           FunctionTask.toFunctionResource(func.getResourceUris()));
>       if (fi == null) {
>         LOG.error(func.getClassName() + " is not a valid UDF class and was not registered");
>         return null;
>       }
>       return fi;
>     } catch (Throwable e) {
>       LOG.info("Unable to look up " + functionName + " in metastore", e);
>     }
>     return null;
>   }
> {code}
>  
> After the function is fetched from the metastore, it is registered into the permanent function list through the method 'registerPermanentFunction'.
> {code:java}
> public FunctionInfo registerPermanentFunction(String functionName,
>       String className, boolean registerToSession, FunctionResource... resources) {
>     FunctionInfo function = new FunctionInfo(functionName, className, resources);
>     // register to session first for backward compatibility
>     if (registerToSession) {
>       String qualifiedName = FunctionUtils.qualifyFunctionName(
>           functionName, SessionState.get().getCurrentDatabase().toLowerCase());
>       if (registerToSessionRegistry(qualifiedName, function) != null) {
>         addFunction(functionName, function);
>         return function;
>       }
>     } else {
>         addFunction(functionName, function);
>     }
>     return null;
>   }
> {code}
> Since the variable registerToSession is true, the locally constructed object 'function' is returned, but its genericUDF field is null, which causes the error.
> We should instead return the FunctionInfo that registerToSessionRegistry returns.
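> A minimal sketch of that change (based on the diff under review above; hedged, not necessarily the exact committed patch):
> {code:java}
> public FunctionInfo registerPermanentFunction(String functionName,
>       String className, boolean registerToSession, FunctionResource... resources) {
>     FunctionInfo function = new FunctionInfo(functionName, className, resources);
>     // register to session first for backward compatibility
>     if (registerToSession) {
>       String qualifiedName = FunctionUtils.qualifyFunctionName(
>           functionName, SessionState.get().getCurrentDatabase().toLowerCase());
>       // Keep and return the FunctionInfo produced by the session registry: unlike
>       // 'function' above, it has its genericUDF field initialized.
>       FunctionInfo newFunction = registerToSessionRegistry(qualifiedName, function);
>       if (newFunction != null) {
>         addFunction(functionName, function);
>         return newFunction;
>       }
>     } else {
>         addFunction(functionName, function);
>     }
>     return null;
>   }
> {code}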



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
