Date: Fri, 24 Aug 2018 08:25:00 +0000 (UTC)
From: "Zhihua Deng (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Comment Edited] (HIVE-20441) NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to true

    [ https://issues.apache.org/jira/browse/HIVE-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591241#comment-16591241 ]

Zhihua Deng edited comment on HIVE-20441 at 8/24/18 8:24 AM:
-------------------------------------------------------------

Nice work! A few questions:
 # The registerToSessionRegistry method has already registered the function in the session cache, so why is addFunction(functionName, function) called again? This may override the UDF that was just registered. What happens if the same query is executed again in the same session? Should we check newFunction != null && isNative before adding the function?
 # If registerToSession is true and the returned newFunction is null, execution goes on to the next addFunction(functionName, function); call. Does that make sense in this case?
In which case should we do this? Thanks.

was (Author: dengzh):
Nice work! A few questions:
 # The registerToSessionRegistry method has already registered the function in the session cache, so why is addFunction(functionName, function) called again? This may override the UDF that was just registered. What happens if the same query is executed again in the same session?
 # If registerToSession is true and the returned newFunction is null, execution goes on to the next addFunction(functionName, function); call. Does that make sense in this case? In which case should we do this? Thanks.

> NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to true
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-20441
>                 URL: https://issues.apache.org/jira/browse/HIVE-20441
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI, HiveServer2
>    Affects Versions: 1.2.1, 2.3.3
>            Reporter: Hui Huang
>            Assignee: Hui Huang
>            Priority: Major
>             Fix For: 2.3.3
>
>         Attachments: HIVE-20441.1.patch, HIVE-20441.patch
>
>
> When hive.allow.udf.load.on.demand is set to true and HiveServer2 is already running, a function newly created by another client or HiveServer2 instance is loaded from the metastore the first time it is used.
> When the UDF is used in a WHERE clause, we get an NPE like:
> {code:java}
> Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
>         at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
>         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> {code}
>
> The code that gets the UDF from the metastore is:
> {code:java}
> private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName, HiveConf conf) {
>   try {
>     String[] parts = FunctionUtils.getQualifiedFunctionNameParts(functionName);
>     Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(), parts[1]);
>     if (func == null) {
>       return null;
>     }
>     // Found UDF in metastore - now add it to the function registry.
>     FunctionInfo fi = registerPermanentFunction(functionName, func.getClassName(), true,
>         FunctionTask.toFunctionResource(func.getResourceUris()));
>     if (fi == null) {
>       LOG.error(func.getClassName() + " is not a valid UDF class and was not registered");
>       return null;
>     }
>     return fi;
>   } catch (Throwable e) {
>     LOG.info("Unable to look up " + functionName + " in metastore", e);
>   }
>   return null;
> }
> {code}
>
> After the function is fetched, it is registered in the permanent function list through the method 'registerPermanentFunction':
> {code:java}
> public FunctionInfo registerPermanentFunction(String functionName,
>     String className, boolean registerToSession, FunctionResource... resources) {
>   FunctionInfo function = new FunctionInfo(functionName, className, resources);
>   // register to session first for backward compatibility
>   if (registerToSession) {
>     String qualifiedName = FunctionUtils.qualifyFunctionName(
>         functionName, SessionState.get().getCurrentDatabase().toLowerCase());
>     if (registerToSessionRegistry(qualifiedName, function) != null) {
>       addFunction(functionName, function);
>       return function;
>     }
>   } else {
>     addFunction(functionName, function);
>   }
>   return null;
> }
> {code}
> Here the variable registerToSession is true, so the object 'function' is returned. But the genericUDF field of that object is null, which causes the error.
> We should instead return the result of the method registerToSessionRegistry.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
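
To make the failure in the issue description concrete, here is a minimal, self-contained sketch. It does not use Hive's real classes: RegistrySketch, its nested FunctionInfo, and the two registerPermanentFunction variants are hypothetical stand-ins. It assumes, as the description implies, that registerToSessionRegistry returns a FunctionInfo whose genericUDF is populated, while the FunctionInfo built from the class name alone leaves it null. The addFunction call and resource handling are left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Hive's function registry; not the real API. */
class RegistrySketch {

  /** Simplified model of FunctionInfo: genericUDF stays null until the class is resolved. */
  static final class FunctionInfo {
    final String name;
    final String className;
    final Object genericUDF; // null for a function that has not been resolved yet

    FunctionInfo(String name, String className, Object genericUDF) {
      this.name = name;
      this.className = className;
      this.genericUDF = genericUDF;
    }
  }

  private final Map<String, FunctionInfo> sessionFunctions = new HashMap<>();

  /** Models registerToSessionRegistry: returns a resolved FunctionInfo, or null on failure. */
  FunctionInfo registerToSessionRegistry(String qualifiedName, FunctionInfo unresolved) {
    if (unresolved.className == null || unresolved.className.isEmpty()) {
      return null; // assumption: an invalid UDF class yields null
    }
    // A plain Object stands in for the instantiated GenericUDF.
    FunctionInfo resolved = new FunctionInfo(qualifiedName, unresolved.className, new Object());
    sessionFunctions.put(qualifiedName, resolved);
    return resolved;
  }

  /** Behaviour described in the issue: the unresolved object is returned. */
  FunctionInfo registerPermanentFunctionBuggy(String name, String className) {
    FunctionInfo function = new FunctionInfo(name, className, null);
    if (registerToSessionRegistry(name, function) != null) {
      return function; // genericUDF is still null here
    }
    return null;
  }

  /** Direction proposed in the issue: return what registerToSessionRegistry returned. */
  FunctionInfo registerPermanentFunctionFixed(String name, String className) {
    FunctionInfo function = new FunctionInfo(name, className, null);
    return registerToSessionRegistry(name, function);
  }

  public static void main(String[] args) {
    RegistrySketch registry = new RegistrySketch();
    FunctionInfo buggy = registry.registerPermanentFunctionBuggy("default.my_udf", "com.example.MyUdf");
    FunctionInfo fixed = registry.registerPermanentFunctionFixed("default.my_udf", "com.example.MyUdf");
    System.out.println("buggy genericUDF is null: " + (buggy.genericUDF == null));
    System.out.println("fixed genericUDF is null: " + (fixed.genericUDF == null));
  }
}
```

In this model the first variant hands back an object with a null genericUDF, which is the kind of value that would later trigger the NPE in ExprNodeGenericFuncDesc.newInstance, while the second returns the resolved entry. Whether the attached patches take exactly this shape is not shown in this message.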