spark-user mailing list archives

From unk1102 <umesh.ka...@gmail.com>
Subject callUdf("percentile_approx",col("mycol"),lit(0.25)) does not compile spark 1.5.1 source but it does work in spark 1.5.1 bin
Date Sun, 18 Oct 2015 07:10:44 GMT
Hi, I am starting a new thread to follow up on the old one. It looks like
the code needed to compile callUdf("percentile_approx",col("mycol"),lit(0.25))
is not merged into the Spark 1.5.1 source, but I don't understand why this
function call works in the Spark 1.5.1 binary distribution (spark-shell).
Please guide.
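
For reference, here is a minimal sketch of the call in question. It assumes Spark 1.5.x built with Hive support and uses a HiveContext, because percentile_approx is a Hive UDAF rather than a native Spark SQL function. The object name, app name, and sample data are hypothetical. callUDF is used here; callUdf, as written in the subject line, is its deprecated spelling. One possible explanation for the difference, not confirmed in this thread, is that spark-shell in the prebuilt binaries provides a HiveContext while the failing test uses a plain SQLContext.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.functions.{callUDF, col, lit}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical driver; names and data are for illustration only.
object PercentileApproxSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("percentile-approx-sketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Assumption: a HiveContext is needed so that the Hive UDAF
    // percentile_approx can be looked up by name.
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // Sample single-column DataFrame with the column name used in the thread.
    val df = sc.parallelize(1 to 100)
      .map(i => Tuple1(i.toDouble))
      .toDF("mycol")

    // Approximate 25th percentile of mycol, resolved through the function registry.
    df.agg(callUDF("percentile_approx", col("mycol"), lit(0.25))).show()

    sc.stop()
  }
}
```

With a plain SQLContext in place of the HiveContext, the same call would be expected to fail at analysis time with an "undefined function" error like the one quoted in the forwarded message.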

---------- Forwarded message ----------
From: "Ted Yu" <yuzhihong@gmail.com>
Date: Oct 14, 2015 3:26 AM
Subject: Re: How to calculate percentile of a column of DataFrame?
To: "Umesh Kacha" <umesh.kacha@gmail.com>
Cc: "Michael Armbrust" <michael@databricks.com>,
"<Saif.A.Ellafi@wellsfargo.com>" <Saif.A.Ellafi@wellsfargo.com>,
"user" <user@spark.apache.org>

I modified DataFrameSuite in the master branch to call percentile_approx
instead of simpleUDF:

- deprecated callUdf in SQLContext
- callUDF in SQLContext *** FAILED ***
  org.apache.spark.sql.AnalysisException: undefined function percentile_approx;
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry$$anonfun$2.apply(FunctionRegistry.scala:64)
  at scala.Option.getOrElse(Option.scala:120)
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:63)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
  at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:505)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:502)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)

SPARK-10671 is included in master.
For 1.5.1, I guess the absence of SPARK-10671 means that Spark SQL treats
percentile_approx as a normal UDF.

Experts can correct me, if there is any misunderstanding.

Cheers


