spark-dev mailing list archives

From Shagun Sodhani <sshagunsodh...@gmail.com>
Subject Re: Exception when using some aggregate operators
Date Tue, 27 Oct 2015 14:32:09 GMT
Will try in a while when I get back. I assume this applies to all functions
other than mean. Also, countDistinct is defined along with all the other SQL
functions, so I don't get the "distinct is not part of function name" part.
On 27 Oct 2015 19:58, "Reynold Xin" <rxin@databricks.com> wrote:

> Try
>
> count(distinct columnname)
>
> In SQL distinct is not part of the function name.
>
> On Tuesday, October 27, 2015, Shagun Sodhani <sshagunsodhani@gmail.com>
> wrote:
>
>> Oops, it seems I made a mistake. The error message is: Exception in thread
>> "main" org.apache.spark.sql.AnalysisException: undefined function
>> countDistinct
>> On 27 Oct 2015 15:49, "Shagun Sodhani" <sshagunsodhani@gmail.com> wrote:
>>
>>> Hi! I was trying out some aggregate functions in Spark SQL and noticed
>>> that certain aggregate operators are not working. These include:
>>>
>>> approxCountDistinct
>>> countDistinct
>>> mean
>>> sumDistinct
>>>
>>> For example, using countDistinct results in an error saying
>>> *Exception in thread "main" org.apache.spark.sql.AnalysisException:
>>> undefined function cosh;*
>>>
>>> I had a similar issue with the cosh operator
>>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Exception-when-using-cosh-td14724.html>
>>> some time back, and it turned out that it was not registered in the
>>> function registry:
>>> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
>>>
>>>
>>> *I think it is the same issue again and would be glad to send over a
>>> PR if someone can confirm if this is an actual bug and not some mistake on
>>> my part.*
>>>
>>>
>>> Query I am using: SELECT countDistinct(`age`) as `data` FROM `table`
>>> Spark Version: 10.4
>>> SparkSql Version: 1.5.1
>>>
>>> I am using the standard example of (name, age) schema (though I am
>>> setting age as Double and not Int as I am trying out maths functions).
>>>
>>> The entire error stack can be found here <http://pastebin.com/G6YzQXnn>.
>>>
>>> Thanks!
>>>
>>
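
To illustrate the point made in the thread that, in SQL, distinct is a keyword inside the call rather than part of the function name, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in SQL engine, not Spark, so the exact error text will differ from Spark's AnalysisException. The table name `people` and the sample rows are hypothetical, chosen to mirror the (name, age) schema with age stored as a double.

```python
import sqlite3

# In-memory database standing in for the (name, age) example table.
# Assumption: table name and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Michael", 29.0), ("Andy", 30.0), ("Justin", 19.0), ("Maria", 30.0)],
)

# DISTINCT is a keyword inside the call; the function itself is count.
distinct_ages = conn.execute(
    "SELECT count(DISTINCT age) AS data FROM people"
).fetchone()[0]

# No SQL function named countDistinct is registered, so this query fails.
try:
    conn.execute("SELECT countDistinct(age) AS data FROM people")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print(distinct_ages)   # 3 distinct ages: 29.0, 30.0, 19.0
print(error_message)   # engine-specific "unknown function" style message
```

The first query succeeds because count is a registered aggregate and DISTINCT only modifies its argument. The second fails because the engine looks up countDistinct as a function name and finds nothing, which is the same kind of lookup failure reported in the thread.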
