spark-issues mailing list archives

From "Takeshi Yamamuro (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-22271) Describe results in "null" for the value of "mean" of a numeric variable
Date Fri, 13 Oct 2017 05:59:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203095#comment-16203095 ]

Takeshi Yamamuro commented on SPARK-22271:
------------------------------------------

How about More -> Attach Files? By the way, I think a text file (CSV or similar) would be better.

> Describe results in "null" for the value of "mean" of a numeric variable
> ------------------------------------------------------------------------
>
>                 Key: SPARK-22271
>                 URL: https://issues.apache.org/jira/browse/SPARK-22271
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>         Environment: 
>            Reporter: Shafique Jamal
>            Priority: Minor
>
> Please excuse me if this issue was addressed already - I was unable to find it.
> Calling .describe().show() on my dataframe results in a value of null for the row "mean":
> {noformat}
> val foo = spark.read.parquet("decimalNumbers.parquet")        
> foo.select(col("numericvariable")).describe().show()
> foo: org.apache.spark.sql.DataFrame = [numericvariable: decimal(38,32)]
> +-------+--------------------+
> |summary|     numericvariable|
> +-------+--------------------+
> |  count|                 299|
> |   mean|                null|
> | stddev|  0.2376438793946738|
> |    min|0.037815489727642...|
> |    max|2.138189366554511...|
> +-------+--------------------+
> {noformat}
> But all of the rows for this column seem OK (I can attach a Parquet file). When I round the column, however, all is fine:
> {noformat}
> foo.select(bround(col("numericvariable"), 31)).describe().show()
> +-------+---------------------------+
> |summary|bround(numericvariable, 31)|
> +-------+---------------------------+
> |  count|                        299|
> |   mean|       0.139522503183236...|
> | stddev|         0.2376438793946738|
> |    min|       0.037815489727642...|
> |    max|       2.138189366554511...|
> +-------+---------------------------+
> {noformat}
> Rounding with a scale of 32 also gives null, though.
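
A possible workaround, not proposed anywhere in this thread and offered only as a sketch: cast the decimal column to double before calling describe(), assuming double precision is acceptable for summary statistics. The sample values and the names decimals, asDouble and meanValue below are hypothetical; only the column name numericvariable and the decimal(38,32) type come from the report. The snippet is written for spark-shell and needs Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In spark-shell, getOrCreate() returns the existing session.
val spark = SparkSession.builder()
  .appName("describe-decimal-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample values, cast to the column type from the report.
val decimals = Seq("0.0378", "0.1395", "2.1381")
  .toDF("raw")
  .select(col("raw").cast("decimal(38,32)").as("numericvariable"))

// Assumption: losing decimal precision is fine for summary statistics.
val asDouble = decimals.select(
  col("numericvariable").cast("double").as("numericvariable"))

val summary = asDouble.describe()
summary.show()

// describe() returns its statistics as strings; pick out the "mean" row.
val meanValue = summary.filter(col("summary") === "mean").head.getString(1)
```

Whether this is appropriate depends on how much precision the summary needs; it sidesteps the decimal arithmetic rather than fixing the behaviour reported above.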




