spark-user mailing list archives

From Olivier Girardot <o.girar...@lateral-thoughts.com>
Subject Compute Median in Spark Dataframe
Date Tue, 02 Jun 2015 12:07:01 GMT
Hi everyone,
Is there any way to compute a median on a column using Spark's DataFrame? I
know you can use stats on an RDD, but I'd rather stay within a DataFrame.
Hive seems to imply that one can use ntile to compute percentiles and
quartiles, and therefore a median.
Does anyone have experience with this?
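
To make the ntile idea concrete, here is a minimal sketch in plain Python,
not Spark or Hive code. It assumes the usual SQL NTILE behaviour (sorted rows
split into n buckets, with earlier buckets taking any extra rows); the helper
names ntile and median_via_ntile are hypothetical and only for illustration.

```python
def ntile(sorted_values, n):
    """Assign each row of an already-sorted list a bucket number 1..n.

    Assumption: mimics SQL NTILE, where the first (len % n) buckets
    each receive one extra row.
    """
    base, extra = divmod(len(sorted_values), n)
    buckets = []
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets


def median_via_ntile(values):
    """Median from a 2-bucket split of the sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("median of an empty column is undefined")
    tiles = ntile(ordered, 2)
    lower = [v for v, t in zip(ordered, tiles) if t == 1]
    upper = [v for v, t in zip(ordered, tiles) if t == 2]
    if len(ordered) % 2 == 1:
        # Odd count: bucket 1 holds the extra row, so its last value is the median.
        return lower[-1]
    # Even count: average the two values on either side of the split.
    return (lower[-1] + upper[0]) / 2
```

With an odd number of rows the largest value in the first bucket is the
median; with an even number it has to be averaged with the smallest value in
the second bucket, so ntile alone does not give the median directly.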

Regards,

Olivier.
