systemml-dev mailing list archives

From Arvind Surve <ac...@yahoo.com.INVALID>
Subject Re: Weighted Statistical Estimates
Date Sun, 19 Feb 2017 06:01:09 GMT
+1

------------------
Arvind Surve
Spark Technology Center
http://www.spark.tc/

From: Felix Schüler <fschueler@posteo.de>
To: dev@systemml.incubator.apache.org
Sent: Saturday, February 18, 2017 9:42 PM
Subject: Re: Weighted Statistical Estimates

Sounds good!

-Felix

On 18.02.2017 21:20, Matthias Boehm wrote:
> Going toward our 1.0 release, I'd like to create consistency across our
> weighted statistics. Conceptually, these weights represent frequency
> counts, i.e., multiplicities of input values.
>
> So far, our documentation does not state any restrictions on these weights,
> but some runtime operations require integer data (marked (I) below), while
> others allow arbitrary floating point data:
>
> * moment
> * cov
> * aggregate
> * table
> * median (I)
> * quantile (I)
> * interQuartileMean (I)
>
> This can lead to unexpected errors, as shown by recent issues such as
> SYSTEMML-1265. Looking at R and its packages like Hmisc or reldist, it
> turns out that they all allow arbitrary weights.
>
> So, relaxing any restrictions of integer weights seems like the right
> choice. As this changes the external behavior - albeit in a generalizing
> manner - we should make this change now. If you have any concerns, let me
> know.
>
> Regards,
> Matthias
>
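
To illustrate the proposal above, here is a minimal Python sketch of a weighted quantile that accepts arbitrary non-negative weights. It is not SystemML's implementation: the function name `weighted_quantile` is hypothetical, and the quantile definition used (the smallest value whose cumulative weight reaches p times the total weight) is an assumption that may differ from what the runtime computes. With integer weights it gives the same result as replicating each value by its frequency count, which is why dropping the integer restriction only generalizes the existing behavior.

```python
def weighted_quantile(values, weights, p):
    """Return the smallest value whose cumulative weight reaches p * total.

    Hypothetical helper for illustration only; weights may be any
    non-negative numbers, not just integer frequency counts.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    # Sort by value so cumulative weights follow the empirical CDF.
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")

    threshold = p * total
    running = 0.0
    for value, weight in pairs:
        running += weight
        # Skip zero-weight entries: they contribute no mass.
        if weight > 0 and running >= threshold:
            return value
    # Guard against floating point round-off in the running sum.
    return max(v for v, w in pairs if w > 0)


# Integer weights behave like replicated data.
values, counts = [3, 1, 2], [2, 1, 3]
replicated = [3, 3, 1, 2, 2, 2]
for p in (0.25, 0.5, 0.75):
    assert weighted_quantile(values, counts, p) == \
        weighted_quantile(replicated, [1] * len(replicated), p)

# Fractional weights are handled by the same code path.
median = weighted_quantile([10, 20, 30], [0.5, 1.5, 1.0], 0.5)
```

Because the routine only accumulates and compares weights, nothing in it depends on the weights being integers; an integer-only restriction would come from an implementation that materializes counts, not from the statistic itself.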



   