systemml-dev mailing list archives

From "Niketan Pansare" <npan...@us.ibm.com>
Subject Re: Weighted Statistical Estimates
Date Sun, 19 Feb 2017 06:04:56 GMT
+1

Thanks,

Niketan 

> On Feb 18, 2017, at 10:01 PM, Arvind Surve <acs_s@yahoo.com.INVALID> wrote:
> 
> +1
> 
> ------------------
> Arvind Surve
> Spark Technology Center
> http://www.spark.tc/
> 
> From: Felix Schüler <fschueler@posteo.de>
> To: dev@systemml.incubator.apache.org 
> Sent: Saturday, February 18, 2017 9:42 PM
> Subject: Re: Weighted Statistical Estimates
> 
> Sounds good!
> 
> -Felix
> 
>> On 18.02.2017 21:20, Matthias Boehm wrote:
>> Heading toward our 1.0 release, I'd like to create consistency across our
>> weighted statistics. Conceptually, these weights represent frequency
>> counts, i.e., multiplicities of input values.
>> 
>> So far, our documentation does not state any restrictions on these weights,
>> but some runtime operations require integer data (marked I below), while
>> others allow arbitrary floating point data:
>> 
>> * moment
>> * cov
>> * aggregate
>> * table
>> * median (I)
>> * quantile (I)
>> * interQuartileMean (I)
>> 
>> This can lead to unexpected errors as shown by recent issues such as
>> SYSTEMML-1265. Looking back to R and its packages like Hmisc or reldist, it
>> turns out that they all allow arbitrary weights.
>> 
>> So, relaxing any restrictions of integer weights seems like the right
>> choice. As this changes the external behavior - albeit in a generalizing
>> manner - we should make this change now. If you have any concerns, let me
>> know.
>> 
>> Regards,
>> Matthias
>> 
> 
> 
> 
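
To illustrate what "arbitrary weights" could mean for an order statistic such as median or quantile, here is a minimal Python sketch. It is not SystemML's implementation; the function name weighted_quantile and the chosen definition (the smallest value whose cumulative weight share reaches p) are assumptions made for illustration. Under this definition, integer weights give the same result as replicating each value by its count, while fractional weights are accepted without change.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_quantile(values, weights, p):
    """Return the smallest value whose cumulative weight share reaches p.

    Hypothetical helper for illustration: weights may be any non-negative
    numbers (integer or fractional), and 0 < p <= 1.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")

    # Drop zero-weight entries and sort the remaining pairs by value.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("at least one weight must be positive")

    # Running sum of weights in value order; the last entry is the total.
    cumulative = list(accumulate(w for _, w in pairs))
    target = p * cumulative[-1]

    # First position whose cumulative weight is >= target.
    idx = min(bisect_left(cumulative, target), len(pairs) - 1)
    return pairs[idx][0]


# Integer weights act as multiplicities: same as the data [10, 20, 20, 30].
median_int = weighted_quantile([10, 20, 30], [1, 2, 1], 0.5)

# Fractional weights are handled by the same code path.
median_frac = weighted_quantile([10, 20, 30], [0.25, 0.5, 0.25], 0.5)
```

Rescaling all weights by a positive constant leaves the result unchanged in this sketch, which is one reason a restriction to integer weights is not needed for this definition.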

