spark-dev mailing list archives

From Xiangrui Meng <men...@gmail.com>
Subject Re: Any interest in 'weighting' VectorTransformer which does component-wise scaling?
Date Tue, 27 Jan 2015 15:54:50 GMT
I would call it Scaler. You might want to add it to the spark.ml pipeline
API. Please check the spark.ml.HashingTF implementation. Note that this
should handle sparse vectors efficiently.
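
To illustrate the sparse-vector point, here is a minimal sketch in plain
Python (not Spark code, and not the implementation from the branch). The
names scale_dense and scale_sparse and the (size, indices, values) tuple are
hypothetical, loosely modeled on how a sparse vector stores its entries. The
idea is that component-wise scaling only needs to touch the stored non-zero
entries, since a zero times a finite weight is still zero.

```python
from typing import List, Sequence, Tuple

# Hypothetical sparse layout for illustration: (size, indices, values).
# This is an assumption, not MLlib's actual SparseVector API.
SparseVec = Tuple[int, List[int], List[float]]


def scale_dense(values: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Component-wise (Hadamard) product of a dense vector with a weight vector."""
    if len(values) != len(weights):
        raise ValueError("vector and weights must have the same length")
    return [v * w for v, w in zip(values, weights)]


def scale_sparse(vec: SparseVec, weights: Sequence[float]) -> SparseVec:
    """Scale only the stored entries of a sparse vector.

    Cost is proportional to the number of stored entries rather than the
    full dimension. Assumes all weights are finite, so implicit zeros
    remain zero and need not be materialized.
    """
    size, indices, values = vec
    if size != len(weights):
        raise ValueError("vector and weights must have the same length")
    scaled = [v * weights[i] for i, v in zip(indices, values)]
    return (size, list(indices), scaled)
```

For example, scale_sparse((5, [1, 3], [4.0, 6.0]), [1.0, 0.5, 1.0, 2.0, 1.0])
multiplies only the entries at indices 1 and 3 and leaves the index
structure unchanged.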

Hadamard and FFTs are quite useful. If you are interested, make sure that
we call an FFT library that is license-compatible with Apache.

-Xiangrui
On Jan 24, 2015 8:27 AM, "Octavian Geagla" <ogeagla@gmail.com> wrote:

> Hello,
>
> I found it useful to implement the Hadamard product
> <https://en.wikipedia.org/wiki/Hadamard_product_%28matrices%29>
> as a VectorTransformer. It can be applied to scale (by a constant) a certain
> dimension (column) of the data set.
>
> Since I've already implemented it and am using it, I thought I'd see if
> there's interest in this feature going in as Experimental.  I'm not sold on
> the name 'Weighter', either.
>
> Here's the current branch with the work (docs, impl, tests).
> <https://github.com/ogeagla/spark/compare/spark-mllib-weighting>
>
> The implementation was heavily inspired by those of StandardScalerModel and
> Normalizer.
>
> Thanks
> Octavian
