flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thomas FOURNIER <thomasfournier...@gmail.com>
Subject Re: FlinkML - Evaluate function should manage LabeledVector
Date Wed, 19 Oct 2016 20:30:38 GMT
Hi,

Two questions:

1- I was thinking of doing this:

implicit def evaluateLabeledVector[T <: LabeledVector] = {

  new EvaluateDataSetOperation[SVM,T,Double]() {

    override def evaluateDataSet(instance: SVM, evaluateParameters:
ParameterMap, testing: DataSet[T]): DataSet[(Double, Double)] = {
      val predictor = ...
      testing.map(l => (l.label, predictor.predict(l.vector)))

    }
  }
}

How can I access to my predictor object (predictor has type
PredictOperation[SVM, DenseVector, T, Double]) ?

2- My first idea was to develop a predictOperation[T <: LabeledVector]
so that I could use implicit def defaultEvaluateDatasetOperation

to get an EvaluateDataSetOperationObject. Is it also valid or not ?

Thanks
Regards

Thomas







2016-10-19 16:26 GMT+02:00 Theodore Vasiloudis <
theodoros.vasiloudis@gmail.com>:

> Hello Thomas,
>
> since you are calling evaluate here, you should be creating an
> EvaluateDataSet operation that works with LabeledVector, I see you are
> creating a new PredictOperation.
>
> On Wed, Oct 19, 2016 at 3:05 PM, Thomas FOURNIER <
> thomasfournier314@gmail.com> wrote:
>
> > Hi,
> >
> > I'd like to improve SVM evaluate function so that it can use
> LabeledVector
> > (and not only Vector).
> > Indeed, what is done in test is the following (data is a
> > DataSet[LabeledVector]):
> >
> > val test = data.map(l => (l.vector, l.label))
> > svm.evaluate(test)
> >
> > We would like to do:
> > sm.evaluate(data)
> >
> >
> > Adding this "new" code:
> >
> > implicit def predictLabeledPoint[T <: LabeledVector] = {
> >  new PredictOperation  ...
> > }
> >
> > gives me a predictOperation that should be used with
> > defaultEvaluateDataSetOperation
> > with the correct signature (ie with T <: LabeledVector and not T<:
> Vector).
> >
> > Nonetheless, tests are failing:
> >
> >
> > it should "predict with LabeledDataPoint" in {
> >
> >   val env = ExecutionEnvironment.getExecutionEnvironment
> >
> >   val svm = SVM().
> >     setBlocks(env.getParallelism).
> >     setIterations(100).
> >     setLocalIterations(100).
> >     setRegularization(0.002).
> >     setStepsize(0.1).
> >     setSeed(0)
> >
> >   val trainingDS = env.fromCollection(Classification.trainingData)
> >   svm.fit(trainingDS)
> >   val predictionPairs = svm.evaluate(trainingDS)
> >
> >   ....
> > }
> >
> > There is no PredictOperation defined for
> > org.apache.flink.ml.classification.SVM which takes a
> > DataSet[org.apache.flink.ml.common.LabeledVector] as input.
> > java.lang.RuntimeException: There is no PredictOperation defined for
> > org.apache.flink.ml.classification.SVM which takes a
> > DataSet[org.apache.flink.ml.common.LabeledVector] as input.
> >
> >
> >
> > Thanks
> >
> > Regards
> > Thomas
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message