No, you don't get 100% accuracy in this case. You don't even want that: it
would be a severe case of overfitting. You would get it only if your
dataset is linearly separable, or separable with a finely tuned kernel, but
in that case SVM would be overkill and more traditional methods would
suffice.
Flink's SVM implementation for binary classification returns "-1" as the
default label for the "negative" class. It's a rather raw implementation,
so it's best to use it only if you have a clear idea of the underlying
process; if you treat it as a black box, as you would with more mature ML
libraries, you could run into problems.
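
To make the label point concrete, here is a minimal sketch in plain Scala
(no Flink dependency). It assumes, based on my reading of the FlinkML
documentation and not on anything in your code, that the SVM works with
labels -1.0 and +1.0. The object name SvmLabels and both helpers are
hypothetical, made up for illustration only.

```scala
object SvmLabels {
  // Assumption: FlinkML's SVM expects labels in {-1.0, +1.0}.
  // Map a 0/1 label onto that convention: anything above 0 becomes +1.0,
  // everything else becomes -1.0.
  def toSigned(label: Double): Double =
    if (label > 0.0) 1.0 else -1.0

  // Fraction of (truth, prediction) pairs that agree.
  // Returns 0.0 for an empty input instead of dividing by zero.
  def accuracy(pairs: Seq[(Double, Double)]): Double =
    if (pairs.isEmpty) 0.0
    else pairs.count { case (truth, predicted) => truth == predicted }.toDouble / pairs.size
}
```

In your job you would apply toSigned when building the training data, for
example by mapping each element x to a new LabeledVector built from
SvmLabels.toSigned(x.label) and x.vector, and then compare predictions
against the remapped labels rather than the original 0/1 ones.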
2016-09-30 22:52 GMT+02:00 Kürşat Kurt <kursat@kursatkurt.com>:
> Hi;
>
>
>
> I am trying to train and predict with the same set. I expect that accuracy
> should be 100%, am I wrong?
>
> If I try to predict with the same set, it fails; it also classifies
> points as “-1”, which is not in the training set.
>
> What is wrong with this code?
>
>
>
> *Code:*
>
> def main(args: Array[String]): Unit = {
>   val env = ExecutionEnvironment.getExecutionEnvironment
>
>   val training = Seq(
>     new LabeledVector(1.0, new SparseVector(10, Array(0, 2, 3), Array(1.0, 1.0, 1.0))),
>     new LabeledVector(1.0, new SparseVector(10, Array(0, 1, 5, 9), Array(1.0, 1.0, 1.0, 1.0))),
>     new LabeledVector(0.0, new SparseVector(10, Array(0, 2), Array(0.0, 1.0))),
>     new LabeledVector(0.0, new SparseVector(10, Array(0), Array(0.0))),
>     new LabeledVector(0.0, new SparseVector(10, Array(0, 2), Array(0.0, 1.0))),
>     new LabeledVector(0.0, new SparseVector(10, Array(0), Array(0.0))))
>
>   val trainingDS = env.fromCollection(training)
>   val testingDS = env.fromCollection(training)
>
>   val svm = new SVM().setBlocks(env.getParallelism)
>   svm.fit(trainingDS)
>
>   val predictions = svm.evaluate(testingDS.map(x => (x.vector, x.label)))
>   predictions.print()
> }
>
>
>
> *Output:*
>
> (1.0,1.0)
>
> (1.0,1.0)
>
> (0.0,1.0)
>
> (0.0,1.0)
>
> (0.0,1.0)
>
> (0.0,1.0)
>
