flink-dev mailing list archives

From Till Rohrmann <till.rohrm...@gmail.com>
Subject Re: [flink-ml] How to use ParameterMap in predict method?
Date Mon, 29 Jun 2015 10:41:58 GMT
Hi Chiwan,

At the moment, the single-element PredictOperation only supports
non-distributed models. This means that it expects the model to be a
single-element DataSet, which can be broadcast to the predict mappers.

If you need more flexibility, you can either extend the PredictOperation
interface or simply use the PredictDataSetOperation, where you have
full control over the data flow you execute.
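
The single-element case can be illustrated with a minimal sketch in plain
Scala. It deliberately does not use the flink-ml API: `KnnSketch`, `KnnModel`,
`squaredDistance`, `predict` and the example data are hypothetical names
introduced here. The assumption is that the training set is small enough to
travel, together with k, inside one model value.

```scala
// Hypothetical names throughout; plain Scala collections, not the flink-ml API.
object KnnSketch {
  type Point = Vector[Double]

  // Everything the per-element predict step needs lives in one value:
  // the parameter k and the (small) labelled training set.
  final case class KnnModel(k: Int, training: Vector[(Point, Double)])

  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Predict one element: majority label among the k nearest training points.
  def predict(model: KnnModel, query: Point): Double = {
    val nearest = model.training
      .sortBy { case (point, _) => squaredDistance(point, query) }
      .take(model.k)
    nearest
      .groupBy { case (_, label) => label }
      .map { case (label, hits) => (label, hits.size) }
      .maxBy { case (_, count) => count }
      ._1
  }

  // Made-up example data: two well-separated clusters with labels 0.0 and 1.0.
  val exampleModel: KnnModel = KnnModel(
    k = 3,
    training = Vector(
      (Vector(0.0, 0.0), 0.0), (Vector(0.1, 0.0), 0.0), (Vector(0.0, 0.1), 0.0),
      (Vector(5.0, 5.0), 1.0), (Vector(5.1, 5.0), 1.0), (Vector(5.0, 5.1), 1.0)
    )
  )

  // Every query sees the same model value, much like a mapper that
  // receives a broadcast single-element model.
  val examplePredictions: Vector[Double] =
    Vector(Vector(0.05, 0.05), Vector(4.9, 5.2)).map(q => predict(exampleModel, q))
}
```

Because k is stored in the model value, the per-element step needs no separate
access to the parameter map. If the training set is too large to be shipped as
one element, this shape no longer fits and the data set variant mentioned above
would be the place to express the computation.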

Cheers,
Till

On Mon, Jun 29, 2015 at 12:16 PM, Chiwan Park <chiwanpark@apache.org> wrote:

> Thank you Till.
>
> I have another question. Can I use a DataSet object as the Model? In KNN, we
> need the DataSet that was given in the fit operation.
>
> But when I set the Model type parameter of PredictOperation to DataSet, the
> return type of the getModel method becomes DataSet[DataSet]. I’m confused by
> this situation.
>
> I would really appreciate any advice on this.
>
>
> Regards,
> Chiwan Park
>
> > On Jun 29, 2015, at 4:43 PM, Till Rohrmann <trohrmann@apache.org> wrote:
> >
> > Hi Chiwan,
> >
> > When you use the single-element predict operation, you always have to
> > implement the `getModel` method. There you have access to the resulting
> > parameters and even to the instance to which the `PredictOperation`
> > belongs. Within this `getModel` method you can initialize all the
> > information you need for the `predict` operation.
> >
> > You can take a look at the `StandardScalerTransformOperation` [1], where
> > the mean and the standard deviation are set in the `getModel` method.
> >
> > Cheers,
> > Till
> >
> > [1]
> > https://github.com/apache/flink/blob/master/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StandardScaler.scala#L197
> >
> > On Sun, Jun 28, 2015 at 1:49 PM, Chiwan Park <chiwanpark@apache.org> wrote:
> >
> >> Hi, I’m implementing k-nearest-neighbors classification based on the
> >> flink-ml structure.
> >>
> >> In a recent commit (7a7a2940 [1]), the pipeline was restructured by
> >> dividing the predict operation into a single-element case and a data set
> >> case. In the data set case, the parameter map is given as a method
> >> parameter, but in the single-element case there is no method to access
> >> the parameter map.
> >>
> >> But in k-nearest-neighbors classification, we need to know k in the
> >> predict method to select the top k values.
> >>
> >> How can I solve this problem?
> >>
> >> Regards,
> >> Chiwan Park
> >>
> >> [1]
> >> https://github.com/apache/flink/commit/7a7a294033ef99c596e59f670e2e4ae9262f5c5f
