flink-dev mailing list archives

From Chiwan Park <chiwanp...@apache.org>
Subject Re: [flink-ml] How to use ParameterMap in predict method?
Date Tue, 30 Jun 2015 08:48:20 GMT
Thanks Till :)

I rewrote my implementation using PredictDataSetOperation.

Regards,
Chiwan Park


> On Jun 29, 2015, at 7:41 PM, Till Rohrmann <till.rohrmann@gmail.com> wrote:
> 
> Hi Chiwan,
> 
> at the moment the single-element PredictOperation only supports
> non-distributed models. This means that it expects the model to be a
> single-element DataSet which can be broadcast to the predict mappers.
> 
> If you need more flexibility, you can either extend the PredictOperation
> interface or simply use the PredictDataSetOperation, where you have
> full control over the data flow you execute.
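
The broadcast pattern described above can be pictured with plain Scala collections. This is only a rough sketch under stated assumptions: `Partitions` and `predictAll` are hypothetical names, not part of flink-ml, and the inner `Seq`s merely play the role of the parallel predict mappers.

```scala
object BroadcastSketch {
  // Stand-in for a distributed data set: each inner Seq plays the role of
  // the elements processed by one parallel mapper. Hypothetical, not Flink.
  type Partitions[T] = Seq[Seq[T]]

  // The model arrives as a collection that must hold exactly one element.
  // That single element is handed unchanged to every mapper.
  def predictAll[M, T, P](model: Seq[M], input: Partitions[T])(
      predict: (T, M) => P): Partitions[P] = {
    require(model.size == 1, "the model must be a single element")
    val broadcastModel = model.head
    input.map(partition => partition.map(value => predict(value, broadcastModel)))
  }
}
```

A model that is itself spread over many elements does not fit this shape, which is presumably why a data-set-level operation is the suggested alternative.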
> 
> Cheers,
> Till
> 
> On Mon, Jun 29, 2015 at 12:16 PM, Chiwan Park <chiwanpark@apache.org> wrote:
> 
>> Thank you Till.
>> 
>> I have another question. Can I use a DataSet object as the Model? In KNN,
>> we need the DataSet that was given to the fit operation.
>> 
>> But when I set the Model type parameter of PredictOperation to DataSet,
>> the return type of the getModel method becomes DataSet[DataSet], which
>> confuses me.
>> 
>> I would really appreciate any advice on this.
>> 
>> 
>> Regards,
>> Chiwan Park
>> 
>>> On Jun 29, 2015, at 4:43 PM, Till Rohrmann <trohrmann@apache.org> wrote:
>>> 
>>> Hi Chiwan,
>>> 
>>> when you use the single-element predict operation, you always have to
>>> implement the `getModel` method. There you have access to the resulting
>>> parameters and even to the instance to which the `PredictOperation`
>>> belongs. Within this `getModel` method you can initialize all the
>>> information you need for the `predict` operation.
>>> 
>>> You can take a look at the `StandardScalerTransformOperation` [1], where
>>> the mean and the std are set in the `getModel` method.
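
The idea of packing everything `predict` needs into the model can be sketched without any Flink dependency. All names below (`SingleElementPredict`, `Params`, `Knn`, `KnnModel`) are hypothetical stand-ins rather than the flink-ml types, and `getModel` returns the model directly here, whereas the thread indicates that the real method wraps it in a DataSet.

```scala
object GetModelSketch {
  // Hypothetical stand-in for a parameter map; not the flink-ml ParameterMap.
  type Params = Map[String, Int]

  // Simplified stand-in for a single-element predict operation.
  trait SingleElementPredict[Instance, Model, Testing, Prediction] {
    def getModel(instance: Instance, params: Params): Model
    def predict(value: Testing, model: Model): Prediction
  }

  // The fitted instance holds the training data (1-D points for brevity).
  final case class Knn(training: Seq[Double])

  // The model bundles k together with the data, so predict can see both.
  final case class KnnModel(k: Int, training: Seq[Double])

  object KnnPredict
      extends SingleElementPredict[Knn, KnnModel, Double, Seq[Double]] {
    // Read k from the parameters once, falling back to 1 if it is absent.
    def getModel(instance: Knn, params: Params): KnnModel =
      KnnModel(params.getOrElse("k", 1), instance.training)

    // Return the k training points closest to the queried value.
    def predict(value: Double, model: KnnModel): Seq[Double] =
      model.training.sortBy(t => math.abs(t - value)).take(model.k)
  }
}
```

With this shape, `predict` never needs direct access to the parameter map, because `k` travels inside the model.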
>>> 
>>> Cheers,
>>> Till
>>> 
>>> [1]
>>> https://github.com/apache/flink/blob/master/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StandardScaler.scala#L197
>>> 
>>> On Sun, Jun 28, 2015 at 1:49 PM, Chiwan Park <chiwanpark@apache.org> wrote:
>>> 
>>>> Hi, I'm implementing k-nearest-neighbors classification based on the
>>>> flink-ml structure.
>>>> 
>>>> In a recent commit (7a7a2940 [1]), the pipeline was restructured by
>>>> dividing the predict operation into a single-element case and a data
>>>> set case. In the data set case, the parameter map is given as a method
>>>> parameter, but in the single-element case there is no method to access
>>>> the parameter map.
>>>> 
>>>> But in k-nearest-neighbors classification, we need to know k in the
>>>> predict method to select the top k values.
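
To make the role of k concrete, here is a minimal plain-Scala sketch of the top-k selection step. It does not use Flink, and `Labeled`, `squaredDistance`, and `classify` are illustrative names chosen for this example.

```scala
object KnnSketch {
  // A labeled training point; name and shape are illustrative only.
  final case class Labeled(label: String, features: Vector[Double])

  // Squared Euclidean distance, which is sufficient for ranking neighbors.
  def squaredDistance(a: Vector[Double], b: Vector[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // Majority vote among the k training points closest to the query.
  // Ties between labels are not resolved in any particular order.
  def classify(training: Seq[Labeled], query: Vector[Double], k: Int): String = {
    require(k > 0 && training.nonEmpty, "need k > 0 and some training data")
    training
      .sortBy(p => squaredDistance(p.features, query))
      .take(k)
      .groupBy(_.label)
      .maxBy { case (_, members) => members.size }
      ._1
  }
}
```

The `take(k)` step is where the prediction depends on the parameter, so the same query can receive different labels for different values of k.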
>>>> 
>>>> How can I solve this problem?
>>>> 
>>>> Regards,
>>>> Chiwan Park
>>>> 
>>>> [1]
>>>> https://github.com/apache/flink/commit/7a7a294033ef99c596e59f670e2e4ae9262f5c5f