flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: flink loop
Date Sun, 08 Feb 2015 12:07:33 GMT
Hi,

you could apply a filter operation after the cross operation that discards
every combination whose items are not in ascending order. Of all the
orderings of the same items, only the sorted one then survives.
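
A minimal sketch of such a check in plain Java, assuming each combination is
represented as an int[] of item ids (the thread does not say which
representation is used). The class and method names are hypothetical, and the
nested loops only stand in for the cross operation on small in-memory arrays;
in a Flink job the predicate would be called from the body of the filter
function applied after the cross.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink.
final class AscendingFilter {

    // True only if every item is strictly greater than the one before it.
    // Strictness also rejects repeated items such as (1, 1, 2).
    static boolean isStrictlyAscending(int[] items) {
        for (int i = 1; i < items.length; i++) {
            if (items[i - 1] >= items[i]) {
                return false;
            }
        }
        return true;
    }

    // In-memory stand-in for "cross, then filter": builds every pair (a, b)
    // and keeps only the pairs that are in strictly ascending order.
    static List<int[]> crossAndFilter(int[] left, int[] right) {
        List<int[]> kept = new ArrayList<>();
        for (int a : left) {
            for (int b : right) {
                int[] candidate = {a, b};
                if (isStrictlyAscending(candidate)) {
                    kept.add(candidate);
                }
            }
        }
        return kept;
    }
}
```

With this check, (1, 2, 3, 4) is kept while (2, 1, 4, 3) is dropped, so no
HashSet is needed to detect the equivalent combinations.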

Cheers,

Till

On Sun, Feb 8, 2015 at 12:38 PM, tanguy racinet <tanracinet@gmail.com>
wrote:

> Hi,
>
> Thank you for your reply. It helped us solve the looping problems in a
> nicer way.
>
> We are struggling with some aspects of the cross function.
> Still trying to implement the Apriori algorithm, we need to create
> combinations of frequent itemSets.
> Our problem is that the crossing gives us duplicates. For instance,
> (1, 2, 3, 4) and (2, 1, 4, 3) are equivalent for us, so we are trying to
> find a way to remove that kind of duplicate from our DataSet.
>
> We already removed duplicates inside our combinations: (1, 1, 2) => (1, 2).
>
> We were thinking about using HashSet but they are not serializable and we
> cannot use them inside the workflow, but only inside functions.
>
> Can you think of any way to remove those duplicates?
>
> Thank you,
>
> <http://eitictlabs-rennes.fr/>
>
>
> *Racinet Tanguy*
>
> *EIT ICT Labs Master School Student*
> *Distributed Systems and Services*
>
> Tel : +33 6 63 20 89 16 / +49 176 3749 8854
> Mail : tanracinet@gmail.com
>
> On Thu, Feb 5, 2015 at 8:51 PM, Vasiliki Kalavri <
> vasilikikalavri@gmail.com> wrote:
>
>> Hi,
>>
>> I'm not familiar with the particular algorithm, but you can most probably
>> use one of the two iterate operators in Flink.
>>
>> You can read a description and see some examples in the documentation:
>>
>> http://flink.apache.org/docs/0.8/programming_guide.html#iteration-operators
>>
>> Let us know if you have any questions!
>>
>> Cheers,
>> V.
>>
>> On 5 February 2015 at 20:37, tanguy racinet <tanracinet@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We are trying to implement the Apriori algorithm with Flink for our
>>> data mining project.
>>> As we understand it, Flink can handle loops within the workflow.
>>> However, our knowledge is limited and we cannot find a nice way to do it.
>>>
>>> Here is the flow of my algorithm:
>>> GenerateCandidates ----> CalculateFrequentItemSet
>>> mapper             ----> reducer
>>>
>>> We would like to use the reducer result as the mapper's entry for a
>>> predefined number of times (loop x times).
>>>
>>> Is there any smart way to do that with Flink? Or should we just
>>> copy-paste the loop x times?
>>>
>>> Thank you,
>>>
>>
>>
>
