flink-user mailing list archives

From tanguy racinet <tanraci...@gmail.com>
Subject Re: flink loop
Date Sun, 08 Feb 2015 11:38:25 GMT
Hi,

Thank you for your reply. It helped us solve the looping problems in a nicer
way.

We are struggling with some aspects of the cross function.
Still trying to implement the Apriori algorithm, we need to create
combinations of frequent itemSets.
Our problem is that the crossing gives us duplicates. For instance, (1, 2,
3, 4) and (2, 1, 4, 3) are equivalent for us, so we are trying to find a way
to remove that kind of duplicate from our DataSet.

We already removed duplicates inside our combinations (1, 1, 2) => (1, 2).

We were thinking about using HashSet but they are not serializable and we
cannot use them inside the workflow, but only inside functions.

Can you think of any way to remove those duplicates?
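
To make the question concrete, here is a minimal sketch of one possible
approach: bring every combination into a canonical form (sorted, without
repeated items) so that equivalent combinations become equal and can then be
deduplicated. It is plain Java rather than Flink code, and it assumes item
sets are arrays of integers. The class and method names
(ItemSetCanonicalizer, canonicalize, distinctItemSets) are hypothetical and
only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ItemSetCanonicalizer {

    // Sort the items and drop repeats, so (2, 1, 4, 3) and (1, 2, 3, 4)
    // map to the same list, and (1, 1, 2) becomes (1, 2).
    public static List<Integer> canonicalize(int[] items) {
        TreeSet<Integer> sorted = new TreeSet<>();
        for (int item : items) {
            sorted.add(item);
        }
        return new ArrayList<>(sorted);
    }

    // Keep one representative per canonical form, in first-seen order.
    public static Set<List<Integer>> distinctItemSets(List<int[]> candidates) {
        Set<List<Integer>> unique = new LinkedHashSet<>();
        for (int[] candidate : candidates) {
            unique.add(canonicalize(candidate));
        }
        return unique;
    }

    public static void main(String[] args) {
        List<int[]> crossed = Arrays.asList(
                new int[] {1, 2, 3, 4},
                new int[] {2, 1, 4, 3},
                new int[] {1, 1, 2});
        // prints [[1, 2, 3, 4], [1, 2]]
        System.out.println(distinctItemSets(crossed));
    }
}
```

Inside a Flink job, the same canonicalization could presumably run in a map
function after the cross, followed by the DataSet distinct() transformation,
assuming the canonical form is emitted as a type Flink can use as a key (for
example a string or a tuple).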

Thank you,

<http://eitictlabs-rennes.fr/>


*Racinet Tanguy*

*EIT ICT Labs Master School Student*
*Distributed Systems and Services*

Tel : +33 6 63 20 89 16 / +49 176 3749 8854
Mail : tanracinet@gmail.com

On Thu, Feb 5, 2015 at 8:51 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com>
wrote:

> Hi,
>
> I'm not familiar with the particular algorithm, but you can most probably
> use one of the two iterate operators in Flink.
>
> You can read a description and see some examples in the documentation:
> http://flink.apache.org/docs/0.8/programming_guide.html#iteration-operators
>
> Let us know if you have any questions!
>
> Cheers,
> V.
>
> On 5 February 2015 at 20:37, tanguy racinet <tanracinet@gmail.com> wrote:
>
>> Hi,
>>
>> We are trying to develop the Apriori algorithm with Flink for our data
>> mining project.
>> In our understanding, Flink can handle loops within the workflow.
>> However, our knowledge is limited and we cannot find a nice way to do it.
>>
>> Here is the flow of my algorithm :
>> GenerateCandidates ----> CalculateFrequentItemSet
>> mapper                      ----> reducer
>>
>> We would like to use the reducer result as the mapper's entry for a
>> predefined number of times (loop x times).
>>
>> Is there any smart way to do that with Flink, or should we just copy and
>> paste the loop x times?
>>
>> Thank you,
>> <http://eitictlabs-rennes.fr/>
>>
>>
>> *Racinet Tanguy*
>>
>> *EIT ICT Labs Master School Student*
>> *Distributed Systems and Services*
>>
>> Tel : +33 6 63 20 89 16 / +49 176 3749 8854
>> Mail : tanracinet@gmail.com
>>
>
>
