mahout-dev mailing list archives

From Robin Anil <robin.a...@gmail.com>
Subject Re: factorization machines as new project
Date Thu, 11 Apr 2013 19:45:41 GMT
I would have folded them all as different feature ids in a single vector,
makes things a lot simpler and faster.
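
To illustrate the folding idea, here is a minimal sketch in plain Java. It
assumes each feature group is a sparse map from a local index to a value and
that the size of every group is known up front. The class name
FoldFeatureGroups, the method fold, and the user/item/time sizes are all
hypothetical and are not taken from the fm branch or from Mahout's API. Each
group is shifted by the combined size of the groups before it, so feature ids
never collide.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FoldFeatureGroups {

    /**
     * Folds several sparse feature groups into one sparse vector.
     * groups.get(g) maps a local feature index to its value and sizes[g]
     * is the number of features in group g (both hypothetical inputs).
     */
    public static Map<Integer, Double> fold(List<Map<Integer, Double>> groups, int[] sizes) {
        if (groups.size() != sizes.length) {
            throw new IllegalArgumentException("one size per group is required");
        }
        Map<Integer, Double> folded = new TreeMap<>();
        int offset = 0; // first global id of the current group
        for (int g = 0; g < groups.size(); g++) {
            for (Map.Entry<Integer, Double> e : groups.get(g).entrySet()) {
                int local = e.getKey();
                if (local < 0 || local >= sizes[g]) {
                    throw new IllegalArgumentException("index out of range in group " + g);
                }
                folded.put(offset + local, e.getValue());
            }
            offset += sizes[g];
        }
        return folded;
    }

    public static void main(String[] args) {
        // Made-up example: 3 users, 4 items, 1 time feature.
        Map<Integer, Double> user = Map.of(1, 1.0); // active user 1
        Map<Integer, Double> item = Map.of(2, 1.0); // active item 2
        Map<Integer, Double> time = Map.of(0, 5.0); // time in months
        Map<Integer, Double> x = fold(List.of(user, item, time), new int[] {3, 4, 1});
        System.out.println(x); // prints {1=1.0, 5=1.0, 7=5.0}
    }
}
```

With this layout the model needs only one weight vector and one factor matrix
indexed by the global id, rather than one per group.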

Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Thu, Apr 11, 2013 at 11:19 AM, Gokhan Capan <gkhncpn@gmail.com> wrote:

> Hi Robin,
>
> If you are asking why they are arrays, it is to save clients from
> concatenating multiple matrices to create the input.
>
> I am quoting from libFM paper<http://www.csie.ntu.edu.tw/~b97053/paper/Factorization%20Machines%20with%20libFM.pdf>:
> "For easier interpretation,
> the features are grouped into indicators for the active user (blue),
> active item (red), other movies rated
> by the same user (orange), the time in months (green), and the last movie
> rated (brown)."
>
> I thought a client would create multiple groups of matrices and could
> just pass them all to the algorithm.
>
> Then wModel holds the w parameters; it is still an array of vectors so
> that the indexing stays consistent, and vModel holds the V parameters.
>
> Was that what you were asking?
>
>
> On Thu, Apr 11, 2013 at 6:44 PM, Robin Anil <robin.anil@gmail.com> wrote:
>
>> Comments away. I was a bit confused by the use of Vector[] for w1 and
>> Matrix[] for inputs.
>>
>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>
>>
>> On Thu, Apr 11, 2013 at 10:00 AM, Gokhan Capan <gkhncpn@gmail.com> wrote:
>>
>>> Ted,
>>> Robin,
>>>
>>> Although I have not tested it on a dataset yet, I have recently been
>>> implementing Factorization Machines with SGD optimization.
>>>
>>> The initial implementation is at
>>> https://github.com/gcapan/mahout/tree/fm
>>>
>>> Would you guys consider taking a look so I can make it better and get
>>> it running?
>>>
>>>
>>>
>>> On Mon, Apr 1, 2013 at 8:45 PM, Nkechi Nnadi <nkechi.nnadi@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm a long-time lurker.  I would be interested in implementing these.  I
>>>> thought I would get my feet wet by contributing tutorials to the wiki,
>>>> since I have used Mahout for recommendation and clustering in my
>>>> dissertation.  I have never contributed code before and I would love to
>>>> start now.
>>>>
>>>> -Nkechi
>>>>
>>>>
>>>> On Sun, Mar 31, 2013 at 1:14 PM, Robin Anil <robin.anil@gmail.com>
>>>> wrote:
>>>>
>>>> > FMs work really well for a whole range of things. Having implemented
>>>> > them myself, I can extend my services as a reviewer if anyone is
>>>> > willing to start on it.
>>>> >
>>>> > Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>> >
>>>> >
>>>> > On Sun, Mar 31, 2013 at 2:18 AM, Ted Dunning <ted.dunning@gmail.com>
>>>> > wrote:
>>>> >
>>>> > > Relative to Dan's recent mention of SOM as a possible new project,
>>>> > > here are slides from KDD Cup 2012 in which Steffen Rendle describes
>>>> > > how he did using a very straightforward implementation of
>>>> > > Factorization Machines [1,2].
>>>> > >
>>>> > >
>>>> > > FMs are interesting in the context of Mahout because they can be
>>>> > > used in a wide variety of settings, including recommendation and
>>>> > > targeting, and because they have very good performance on a number
>>>> > > of tasks.
>>>> > >
>>>> > > I should mention that Robin was the one who first mentioned FMs to
>>>> > > me.
>>>> > >
>>>> > > The KDD 2012 competition [3] is of interest in any case because it
>>>> > > provides a large amount of realistic data for commercially important
>>>> > > problems.
>>>> > >
>>>> > > [1] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/RendleSlides.pdf
>>>> > >
>>>> > > [2] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/Rendle.pdf
>>>> > >
>>>> > > [3] http://www.kddcup2012.org/
>>>> > >
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Gokhan
>>>
>>
>>
>
>
> --
> Gokhan
>
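
For readers following the w and V parameters discussed in the quoted messages,
here is a minimal sketch of a second-order factorization machine prediction
over a single folded sparse vector. It uses the standard linear-time form of
the pairwise term, y = w0 + sum_i w_i x_i + 1/2 sum_f [(sum_i v_if x_i)^2 -
sum_i (v_if x_i)^2]. The class name FmPredict, the method predict, and all the
numbers are hypothetical; this is not the code in the fm branch, and it leaves
out SGD training entirely.

```java
public class FmPredict {

    /**
     * Second-order FM prediction for one sparse input.
     * w0: global bias; w[i]: linear weight of feature i;
     * v[i][f]: factor f of feature i; idx/val: the nonzero entries of x.
     */
    public static double predict(double w0, double[] w, double[][] v,
                                 int[] idx, double[] val) {
        double y = w0;
        // Linear part: sum_i w_i * x_i over the nonzero features.
        for (int j = 0; j < idx.length; j++) {
            y += w[idx[j]] * val[j];
        }
        // Pairwise part, computed per factor in time linear in the nonzeros.
        int k = v.length == 0 ? 0 : v[0].length;
        for (int f = 0; f < k; f++) {
            double sum = 0.0;   // sum_i v_if * x_i
            double sumSq = 0.0; // sum_i (v_if * x_i)^2
            for (int j = 0; j < idx.length; j++) {
                double t = v[idx[j]][f] * val[j];
                sum += t;
                sumSq += t * t;
            }
            y += 0.5 * (sum * sum - sumSq);
        }
        return y;
    }

    public static void main(String[] args) {
        // Made-up parameters: 3 features, 2 factors.
        double[] w = {0.1, 0.2, 0.3};
        double[][] v = {{1, 2}, {3, 4}, {5, 6}};
        // x has nonzeros x_0 = 1 and x_2 = 2.
        double y = predict(0.5, w, v, new int[] {0, 2}, new double[] {1, 2});
        // bias 0.5 + linear 0.7 + pairwise <v_0, v_2> * x_0 * x_2 = 34
        System.out.println(y); // approximately 35.2
    }
}
```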
