From: Robin Anil
Date: Thu, 11 Apr 2013 14:45:41 -0500
Subject: Re: factorization machines as new project
To: Gokhan Capan
Cc: Mahout-Dev, Ted Dunning

I would have folded them all as different feature ids in a single vector;
that makes things a lot simpler and faster.

Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.

On Thu, Apr 11, 2013 at 11:19 AM, Gokhan Capan wrote:

> Hi Robin,
>
> If you are asking why they are arrays, it is to save clients from
> concatenating multiple matrices to create the input.
>
> I am quoting from the libFM paper:
> "For easier interpretation, the features are grouped into indicators for
> the active user (blue), active item (red), other movies rated by the same
> user (orange), the time in months (green), and the last movie rated
> (brown)."
>
> I thought a client would create multiple groups of matrices and could
> just pass them all to the algorithm.
>
> Then wModel holds the w parameters (it is still an array of vectors, to
> keep the indexing consistent), and vModel holds the V parameters.
>
> Was that what you were asking?
>
> On Thu, Apr 11, 2013 at 6:44 PM, Robin Anil wrote:
>
>> Comments away. I was a bit confused by the use of Vector[] for w1 and
>> Matrix[] for inputs.
>>
>> Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>
>> On Thu, Apr 11, 2013 at 10:00 AM, Gokhan Capan wrote:
>>
>>> Ted,
>>> Robin,
>>>
>>> Although I have not tested it on a dataset yet, I have recently been
>>> implementing Factorization Machines with SGD optimization.
>>>
>>> The initial implementation is at
>>> https://github.com/gcapan/mahout/tree/fm
>>>
>>> Would you guys consider taking a look so I can improve it and get it
>>> running?
>>>
>>> On Mon, Apr 1, 2013 at 8:45 PM, Nkechi Nnadi wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm a long-time lurker. I would be interested in implementing these.
>>>> I thought I would get my feet wet by contributing tutorials to the
>>>> wiki, since I have used Mahout for recommendation and clustering in
>>>> my dissertation. I have never contributed code before and I would
>>>> love to start now.
>>>>
>>>> -Nkechi
>>>>
>>>> On Sun, Mar 31, 2013 at 1:14 PM, Robin Anil wrote:
>>>>
>>>> > FMs work really well for a whole range of things. Having
>>>> > implemented them myself, I can extend my services as a reviewer if
>>>> > anyone is willing to start on it.
>>>> >
>>>> > Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
>>>> >
>>>> > On Sun, Mar 31, 2013 at 2:18 AM, Ted Dunning wrote:
>>>> >
>>>> > > Relative to Dan's recent mention of SOM as a possible new project,
>>>> > > here are slides from KDD Cup 2012 in which Steffen Rendle
>>>> > > describes how he did using a very straightforward implementation
>>>> > > of Factorization Machines [1,2].
>>>> > >
>>>> > > FMs are interesting in the context of Mahout because they can be
>>>> > > used in a wide variety of settings, including recommendation and
>>>> > > targeting, and because they have very good performance on a
>>>> > > number of tasks.
>>>> > >
>>>> > > I should mention that Robin was the one who first mentioned FMs
>>>> > > to me.
>>>> > >
>>>> > > The KDD 2012 competition [3] is of interest in any case because
>>>> > > it provides a large amount of realistic data for commercially
>>>> > > important problems.
>>>> > >
>>>> > > [1] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/RendleSlides.pdf
>>>> > >
>>>> > > [2] https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/Rendle.pdf
>>>> > >
>>>> > > [3] http://www.kddcup2012.org/
>>>
>>> --
>>> Gokhan
>
> --
> Gokhan
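
For readers of the archive, here is a minimal sketch of the "fold them all as different feature ids in a single vector" idea from the top of this message. It is an illustration only, not code from the linked branch, from Mahout, or from libFM: the function name `fold_groups`, the `(size, {local_index: value})` group representation, and the example group sizes are all hypothetical. The assumption is that each feature group (user, item, time, ...) has a known size, so a group's local index can be shifted by the combined size of the groups before it.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a group is (group_size, {local_index: value}).
Group = Tuple[int, Dict[int, float]]


def fold_groups(groups: List[Group]) -> Tuple[Dict[int, float], int]:
    """Fold several sparse feature groups into one sparse vector.

    Local index i of a group maps to global feature id offset + i, where
    offset is the total size of all groups that come before it.
    Returns the folded sparse vector and its total dimensionality.
    """
    folded: Dict[int, float] = {}
    offset = 0
    for size, features in groups:
        for local_index, value in features.items():
            if not 0 <= local_index < size:
                raise ValueError(
                    f"index {local_index} out of range for group of size {size}"
                )
            folded[offset + local_index] = value
        offset += size
    return folded, offset


# Example with made-up sizes: 3 users, 4 items, 1 numeric "time" feature.
user = (3, {1: 1.0})   # active user is user 1
item = (4, {2: 1.0})   # active item is item 2
time = (1, {0: 5.0})   # time in months
folded, total = fold_groups([user, item, time])
# folded == {1: 1.0, 5: 1.0, 7: 5.0}, total == 8
```

With the input folded this way, the model would need only one weight vector and one factor matrix indexed by global feature id, which is presumably the simplification being suggested over per-group Vector[] and Matrix[] arrays.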