Subject: Re: Tweaking ALS models to filter out "highly related" items when an item has been purchased
From: Dominik Hübner
Date: Thu, 5 Sep 2013 10:15:26 +0200
To: user@mahout.apache.org

Just a quick assumption; maybe I have not thought this through enough:

1. Users probably tend to compare products => similar VIEWS
2. Users might also tend to PURCHASE accessory products, like the laptop bag you mentioned

Maybe you could filter out products whose similarity is computed from product views, but leave those that are similar based on purchases in your recommendation set?

Nevertheless, I guess this will depend strongly on the domain the data comes from.

On Sep 5, 2013, at 10:07 AM, Nick Pentreath wrote:

> Hi all
>
> Say I have a set of ecommerce data (views, purchases etc). I've built my
> model using implicit feedback ALS. Now, I want to add a little bit of
> "smart filtering".
>
> Filtering based on not recommending something that has been purchased is
> straightforward, but I'd like to also filter so as not to recommend "highly
> similar" items to someone who has purchased an item.
>
> In other words, if someone has just purchased a laptop, then I'd like to
> not recommend other laptops. Ideally while still recommending "related"
> items such as laptop bags, mouse etc etc. (this is just an example).
>
> Now, I could filter based on metadata tags like "category", but assuming I
> don't always have that data, then simplistically I have the option of
> filtering out products based on those that have high cosine similarity to
> the purchased products. However, this risks filtering out "good" similar
> products (like the laptop bags) as well as the "bad" similar products.
>
> I'm experimenting with building a second variant of the model that
> effectively downweights "views" to near zero, hence leaving something sort
> of like a "purchased together" model variant. Then recommendations can be
> made using this model when a user purchases an item (or perhaps a re-scorer
> that is a weighted variant of model A and model B but that tends to weight
> model B - the purchased together model - higher)
>
> Are there other mechanisms to tweak the ALS model such that it tends
> towards recommending "related products" (but not "highly similar of the
> exact same narrow product type")?
>
> Any other ideas about how best to go about this?
>
> Many thanks
> Nick
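
For what it's worth, the view-versus-purchase filter suggested at the top of this reply could be sketched roughly as below. This is only an illustration, not Mahout code: the function names, the two thresholds, and the toy item vectors are all made up, and it assumes you already have one set of item factors learned from views and another learned from purchases.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_candidates(purchased, candidates, view_vecs, purchase_vecs,
                      view_threshold=0.8, purchase_threshold=0.8):
    """Drop candidates that look like substitutes for the purchased item.

    Hypothetical rule: an item is treated as a substitute when it is highly
    similar in the view-based factors but NOT in the purchase-based factors.
    Both thresholds are assumptions and would need tuning per domain.
    """
    kept = []
    for item in candidates:
        if item == purchased:
            continue  # never recommend the item that was just bought
        view_sim = cosine(view_vecs[purchased], view_vecs[item])
        purchase_sim = cosine(purchase_vecs[purchased], purchase_vecs[item])
        if view_sim >= view_threshold and purchase_sim < purchase_threshold:
            continue  # compared with the purchase, but not bought together
        kept.append(item)
    return kept


# Toy item factors, invented purely for illustration.
view_vecs = {
    "laptop_a": [1.0, 0.0],
    "laptop_b": [0.9, 0.1],
    "bag": [0.1, 1.0],
    "dock": [1.0, 0.05],
}
purchase_vecs = {
    "laptop_a": [1.0, 0.0],
    "laptop_b": [0.0, 1.0],
    "bag": [0.9, 0.2],
    "dock": [1.0, 0.1],
}

recommended = filter_candidates(
    "laptop_a", ["laptop_a", "laptop_b", "bag", "dock"], view_vecs, purchase_vecs
)
print(recommended)
```

In this toy data the second laptop is dropped because it is close to the purchased laptop only in view space, while the bag and the dock survive. Whether real data separates this cleanly will depend on the domain, as mentioned above.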