From: Manuel Blechschmidt
To: user@mahout.apache.org
Reply-To: user@mahout.apache.org
Subject: Re: Purchase prediction
Date: Tue, 3 Jan 2012 21:26:31 +0100

Hi Mike,

making predictions in real time is actually a very tough research task.

I would expect that you can tune hidden Markov models to work in semi-real time.

Furthermore, once you have a trained model, you can use that model in real time. The big question is how often you can and should rebuild your model. A further question is how much computation time you want to spend on each customer.

Perhaps the KDD Cup 2000 is valuable:
http://www.kdd.org/kddcup/index.php?section=2000&method=result

Tasks:
Given a set of page views, will the visitor view another page on the site or will the visitor leave?
Given a set of page views, which product brand will the visitor view in the remainder of the session?
...

Agarwal et al. described a method for semi-real-time recommendation of news stories:
Fast Online Learning through Offline Initialization for Time-sensitive Recommendation
http://users.cs.fiu.edu/~lzhen001/activities/KDD_USB_key_2010/docs/p703.pdf

Hope that helps. If you have any results, I would be interested in them.

/Manuel

On 03.01.2012, at 20:59, Mike Spreitzer wrote:

> I suspect the original request was concerned with --- and I, on my own, am
> concerned with --- a scenario in which it is desired to be able to quickly
> make predictions based on very recent data. Thus, approaches that
> occasionally take a lot of time to build a model are non-solutions. Are
> there solutions for my scenario in what you mentioned, or elsewhere?
>
> Thanks,
> Mike
>
> From: Manuel Blechschmidt
> To: user@mahout.apache.org
> Date: 01/03/2012 02:40 PM
> Subject: Re: Purchase prediction
>
> Hello Nishant,
> you can use the recommender approaches with the boolean preference model.
>
> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark your
> results.
> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
>
> Further, you could also use the hidden Markov model to predict
> probabilities of next purchases.
> http://isabel-drost.de/hadoop/slides/HMM.pdf
> https://issues.apache.org/jira/browse/MAHOUT-396
>
> There are some papers describing how to combine some of these methods:
>
> Rendle et al. presented a paper using a combination of both:
> Factorizing Personalized Markov Chains for Next-Basket Recommendation
> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
>
> In my opinion some seasonal models could also help to better predict next
> purchases.
>
> There is currently a resolved enhancement request for 0.6 that makes
> evaluation for a use case like yours better:
> https://issues.apache.org/jira/browse/MAHOUT-906
>
> If you have further questions, feel free to ask.
>
> /Manuel
>
> On 03.01.2012, at 19:02, Nishant Chandra wrote:
>
>> Hi,
>>
>> I am trying to predict shopper purchase and non-purchase intention in
>> an e-commerce context. I am more interested in finding the latter.
>> A near-real-time approach would be great. So given a sequence of pages
>> a shopper views, I would like the algorithm to predict the intention.
>>
>> Are there any algorithms in Mahout or otherwise that can help?
>>
>> Thanks,
>> Nishant
>
> --
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B

--
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B
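
P.S. The "train offline, score in real time" idea from this thread can be sketched roughly as follows. This is not Mahout code and not a hidden Markov model: as a deliberately simpler stand-in it uses two plain first-order Markov chains over page types, one fitted on sessions that ended in a purchase and one on sessions that did not, and scores a running session by their log-likelihood ratio. The page types, the toy sessions, the smoothing constant and all function names are assumptions made up for illustration.

```python
import math
from collections import defaultdict

START = "<start>"  # artificial state marking the beginning of a session


def train_chain(sessions, states, alpha=1.0):
    """Estimate first-order transition log-probabilities with add-alpha smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for session in sessions:
        path = [START] + list(session)
        for prev, cur in zip(path, path[1:]):
            counts[prev][cur] += 1.0
    chain = {}
    for prev in [START] + list(states):
        total = sum(counts[prev].values()) + alpha * len(states)
        chain[prev] = {
            cur: math.log((counts[prev][cur] + alpha) / total) for cur in states
        }
    return chain


def log_likelihood(chain, session):
    """Log-probability of a (possibly partial) session under one chain."""
    path = [START] + list(session)
    return sum(chain[prev][cur] for prev, cur in zip(path, path[1:]))


def purchase_score(buyer_chain, leaver_chain, session):
    """Positive: session resembles buyers; negative: resembles non-buyers."""
    return log_likelihood(buyer_chain, session) - log_likelihood(leaver_chain, session)


# Hypothetical page types and toy training sessions (not real data).
STATES = ["home", "search", "product", "cart"]
buyer_sessions = [
    ["home", "product", "cart"],
    ["search", "product", "cart"],
    ["home", "search", "product", "cart"],
]
leaver_sessions = [
    ["home", "search", "search"],
    ["home", "product", "home"],
    ["search", "search"],
]

# Offline step: rebuild these as often as you can afford.
buyer_chain = train_chain(buyer_sessions, STATES)
leaver_chain = train_chain(leaver_sessions, STATES)

# Online step: one dictionary lookup per page view.
print(purchase_score(buyer_chain, leaver_chain, ["home", "product", "cart"]))
print(purchase_score(buyer_chain, leaver_chain, ["home", "search", "search"]))
```

Scoring costs one lookup per page view, so the per-customer computation stays small, while the expensive counting can be rerun in the background. A page type that is missing from STATES raises a KeyError in this sketch, so unseen pages would need an explicit fallback state.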