Date: Mon, 28 Apr 2014 01:30:41 +0300
Subject: Understanding LogLikelihood Similarity
From: Mario Levitin
To: user@mahout.apache.org

Hi,

I've used LogLikelihood Similarity in user-based nearest-neighbor collaborative filtering, and it has given good results (better than the others). I have read the blog post by Ted Dunning (http://tdunning.blogspot.com.tr/2008/03/surprise-and-coincidence.html) and have also looked at the implementation in Mahout. However, I still do not understand why this similarity metric works.

I'm trying to give it a probabilistic interpretation in order to understand the logic behind it. Any probabilistic interpretation should define random variables, events, etc. However, my attempts in this respect have been unsuccessful.

Any help will be appreciated.

Thanks
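
For reference, here is a minimal sketch of the quantity in question. It is not Mahout's code. It assumes the usual 2x2 contingency-table setup: for each item, X says whether user A has a preference for it and Y says whether user B does. k11 counts items both users have, k12 and k21 count items only one of them has, and k22 counts items neither has. The log-likelihood ratio is then the G^2 statistic for testing whether X and Y are independent, written in the entropy form used in the blog post. The function names are illustrative, and the final mapping 1 - 1/(1 + LLR) into [0, 1) is an assumption about how the raw score gets turned into a similarity.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of raw counts:
    # N*log(N) - sum(k*log(k)), where N = sum of the counts.
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # G^2 statistic for a 2x2 table: compares the observed cell counts
    # with what independence of rows and columns would predict.
    row_entropy = entropy(k11 + k12, k21 + k22)
    col_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb tiny negative values from round-off.
    return max(0.0, 2.0 * (row_entropy + col_entropy - matrix_entropy))


def llr_similarity(k11, k12, k21, k22):
    # Assumed squashing of the unbounded LLR score into [0, 1).
    llr = log_likelihood_ratio(k11, k12, k21, k22)
    return 1.0 - 1.0 / (1.0 + llr)


if __name__ == "__main__":
    # Counts that match independence exactly give a score near 0.
    print(log_likelihood_ratio(10, 20, 30, 60))
    # Perfectly overlapping preferences give a large score.
    print(log_likelihood_ratio(10, 0, 0, 10))
```

Under this reading, a large score means the observed overlap between the two users would be surprising if their preferences were independent, and a score near zero means the overlap is about what chance alone would produce.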