From: Pat Ferrel
To: Marius Rabenarivo
Cc: actionml-user, user@predictionio.incubator.apache.org
Subject: Re: Use of latent informations associated to items with Mahout's SimilarityAnalysis.cooccurrences
Date: Sat, 3 Jun 2017 21:11:24 -0700

By purchasing an item with a tag that you have given it, users are displaying a preference for that tag.


On Jun 3, 2017, at 12:36 PM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

So the tag here is assumed to be a tag given by the user to an item?

I was thinking that it was some kind of tag we give to the item by some means (classification, LDA, etc.)

2017-06-03 21:14 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
A = history of all purchases (in the e-com case)
B = history of all tag preferences

r = [A’A]h_a + [A’B]h_b

The part in the slides about content-based recs is not needed here because you have captured them as user preferences.
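
To make the shapes in the formula above concrete, here is a minimal sketch in plain Python. The toy data and the helper names (transpose, matmul, matvec) are hypothetical, and raw cooccurrence counts are used only as an assumption for illustration; as I understand it, CCO keeps only the LLR-significant entries rather than raw counts.

```python
# Toy data (hypothetical): 3 users, 2 items, 3 tags.
# A[u][i] = 1 if user u purchased item i          -> |U| x |I|
# B[u][t] = 1 if user u showed a pref for tag t   -> |U| x |T|
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 0, 0],
     [1, 1, 0],
     [0, 1, 1]]


def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Multiply two list-of-rows matrices."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]


def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


AtA = matmul(transpose(A), A)  # |I| x |I|: purchase vs. purchase
AtB = matmul(transpose(A), B)  # |I| x |T|: purchase vs. tag preference

h_a = [1, 0]     # one user's purchase history, length |I|
h_b = [0, 1, 0]  # the same user's tag-preference history, length |T|

# Both terms are vectors of length |I|, so they can be added.
r = [x + y for x, y in zip(matvec(AtA, h_a), matvec(AtB, h_b))]
print(r)
```

Because A’ is on the left of both products, each term scores items, even though B itself has one column per tag.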


On Jun 2, 2017, at 7:22 PM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

Please correct side to size in my previous e-mail

2017-06-03 6:14 GMT+04:00 Marius Rabenarivo <mariusrabenarivo@gmail.com>:
What will be the size of the matrix if we send an event like tag-pref?
We will get a |U|x|T| matrix I think (where T is the set of all tags).

So [AtA] will be a |T| x |T| matrix and we will do a dot product with the user history hT to get recommendations, right?

I was assuming that A should be of side |U| x |I| where I is the set of all items as it should be added to other terms of the whole enchilada formula afterwards.

Thank you for your guidance Pat.

2017-06-02 21:35 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
Please refer to the documents. The “event” is the name of the type of event or indicator of preference; it implies the type of the targetEntityId. So a “tag-pref” event would be accompanied by a targetEntityId = tag-id. This is separate from attaching “tag” properties to items with the $set event for use with filter and boost rules. One looks at the data as a possible preference indicator and the other is used to restrict results. This is why we usually name events so they sound like a user preference of some type, whereas item property values are simply item attributes, intrinsic to the items and independent of an individual user.

The event can have any name that makes sense to you.
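
The difference between a preference event and an item property can be sketched as two payloads, written here as plain Python dicts. The field names follow the PredictionIO event format as I understand it; the IDs (u-1, tag-scifi, item-42), the "tags" property name, and the targetEntityType value are assumptions for illustration.

```python
import json

# A user preference indicator: the event name "tag-pref" implies that
# targetEntityId holds a tag id. All IDs here are hypothetical.
tag_pref_event = {
    "event": "tag-pref",
    "entityType": "user",
    "entityId": "u-1",
    "targetEntityType": "item",  # assumed value, check the UR docs
    "targetEntityId": "tag-scifi",
}

# An item attribute: set once on the item, independent of any user,
# and usable in filter and boost rules. It has no target entity.
item_set_event = {
    "event": "$set",
    "entityType": "item",
    "entityId": "item-42",
    "properties": {"tags": ["scifi", "space"]},
}

print(json.dumps(tag_pref_event, indent=2))
print(json.dumps(item_set_event, indent=2))
```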


On Jun 2, 2017, at 9:19 AM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

so, the event field should be the token and targetEntityId the item ID, right?

2017-06-02 20:07 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
Yes, each is analyzed separately as a separate event. If you are using REST you can send up to 50 events in a single array. Some SDKs may support this too.
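
A minimal sketch of respecting the 50-event limit: split the events into arrays of at most 50 before sending. The helper name chunk_events and the sample events are hypothetical, and the HTTP call itself is left out because the thread does not give the endpoint.

```python
def chunk_events(events, max_batch=50):
    """Split a list of events into lists of at most max_batch items."""
    return [events[i:i + max_batch] for i in range(0, len(events), max_batch)]


# 120 hypothetical tag-pref events for one user.
events = [
    {
        "event": "tag-pref",
        "entityType": "user",
        "entityId": "u-1",
        "targetEntityType": "item",  # assumed value
        "targetEntityId": f"tag-{n}",
    }
    for n in range(120)
]

batches = chunk_events(events)
# Each batch would be serialized as one JSON array and sent in one request.
print([len(b) for b in batches])
```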


On Jun 2, 2017, at 8:56 AM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

So I have to send an event like category-preference for each tag associated to an item, right?

entityId: user-id
event: category-preference
targetEntityId: tag/token

2017-06-02 19:47 GMT+04:00 Pat Ferrel <pat@occamsmachete.com>:
When a user expresses a preference for a tag, word or term, as in search or even in content like descriptions, these can be considered secondary events. The most useful are tags and search terms in our experience. Content can be used, but each term/token needs to be sent as a separate preference. Search phrases can be used whole, though again turning them into tokens may be better.
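
Turning a search phrase into one preference event per token could look like the sketch below. The event name "search-pref", the function name, the tokenization rule (lowercase alphanumeric runs), and the choice to drop repeated tokens are all assumptions, not something the UR prescribes.

```python
import re


def search_pref_events(user_id, phrase, event_name="search-pref"):
    """Build one preference event per distinct token in a search phrase."""
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    unique = list(dict.fromkeys(tokens))  # keep order, drop repeats
    return [
        {
            "event": event_name,
            "entityType": "user",
            "entityId": user_id,
            "targetEntityType": "item",  # assumed value
            "targetEntityId": token,
        }
        for token in unique
    ]


events = search_pref_events("u-1", "Red running shoes, red laces")
print([e["targetEntityId"] for e in events])
```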

Please look through the docs here: http://actionml.com/docs/ur or the slide deck here: https://www.slideshare.net/pferrel/unified-recommender-39986309

The major innovation of CCO, the algorithm behind the UR, is the use of these cross-domain indicators. They are not guaranteed to predict conversions, but the CCO algo tests them and weights them low if they do not. So we tend to test the strength of prediction of the entire category of indicator and drop it if weak, or set a minLLR threshold and filter weak individual indicators out.

Technically these are not called latent; that term has another meaning in Machine Learning having to do with Latent Factor Analysis.
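
As a rough illustration of filtering by an LLR threshold, the sketch below computes the log-likelihood ratio of a 2x2 cooccurrence table and drops indicators that score below a cutoff. The counts, the indicator names, and the cutoff of 5.0 are hypothetical; the formula is the standard entropy-based G-squared form, which I believe matches what Mahout uses, but treat that as an assumption.

```python
import math


def x_log_x(x):
    """Return x * ln(x), with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """LLR of a 2x2 table.

    k11: users with both the primary event and the indicator
    k12, k21: users with only one of the two
    k22: users with neither
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def filter_indicators(scored, min_llr):
    """Keep only indicators whose LLR score reaches min_llr."""
    return {name: s for name, s in scored.items() if s >= min_llr}


scored = {
    "tag-scifi": log_likelihood_ratio(100, 0, 0, 100),  # strongly associated
    "tag-misc": log_likelihood_ratio(10, 10, 10, 10),   # independent
}
kept = filter_indicators(scored, min_llr=5.0)
print(sorted(kept))
```

An indicator that cooccurs with purchases no more often than chance scores near zero and is filtered out, while a strongly associated one is kept.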


On Jun 1, 2017, at 11:26 PM, Marius Rabenarivo <mariusrabenarivo@gmail.com> wrote:

Hello everyone!

Do you have an idea on how to use latent information associated with items, like tags or word vector embeddings, in Mahout's SimilarityAnalysis.cooccurrences?

Regards,

Marius

-- 
You received this message because you are subscribed to the Google Groups "actionml-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to actionml-user+unsubscribe@googlegroups.com.
To post to this group, send email to actionml-user@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/CAC-ATVEO_YON-5E95iPJjBR-FUgEv8TQsOA0rtD-xg0u-tNA_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.




