predictionio-user mailing list archives

From Dennis Honders <dennishond...@gmail.com>
Subject Re: Similar product template
Date Tue, 18 Apr 2017 08:25:57 GMT
Hello Pat,

First of all, thanks a lot for the great explanation and the link to the
PowerPoint. It has already helped me a lot in understanding the
algorithms behind the templates. I also have some new questions, based
on the email and the PowerPoint.


   1. I currently understand matrix factorization as finding latent
   factors that describe hidden relations between users and items. Is
   this correct? (I have tried to make this concrete in the sketch after
   this list.)
   2. And for finding these hidden latent factors, different algorithms
   exist, like cooccurrence, ALS and Correlated Cross-Occurrence. Is
   this correct?
   3. In the PowerPoint, on the ALS slide: 'U' describes dimensionally
   reduced users by “features”. What are the features here? "Features
   are projection parameters into a space that is optimized to reduce an
   error function" is something I don't exactly understand.
   4. I also watched your video about the cooccurrence algorithm
   (https://www.youtube.com/watch?v=LWAY_XeoQoc). From the description
   of ALS in the email, I don't see the difference between ALS and the
   cooccurrence algorithm as explained in the YouTube video.
   5. Could Correlated Cross-Occurrence be seen as an expansion of the
   cooccurrence algorithm that makes it multi-domain and multi-modal?
   6. From the email: "It does give good results for the top ranked
   though when you have lots of “conversions” per user on average
   because ALS can only use conversions as input; in other words it can
   use only one kind of behavior data." For confirmation: behavior data
   is data like buys, views, etc.?
   7. From the email: "It does this for all users and so finds which of
   the indicators most often lead to conversion." What do you mean by
   conversion (I also saw it in the PowerPoint)?
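
To make question 1 concrete, here is how I currently picture matrix
factorization (a toy numpy sketch I put together with a made-up matrix;
as far as I understand, real ALS alternates least-squares solves for U
and V instead of this gradient loop):

    import numpy as np

    # toy user-x-item conversion matrix R (3 users, 4 items); 1 = converted
    R = np.array([[1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1]], dtype=float)

    rank = 2  # number of latent factors ("features")

    rng = np.random.default_rng(42)
    U = rng.random((3, rank))   # users x features
    V = rng.random((rank, 4))   # features x items

    # crude gradient descent so that U @ V approximates R
    for _ in range(2000):
        err = R - U @ V         # reconstruction error
        U += 0.01 * err @ V.T
        V += 0.01 * U.T @ err

    # each user now has a dense vector of latent "features" that have no
    # direct meaning like "buy" or "view"; they are learned parameters
    print(np.round(U @ V, 2))   # dense approximation of R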

Greetings,

Dennis

2017-04-14 15:18 GMT+02:00 Vaghawan Ojha <vaghawan781@gmail.com>:

> Sorry, the previous email was sent accidentally before I had finished.
> It would be really helpful for me if you could describe in which cases
> the multi-modal data is being used.
>
> On Fri, Apr 14, 2017 at 7:01 PM, Vaghawan Ojha <vaghawan781@gmail.com>
> wrote:
>
>> Hi Pat,
>>
>> This is a really great explanation. I had tried ALS myself before
>> CCO, but in my case CCO seems better. You had a nice presentation,
>> but I was quite confused regarding multi-modal recommendation.
>>
>> In what cases does the UR make use of multi-modal data? Say I have a
>> location preference for every user event, and a category preference
>> as well. Let's say I trained the model and queried with the
>> preference parameters; in that case, is it using a separate modality
>> for each preference?
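>>
>> For concreteness, by "queried with the preference parameter" I mean a
>> UR query something like this (a Python sketch; the field names and
>> values are made up, and the exact shape is from my memory of the UR
>> docs, so it may be off):
>>
>> query = {
>>     "user": "u-123",
>>     "fields": [
>>         # bias > 1 boosts matching items; -1 would filter on them
>>         {"name": "category", "values": ["phones"], "bias": 2.0},
>>         {"name": "location", "values": ["kathmandu"], "bias": 1.5},
>>     ],
>> }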
>>
>> If you could describe a bit about this, it would be reall
>>
>> On Thu, Apr 13, 2017 at 9:15 PM, Pat Ferrel <pat@occamsmachete.com>
>> wrote:
>>
>>> I’m surprised that ALS seemed clear, because it is based on a
>>> complicated matrix factorization algorithm that transforms the user
>>> vectors into a smaller-dimensional space composed of “important”
>>> features. These are not interactions with items like “buys”; they
>>> can only be described as defining a new feature space. The
>>> factorized matrices transform in and out of that space, and are
>>> approximations of users x features and features x items.
>>>
>>> The user’s history is transformed into the feature space, which will
>>> be dense, in other words indicating some preference for all
>>> features. When this dense user vector is transformed back into item
>>> space, the approximate nature of ALS will give some preference value
>>> for every item. At this point the items can be ranked by score and
>>> the top few returned. This is clearly wrong, since a user will never
>>> have a preference for all items and would never purchase or convert
>>> on a large number of them no matter what the circumstances. It does
>>> give good results for the top ranked though when you have lots of
>>> “conversions” per user on average because ALS can only use
>>> conversions as input; in other words it can use only one kind of
>>> behavior data.
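>>>
>>> A minimal numpy illustration of that query-time step (made-up rank-2
>>> factors, not MLlib’s actual code):
>>>
>>> import numpy as np
>>>
>>> # pretend ALS already produced these factor matrices
>>> U = np.array([[0.9, 0.1],
>>>               [0.8, 0.6],
>>>               [0.1, 0.9]])            # users x features
>>> V = np.array([[0.9, 0.7, 0.2, 0.1],
>>>               [0.1, 0.2, 0.8, 0.9]])  # features x items
>>>
>>> scores = U[0] @ V                   # dense: EVERY item gets a score
>>> top = np.argsort(scores)[::-1][:2]  # rank by score, keep top few
>>> print(scores, top)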
>>>
>>> The CCO (Correlated Cross-Occurrence) algorithm from Mahout that is
>>> behind the Universal Recommender is multi-domain and multi-modal, in
>>> that it takes interactions of the user from the many actions they
>>> perform and even contextual data like profile info or location. It
>>> takes all of these “indicators”, a name for these interactions or
>>> other user info, and compares them with the user’s conversions. It
>>> does this for all users and so finds which of the indicators most
>>> often lead to conversion. These highly correlated indicators are
>>> then associated with items as properties. When a user recommendation
>>> is needed, we see which items have behavioral indicators most
>>> similar to the user's history. This tells us that the user probably
>>> has an affinity for the item, so we can predict a preference for
>>> these items.
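>>>
>>> The test behind “highly correlated” is Dunning’s log-likelihood
>>> ratio. A toy sketch of the idea (my own code, not Mahout’s source):
>>>
>>> from math import log
>>>
>>> def xlogx(x):
>>>     return 0.0 if x == 0 else x * log(x)
>>>
>>> def llr(k11, k12, k21, k22):
>>>     # Dunning's log-likelihood ratio for a 2x2 contingency table
>>>     n = k11 + k12 + k21 + k22
>>>     return 2.0 * (xlogx(n)
>>>                   - xlogx(k11 + k12) - xlogx(k21 + k22)  # row sums
>>>                   - xlogx(k11 + k21) - xlogx(k12 + k22)  # col sums
>>>                   + xlogx(k11) + xlogx(k12)
>>>                   + xlogx(k21) + xlogx(k22))
>>>
>>> # k11: users who viewed item A AND bought item B
>>> # k12: viewed A but did not buy B; k21: bought B, never viewed A
>>> # k22: users who did neither
>>> print(llr(20, 5, 10, 965))  # large: "viewed A" indicates "buys B"
>>> print(llr(2, 23, 28, 947))  # small: cooccurrence is likely chance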
>>>
>>> The differences:
>>> 1) ALS can ingest only one type of behavior. This is not bad, but it
>>> is also not very flexible, and it requires a good number of these
>>> interactions per user.
>>> 2) Cross-behavioral recommendations cannot be made with ALS, since
>>> no cross-behavioral data is seen by it. This in turn means that
>>> users with few or no conversions will not get recommendations. The
>>> Universal Recommender can make recommendations to users with no
>>> conversions if they have other behavior to draw from, so it is
>>> generally said to handle cool-start for users better. Another way to
>>> say this is that “cold-start” for ALS is only “cool-start” for CCO
>>> (in the UR). The same goes for item-based recommendations.
>>> 3) CCO can also use content directly for similar-item
>>> recommendations, which helps solve the item “cold-start” problem.
>>> ALS cannot.
>>> 4) CCO is more like a landscape of predictive AI algorithms, using
>>> all we know about a user from multiple domains (conversions, page
>>> views, search terms, category preferences, tag preferences, brand
>>> preferences, location, device used, etc.) to make predictions in
>>> some specific domain. It can also work with conversions alone.
>>> 5) Doing queries with ALS in MLlib requires that the factorized
>>> matrices be in memory. They are much smaller than the input, but
>>> this means running Spark to make queries. That makes it rather
>>> heavyweight for queries and makes scaling a bit of a problem and
>>> fairly complicated (too much to explain here). CCO, on the other
>>> hand, uses Spark only to create the indicators model, which it puts
>>> in Elasticsearch. Elasticsearch finds the top-ranked items compared
>>> to the user’s history at runtime, in real time (see the sketch after
>>> this list). This makes scaling queries as easy as scaling
>>> Elasticsearch, since it was meant to scale.
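>>>
>>> Roughly, the runtime query in (5) boils down to an Elasticsearch
>>> bool/should terms query over the indicator fields. A hand-written
>>> sketch (the field names, items, and index layout are made up, not
>>> the UR’s exact query):
>>>
>>> # the user's recent history, bucketed by indicator type
>>> history = {"purchase": ["ipad", "iphone"],
>>>            "view": ["macbook", "airpods"]}
>>>
>>> # one "should" clause per indicator field; items whose stored
>>> # indicator lists overlap the history most rank highest
>>> query = {
>>>     "size": 10,
>>>     "query": {
>>>         "bool": {
>>>             "should": [{"terms": {field: items}}
>>>                        for field, items in history.items()]
>>>         }
>>>     }
>>> }
>>> # then POST this to Elasticsearch, e.g.
>>> # requests.post("http://localhost:9200/urindex/_search", json=query)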
>>>
>>> I have done cross-validation comparisons, but they are a bit unfair,
>>> and the winner depends on the dataset. In real life CCO serves more
>>> users than ALS, since it uses more behavior, and so tends to win for
>>> this reason. It’s nearly impossible to compare this with
>>> cross-validation, so A/B tests are our only metric.
>>>
>>> We have a slide deck showing some of these comparisons here:
>>> https://docs.google.com/presentation/d/1HpHZZiRmHpMKtu86rOKBJ70cd58VyTOUM1a8OmKSMTo/edit?usp=sharing
>>>
>>>
>>> On Apr 13, 2017, at 2:39 AM, Dennis Honders <dennishonders@gmail.com>
>>> wrote:
>>>
>>> Hello,
>>>
>>> I am using the Similar Product template. (I'm not a data scientist.)
>>> The template uses the ALS algorithm and the cooccurrence algorithm.
>>>
>>> The ALS algorithm is described quite well on the Apache Spark MLlib
>>> website. The Apache Mahout documentation about the cooccurrence
>>> algorithm is quite general, and it is not clear what the differences
>>> between these algorithms are. They both use matrices to describe
>>> relations but use a different approach to factorize the matrices?
>>>
>>> I would also like to know a bit more about the parameters of both
>>> algorithms in the engine.json. What could be the impact of changing
>>> the values?
>>>
>>>    - ALS: rank, nIterations, lambda and seed.
>>>    - Cooccurrence: "n"
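>>>
>>> As far as I can tell from the Spark docs, the ALS parameters map
>>> onto MLlib roughly like this (my own toy sketch; the comments are my
>>> reading of what each parameter does, so please correct me):
>>>
>>> from pyspark import SparkContext
>>> from pyspark.mllib.recommendation import ALS, Rating
>>>
>>> sc = SparkContext("local", "als-sketch")
>>> # toy implicit interactions: Rating(user, item, strength)
>>> ratings = sc.parallelize([Rating(1, 1, 1.0), Rating(1, 2, 1.0),
>>>                           Rating(2, 2, 1.0), Rating(2, 3, 1.0)])
>>>
>>> model = ALS.trainImplicit(
>>>     ratings,
>>>     rank=2,         # number of latent features per user/item
>>>     iterations=10,  # nIterations: alternation steps between factors
>>>     lambda_=0.01,   # regularization; larger resists overfitting
>>>     seed=3)         # fixes the random initialization of the factors
>>>
>>> print(model.recommendProducts(1, 2))  # top-2 items for user 1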
>>>
>>> The two algorithms produce different results. Is there a general way
>>> of comparing them?
>>>
>>> Greetings,
>>>
>>> Dennis
>>>
>>>
>>
>
