predictionio-user mailing list archives

From Luciano Vandi <vandi.luci...@gmail.com>
Subject Re: Customers clustering
Date Fri, 07 Jul 2017 16:31:11 GMT
OK, got it. But this way, if a customer bought 8 items from category10 and
1 each from category1 and category2, they would rank high for cluster_1 even
though they are more interested in category10. Am I wrong?
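
To make the concern concrete, here is a minimal sketch in Python (not run
against Elasticsearch) that ranks the two clusters from the quoted example by
cosine similarity, once with the purchase history reduced to a set of
categories and once with purchase counts kept as weights. It also builds one
possible query body in which each category's count becomes a boost on a term
clause. The field name "categories", the helper names and the sample history
are assumptions made for illustration, and real Elasticsearch scoring will
differ from plain cosine similarity.

```python
import math
from collections import Counter

# Cluster docs from the quoted example; the elided entries of cluster_2 are left out.
CLUSTERS = {
    "cluster_1": ["category 1", "category 2"],
    "cluster_2": ["category 5", "category 10"],
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as term -> weight."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_clusters(history, use_counts):
    """Rank clusters from most to least similar to the purchase history."""
    # Without counts, each category appears once no matter how often it was bought.
    user_vector = Counter(history) if use_counts else Counter(set(history))
    scores = {
        name: cosine(user_vector, Counter(categories))
        for name, categories in CLUSTERS.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def build_query(history, field="categories"):
    """Build a bool/should query body with one boosted term clause per category.

    The field name is hypothetical; it must match the un-analysed field
    used when the cluster docs were indexed.
    """
    clauses = [
        {"term": {field: {"value": category, "boost": float(count)}}}
        for category, count in sorted(Counter(history).items())
    ]
    return {"query": {"bool": {"should": clauses}}}


# 8 purchases from category 10, 1 each from category 1 and category 2.
history = ["category 10"] * 8 + ["category 1", "category 2"]

binary_ranking = rank_clusters(history, use_counts=False)
weighted_ranking = rank_clusters(history, use_counts=True)
query = build_query(history)
```

With the history reduced to a set, cluster_1 comes first because it matches
two of the three distinct categories. With counts kept as weights, cluster_2
comes first because category 10 dominates the user vector.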

2017-07-07 17:48 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:

> You'll have to work out the ES query JSON; use arrays of un-analysed
> strings.
>
> ES docs indexed
>   cluster_1: [“category 1”, “category 2”]
>   cluster_2: [“category 5”, “category 10”, …]
>
>
>   user_purchase_history: [“category 1”, “category 2”]
>
> So the query would be: [“category 1”, “category 2”] and it would return
> the clusters with cluster_1 ranked highest.
>
> As you can see, the terms in the user history can be used as a query to
> return the cluster-id that is most similar. This is called K-Nearest
> Neighbors (KNN) and is done using cosine similarity. ES and Solr (both
> based on Lucene) are great KNN engines for sparse data.
>
>
> On Jul 7, 2017, at 4:30 AM, Luciano Vandi <vandi.luciano@gmail.com> wrote:
>
> Thanks Pat, you're right. This is what I'm trying to do.
>
> It's not clear to me how to query Elasticsearch with the user’s history of
> bought item categories. Can you give an example?
>
> 2017-07-06 23:13 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
>
>> Actually it sounds like you already have clusters that are made up of
>> categories and you want to know which cluster definition is most similar to
>> what the user has bought? If so you don’t need clustering but similarity.
>> This is pretty easy to do by putting each cluster into Elasticsearch as a
>> doc with a list of categories (so 6 or so docs), then using the user’s
>> history of bought item categories as the query. You’ll get all clusters
>> ranked from most similar (to the user’s history) to least.
>>
>> You would have to store user history on your own.
>>
>> This could be put into a simple template but if you already have user
>> history, it may be overkill.
>>
>>
>>
>> On Jul 6, 2017, at 1:39 PM, Pat Ferrel <pat@occamsmachete.com> wrote:
>>
>> There are 2 clustering templates but it looks like they both need to be
>> moved from Prediction.io <http://prediction.io/> to Apache PIO, which
>> should be easy. See the template gallery here:
>> http://predictionio.incubator.apache.org/gallery/template-gallery/
>>
>>
>> On Jul 6, 2017, at 12:35 PM, Luciano Vandi <vandi.luciano@gmail.com>
>> wrote:
>>
>> Hi there, I'm new to the mailing list. Thanks to the guys at Apache.org
>> <http://apache.org/>, ActionML and to anyone from the community!
>>
>> I have a question regarding a project I'm working on. From a database of
>> customers/orders I would like to export buy/view events in order to assign
>> each customer to one or more of 6 predefined clusters. Each cluster reflects
>> the macro-category associated with the bought/viewed item.
>>
>> Then I would like to query a service to get all customers within a
>> cluster, or all clusters to which a customer belongs.
>>
>> Is there any pio-template I should start to explore, or do I need to ask
>> the ActionML team for consultancy?
>>
>> Have a nice day!
>>
>>
>> Luciano
>> --
>>
>>
>>
>
>
> --
>
>   *Soluzioni PaaS e SaaS per il Commercio Elettronico*
>   Email: vandi.luciano@gmail.com
>   Mobile: (+39) 340 90 21 354
>
>


