predictionio-user mailing list archives

From Bruno LEBON <b.le...@redfakir.fr>
Subject Re: Items blacklisted in the query made to Elasticsearch by UR
Date Thu, 06 Apr 2017 07:31:11 GMT
*BTW I assume "user": "069bbbbd-8661-453f-8c89-ac50aea0c0d8" has those
items in their “facet” history? Otherwise I’m not sure where they’d come
from.*

Yes, I confirm that this user has those items in their facet history.
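
One way to check where the must_not ids come from is to compare them with the "facet" terms of the same Elasticsearch query. The sketch below does this in Scala for four ids copied from the query quoted further down in this thread. The object and method names are made up for illustration and are not part of the UR.

```scala
object MustNotCheck {
  // Four values copied from the "should" / "terms" / "facet" clause of the quoted query.
  val facetTerms: Seq[String] = Seq(
    "estag_begin-couleur-noir-estag_end", "cocooning", "sexy", "charme")

  // The same four values as they appear, in a different order, in "must_not" / "ids" / "values".
  val mustNotIds: Seq[String] = Seq(
    "sexy", "charme", "cocooning", "estag_begin-couleur-noir-estag_end")

  /** True when both clauses carry the same ids, ignoring order and duplicates. */
  def sameIds(a: Seq[String], b: Seq[String]): Boolean = a.toSet == b.toSet

  def main(args: Array[String]): Unit =
    println(sameIds(facetTerms, mustNotIds))
}
```

Extending both lists to all 23 values of the quoted query appears to give the same answer, which is consistent with the user's facet history being used as the exclusion list.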

2017-04-05 18:18 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:

> Ok thanks for ruling out a couple things, I’ll take a look at this.
>
> BTW I assume "user": "069bbbbd-8661-453f-8c89-ac50aea0c0d8" has those
> items in their “facet” history? Otherwise I’m not sure where they’d come
> from.
>
>
> On Apr 5, 2017, at 2:31 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:
>
> Yes, we have pio 0.10.0, the UR v 0.5.0 and the blacklistEvents is
> disabled.
>
> We send this kind of query to PIO:
> { "user": "069bbbbd-8661-453f-8c89-ac50aea0c0d8", "num": 11 }
>
> The JSON generated for Elasticsearch is:
>
>
> *{"size":11,"query":{"bool":{"should":[{"terms":{"facet":["estag_begin-couleur-noir-estag_end","cocooning","sexy","charme","estag_begin-taille-105h-estag_end","estag_begin-taille-4-estag_end","estag_begin-primadonna-estag_end","transparent","estag_begin-aubade-estag_end","estag_begin-couleur-rouge-estag_end","une-piece","estag_begin-simone-perele-estag_end","maintien","moins-de-20-euros-intervalle-de-prix","estag_begin-taille-taille-unique-estag_end","estag_begin-moins-50-pour-cent-estag_end","elasthanne","blouse","body","coque","string","slip","estag_begin-taille-95a-estag_end"]}},{"terms":{"view":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[],"must_not":{"ids":{"values":["estag_begin-taille-95a-estag_end","string","estag_begin-aubade-estag_end","slip","elasthanne","coque","body","blouse","estag_begin-moins-50-pour-cent-estag_end","estag_begin-primadonna-estag_end","estag_begin-taille-taille-unique-estag_end","moins-de-20-euros-intervalle-de-prix","maintien","estag_begin-simone-perele-estag_end","une-piece","estag_begin-couleur-rouge-estag_end","transparent","sexy","estag_begin-taille-4-estag_end","estag_begin-taille-105h-estag_end","charme","cocooning","estag_begin-couleur-noir-estag_end"],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}*
>
> We also set "returnSelf": true, as we want every item to be recommended
> to the user.
>
>
>
> 2017-04-04 17:30 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
>
>> Ok, so you are using pio 0.10.0, the UR v 0.5.0 and have disabled the
>> blacklistEvents as shown below?
>>
>> Then when you query for a user you are not getting all items returned?
>>
>> Can you share an example of the query you send to pio and the JSON that
>> is created for Elasticsearch?
>>
>>
>> On Apr 4, 2017, at 7:13 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:
>>
>> Hi,
>>
>> Sorry, my bad. I searched for the piece of code on the internet and your
>> repository came first. I had the right repo in prod
>> (https://github.com/actionml/universal-recommender); I double-checked.
>> Sorry for the misunderstanding.
>>
>> We still have the same problem.
>>
>> Here is the engine.json we use:
>> *{*
>> *  "comment":"",*
>> *  "id": "default",*
>> *  "description": "settings",*
>> *  "engineFactory": "org.template.RecommendationEngine",*
>> *  "datasource": {*
>> *    "params" : {*
>> *      "name": "sample-handmade-data.txt",*
>> *      "appName": "piourcluster",*
>> *      "eventNames": ["facet","view"]*
>> *    }*
>> *  },*
>> *  "sparkConf": {*
>> *    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",*
>> *    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",*
>> *    "spark.kryo.referenceTracking": "false",*
>> *    "spark.kryoserializer.buffer": "300m",*
>> *    "es.index.auto.create": "true",*
>> *    "es.nodes":"espionode1:9200,espionode2:9200,espionode3:9200"*
>> *  },*
>> *"algorithms": [*
>> *    {*
>> *      "name": "ur",*
>> *      "params": {*
>> *        "appName": "piourcluster",*
>> *        "indexName": "urindex",*
>> *        "typeName": "items",*
>> *        "eventNames": ["facet", "view"],*
>> *        "blacklistEvents": [],*
>> *        "maxEventsPerEventType": 50000,*
>> *        "maxCorrelatorsPerEventType": 50,*
>> *        "maxQueryEvents": 100,*
>> *        "num": 11,*
>> *        "rankings": [*
>> *          {*
>> *            "name": "popRank",*
>> *            "type": "popular"*
>> *          }*
>> *        ],*
>> *        "returnSelf": true*
>> *      }*
>> *    }*
>> *  ]*
>> *}*
>>
>> We don't have a blacklist in our query. The query is basic: we use the
>> Java API, giving it the user id and the number of recommendations we want
>> back.
>>
>>
>> 2017-03-31 21:55 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
>>
>>> You should not be using code from that repo. See the pio template
>>> gallery; it points to the correct template. My personal version is for
>>> experimental branches.
>>>
>>> The repo is here: https://github.com/actionml/universal-recommender
>>>
>>> The function is here:
>>> https://github.com/actionml/universal-recommender/blob/master/src/main/scala/URAlgorithm.scala#L634
>>> and looks like it is doing the right thing.
>>>
>>> Try it with UR v0.5.0 from the correct repo and if it doesn’t work, I’ll
>>> take a look. Please send along the engine.json you used, just to be sure we
>>> are on the same page. BTW are you using a blacklist in your query also?
>>> Please give an example query.
>>>
>>>
>>> On Mar 31, 2017, at 6:45 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:
>>>
>>> Hi,
>>>
>>> Thanks for your answer. We tried that already but it doesn't change
>>> anything; we still have blacklisted items (primary events mainly or only,
>>> from what I see).
>>>
>>> I think the piece of code in charge of blacklisting is this one (from
>>> https://github.com/pferrel/template-scala-parallel-universal-recommendation/blob/master/src/main/scala/URAlgorithm.scala):
>>>
>>> *  /** Create a list of item ids that the user has interacted with or are not to be included in recommendations */*
>>> *  def getExcludedItems(userEvents: Seq[Event], query: Query): Seq[String] = {*
>>>
>>> *    val blacklistedItems = userEvents.filter { event =>*
>>> *      // either a list or an empty list of filtering events so honor them*
>>> *      blacklistEvents match {*
>>> *        case Nil => modelEventNames.head equals event.event*
>>> *        case _   => blacklistEvents contains event.event*
>>> *      }*
>>> *    }.map(_.targetEntityId.getOrElse("")) ++ query.blacklistItems.getOrEmpty.distinct*
>>>
>>> *    // Now conditionally add the query item itself*
>>> *    val includeSelf = query.returnSelf.getOrElse(returnSelf)*
>>> *    val allExcludedItems = if (!includeSelf && query.item.nonEmpty) {*
>>> *      blacklistedItems :+ query.item.get*
>>> *    } // add the query item to be excluded*
>>> *    else {*
>>> *      blacklistedItems*
>>> *    }*
>>> *    allExcludedItems.distinct*
>>> *  }*
>>>
>>> But my knowledge of Scala is very limited, so I don't understand the
>>> details. Does it say that if the parameter blacklistEvents is empty, i.e.
>>> [], then no events are to be excluded (plus/minus the includeSelf option)?
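
Read literally, the quoted snippet seems to do the opposite when the list is empty: the `case Nil` branch keeps events whose name equals `modelEventNames.head`, i.e. the primary event. The following is a minimal, self-contained sketch of just that filter, assuming the deployed template behaves like the quoted code. `Event` here is a simplified stand-in for the PredictionIO class, `excludedItems` is a hypothetical name, and the query blacklist and returnSelf handling are left out.

```scala
// Simplified stand-in for the PredictionIO Event class (illustration only).
final case class Event(event: String, targetEntityId: Option[String])

object BlacklistSketch {
  /** Item ids excluded for one user, mirroring the filter in the quoted snippet. */
  def excludedItems(
      userEvents: Seq[Event],
      modelEventNames: Seq[String],
      blacklistEvents: Seq[String]): Seq[String] =
    userEvents.filter { e =>
      blacklistEvents match {
        // Empty list: fall back to the primary (first) event name.
        case Nil => modelEventNames.head == e.event
        // Non-empty list: keep only events whose name is listed.
        case _   => blacklistEvents.contains(e.event)
      }
    }.map(_.targetEntityId.getOrElse("")).distinct

  def main(args: Array[String]): Unit = {
    val history = Seq(
      Event("facet", Some("string")),
      Event("facet", Some("slip")),
      Event("view", Some("item-42"))) // "item-42" is a made-up id
    val names = Seq("facet", "view")
    println(excludedItems(history, names, Nil))         // empty blacklistEvents
    println(excludedItems(history, names, Seq("view"))) // explicit blacklistEvents
  }
}
```

With an empty `blacklistEvents`, the sketch excludes the targets of the "facet" events, which would match a must_not clause filled with the user's facet history.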
>>>
>>> Do I have the right version of UR?
>>> (https://github.com/pferrel/template-scala-parallel-universal-recommendation)
>>>
>>> 2017-03-30 20:00 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:
>>>
>>>> "blacklistEvents": [[]], should be "blacklistEvents": [],
>>>>
>>>>
>>>> On Mar 30, 2017, at 8:57 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:
>>>>
>>>> Hello,
>>>>
>>>> We are testing the Universal Recommender on a cluster built by following
>>>> the ActionML tutorial. Once the build/train/deploy is done, we send PIO a
>>>> request to get recommendations.
>>>> For example:
>>>> *curl -H "Content-Type: application/json" -d '{ "user":
>>>> "4e810ef4-977a-4f04-b585-cf2c2996ec93", "num": 11
>>>> }' http://localhost:8001/queries.json*
>>>>
>>>> In the pio.log we see the requests made to Elasticsearch. They look
>>>> like:
>>>>
>>>> *{"size":11,"query":{"bool":{"should":[{"terms":{"facet":["estag_begin-couleur-noir-estag_end","cocooning","sexy","charme","estag_begin-taille-105h-estag_end","estag_begin-taille-4-estag_end","estag_begin-primadonna-estag_end","transparent","estag_begin-aubade-estag_end","estag_begin-couleur-rouge-estag_end","une-piece","estag_begin-simone-perele-estag_end","maintien","moins-de-20-euros-intervalle-de-prix","estag_begin-taille-taille-unique-estag_end","estag_begin-moins-50-pour-cent-estag_end","elasthanne","blouse","body","coque","string","slip","estag_begin-taille-95a-estag_end"]}},{"terms":{"view":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[],"must_not":{"ids":{"values":["estag_begin-taille-95a-estag_end","string","estag_begin-aubade-estag_end","slip","elasthanne","coque","body","blouse","estag_begin-moins-50-pour-cent-estag_end","estag_begin-primadonna-estag_end","estag_begin-taille-taille-unique-estag_end","moins-de-20-euros-intervalle-de-prix","maintien","estag_begin-simone-perele-estag_end","une-piece","estag_begin-couleur-rouge-estag_end","transparent","sexy","estag_begin-taille-4-estag_end","estag_begin-taille-105h-estag_end","charme","cocooning","estag_begin-couleur-noir-estag_end"],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}*
>>>>
>>>> The important part is that the must_not clause is not empty. We want it
>>>> to be empty. We have the following engine.json:
>>>> *{*
>>>> *  "comment":"",*
>>>> *  "id": "default",*
>>>> *  "description": "settings",*
>>>> *  "engineFactory": "org.template.RecommendationEngine",*
>>>> *  "datasource": {*
>>>> *    "params" : {*
>>>> *      "name": "sample-handmade-data.txt",*
>>>> *      "appName": "piourcluster",*
>>>> *      "eventNames": ["facet","view"]*
>>>> *    }*
>>>> *  },*
>>>> *  "sparkConf": {*
>>>> *    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",*
>>>> *    "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",*
>>>> *    "spark.kryo.referenceTracking": "false",*
>>>> *    "spark.kryoserializer.buffer": "300m",*
>>>> *    "es.index.auto.create": "true",*
>>>> *    "es.nodes":"espionode1:9200,espionode2:9200,espionode3:9200"*
>>>> *  },*
>>>> *"algorithms": [*
>>>> *    {*
>>>> *      "name": "ur",*
>>>> *      "params": {*
>>>> *        "appName": "piourcluster",*
>>>> *        "indexName": "urindex",*
>>>> *        "typeName": "items",*
>>>> *        "eventNames": ["facet", "view"],*
>>>> *        "blacklistEvents": [[]],*
>>>> *        "maxEventsPerEventType": 50000,*
>>>> *        "maxCorrelatorsPerEventType": 50,*
>>>> *        "maxQueryEvents": 100,*
>>>> *        "num": 11,*
>>>> *        "rankings": [*
>>>> *          {*
>>>> *            "name": "popRank",*
>>>> *            "type": "popular"*
>>>> *          }*
>>>> *        ],*
>>>> *        "returnSelf": true*
>>>> *      }*
>>>> *    }*
>>>> *  ]*
>>>> *}*
>>>>
>>>> From what we understand, setting the blacklistEvents parameter to an
>>>> array containing an empty array tells UR that we don't want any event to
>>>> be blacklisted, not even the primary one.
>>>> We also added the parameter returnSelf: true to ask UR not to
>>>> blacklist any items that are part of the query.
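
Judging from the getExcludedItems snippet quoted elsewhere in this thread, returnSelf is only consulted when the query carries an item, so it would not change the exclusions for a user-only query. Below is a small sketch of that branch under this assumption. `Query` is a trimmed-down, hypothetical stand-in with only the two fields the branch reads, and `withQueryItem` is an illustrative name.

```scala
// Hypothetical, trimmed-down Query: only the fields this branch reads.
final case class Query(item: Option[String], returnSelf: Option[Boolean])

object ReturnSelfSketch {
  /** Appends the query item to the exclusions unless returnSelf resolves to true. */
  def withQueryItem(
      blacklisted: Seq[String],
      query: Query,
      defaultReturnSelf: Boolean): Seq[String] = {
    // The per-query value wins; otherwise use the engine.json default.
    val includeSelf = query.returnSelf.getOrElse(defaultReturnSelf)
    val all =
      if (!includeSelf && query.item.nonEmpty) blacklisted :+ query.item.get
      else blacklisted
    all.distinct
  }

  def main(args: Array[String]): Unit = {
    val userOnly = Query(item = None, returnSelf = None)
    // For a user-only query the result is the same whatever the default is.
    println(withQueryItem(Seq("slip"), userOnly, defaultReturnSelf = true))
    println(withQueryItem(Seq("slip"), userOnly, defaultReturnSelf = false))
  }
}
```

In this reading, ids already collected from the user's history stay excluded regardless of returnSelf.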
>>>>
>>>> So why do we have blacklisted items in our query (i.e. the must_not part
>>>> of it)?
>>>>
>>>> (Note that when we make a change in the engine.json and launch a deploy,
>>>> we see some of the parameter values appear in the log, so we know we are
>>>> modifying the right engine.json file.)
>>>>
>>>> Regards
>>>> Bruno
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
