predictionio-user mailing list archives

From Bruno LEBON <b.le...@redfakir.fr>
Subject Re: Items blacklisted in the query made to Elasticsearch by UR
Date Fri, 31 Mar 2017 13:45:49 GMT
Hi,

Thanks for your answer. We had already tried that, but it doesn't change
anything: we still have blacklisted items (mainly, or perhaps only, items
from the primary event, from what I can see).

I think the piece of code in charge of blacklisting is this one (from
https://github.com/pferrel/template-scala-parallel-universal-recommendation/blob/master/src/main/scala/URAlgorithm.scala):

  /** Create a list of item ids that the user has interacted with or are not to be included in recommendations */
  def getExcludedItems(userEvents: Seq[Event], query: Query): Seq[String] = {

    val blacklistedItems = userEvents.filter { event =>
      // either a list or an empty list of filtering events so honor them
      blacklistEvents match {
        case Nil => modelEventNames.head equals event.event
        case _   => blacklistEvents contains event.event
      }
    }.map(_.targetEntityId.getOrElse("")) ++ query.blacklistItems.getOrEmpty.distinct

    // Now conditionally add the query item itself
    val includeSelf = query.returnSelf.getOrElse(returnSelf)
    val allExcludedItems = if (!includeSelf && query.item.nonEmpty) {
      blacklistedItems :+ query.item.get
    } // add the query item to be excluded
    else {
      blacklistedItems
    }
    allExcludedItems.distinct
  }

But my knowledge of Scala is very limited, so I don't understand the
details. Does it say that if the parameter blacklistEvents is empty, i.e.
[], then no events are to be excluded (apart from the includeSelf option)?
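
One way to check this reading is a minimal, self-contained sketch that
isolates the filter from the snippet above. The Event case class, the
BlacklistSketch object and the item id "item-42" are simplified stand-ins
invented for illustration, not the real PredictionIO types or data; "string"
and "slip" are ids taken from the logged query. Under these assumptions, an
empty blacklistEvents appears to fall back to the primary event
(modelEventNames.head) rather than excluding nothing, which would be
consistent with the must_not ids matching the facet values in the logged
query quoted below.

```scala
// Simplified stand-in for the PredictionIO Event type (assumption: only the
// two fields read by the filter are modeled here).
final case class Event(event: String, targetEntityId: Option[String])

object BlacklistSketch {
  // Mirrors the filter in getExcludedItems, with the engine parameters passed
  // in explicitly instead of being read from the algorithm instance.
  def blacklisted(
      userEvents: Seq[Event],
      modelEventNames: Seq[String],
      blacklistEvents: Seq[String]
  ): Seq[String] =
    userEvents.filter { e =>
      blacklistEvents match {
        // empty list: items of the primary (first) event are blacklisted
        case Nil => modelEventNames.head == e.event
        // non-empty list: items of the listed events are blacklisted
        case _   => blacklistEvents.contains(e.event)
      }
    }.map(_.targetEntityId.getOrElse("")).distinct

  def main(args: Array[String]): Unit = {
    // Hypothetical user history: two primary "facet" events, one "view" event.
    val history = Seq(
      Event("facet", Some("string")),
      Event("facet", Some("slip")),
      Event("view", Some("item-42"))
    )
    val eventNames = Seq("facet", "view")
    println(blacklisted(history, eventNames, Nil))
    println(blacklisted(history, eventNames, Seq("view")))
  }
}
```

If this reading holds, "blacklistEvents": [] would still blacklist the items
of the first entry in eventNames, which is facet in our engine.json.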

Do I have the right version of UR? (
https://github.com/pferrel/template-scala-parallel-universal-recommendation)

2017-03-30 20:00 GMT+02:00 Pat Ferrel <pat@occamsmachete.com>:

> "blacklistEvents": [[]], should be "blacklistEvents": [],
>
>
> On Mar 30, 2017, at 8:57 AM, Bruno LEBON <b.lebon@redfakir.fr> wrote:
>
> Hello,
>
> We are testing the Universal Recommender on a cluster built following the
> ActionML tutorial. Once the build/train/deploy is done, we send PIO a
> request to get recommendations.
> For example:
> curl -H "Content-Type: application/json" -d '{ "user": "4e810ef4-977a-4f04-b585-cf2c2996ec93", "num": 11 }' http://localhost:8001/queries.json
>
> In the pio.log we see the requests made to Elasticsearch. They look like:
>
> {"size":11,"query":{"bool":{"should":[{"terms":{"facet":["estag_begin-couleur-noir-estag_end","cocooning","sexy","charme","estag_begin-taille-105h-estag_end","estag_begin-taille-4-estag_end","estag_begin-primadonna-estag_end","transparent","estag_begin-aubade-estag_end","estag_begin-couleur-rouge-estag_end","une-piece","estag_begin-simone-perele-estag_end","maintien","moins-de-20-euros-intervalle-de-prix","estag_begin-taille-taille-unique-estag_end","estag_begin-moins-50-pour-cent-estag_end","elasthanne","blouse","body","coque","string","slip","estag_begin-taille-95a-estag_end"]}},{"terms":{"view":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[],"must_not":{"ids":{"values":["estag_begin-taille-95a-estag_end","string","estag_begin-aubade-estag_end","slip","elasthanne","coque","body","blouse","estag_begin-moins-50-pour-cent-estag_end","estag_begin-primadonna-estag_end","estag_begin-taille-taille-unique-estag_end","moins-de-20-euros-intervalle-de-prix","maintien","estag_begin-simone-perele-estag_end","une-piece","estag_begin-couleur-rouge-estag_end","transparent","sexy","estag_begin-taille-4-estag_end","estag_begin-taille-105h-estag_end","charme","cocooning","estag_begin-couleur-noir-estag_end"],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}
>
> The important part is that the must_not clause is not empty. We want it to
> be empty, and we have the following engine.json:
> {
>   "comment":"",
>   "id": "default",
>   "description": "settings",
>   "engineFactory": "org.template.RecommendationEngine",
>   "datasource": {
>     "params" : {
>       "name": "sample-handmade-data.txt",
>       "appName": "piourcluster",
>       "eventNames": ["facet","view"]
>     }
>   },
>   "sparkConf": {
>     "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>     "spark.kryo.referenceTracking": "false",
>     "spark.kryoserializer.buffer": "300m",
>     "es.index.auto.create": "true",
>     "es.nodes":"espionode1:9200,espionode2:9200,espionode3:9200"
>   },
>   "algorithms": [
>     {
>       "name": "ur",
>       "params": {
>         "appName": "piourcluster",
>         "indexName": "urindex",
>         "typeName": "items",
>         "eventNames": ["facet", "view"],
>         "blacklistEvents": [[]],
>         "maxEventsPerEventType": 50000,
>         "maxCorrelatorsPerEventType": 50,
>         "maxQueryEvents": 100,
>         "num": 11,
>         "rankings": [
>           {
>             "name": "popRank",
>             "type": "popular"
>           }
>         ],
>         "returnSelf": true
>       }
>     }
>   ]
> }
>
> From what we understand, setting the blacklistEvents parameter to an array
> containing an empty array tells UR that we don't want any event to be
> blacklisted, not even the primary one.
> We also added the parameter "returnSelf": true to ask UR not to blacklist
> any items that are part of the query.
>
> So why do we have blacklisted items in our query (i.e. the must_not part
> of it)?
>
> (Note that when we change engine.json and launch a deploy, we see some
> parameter values appear in the log, so we know we are modifying the right
> engine.json file.)
>
> Regards
> Bruno
