lucene-solr-user mailing list archives

From "Christine Poerschke (BLOOMBERG/ LONDON)" <cpoersc...@bloomberg.net>
Subject RE: LTR feature extraction performance issues
Date Tue, 31 Oct 2017 12:48:01 GMT
Hi Brian,

I just tried to explore the scenario you describe with the techproducts example and can
reproduce what you see:

# step 1: start solr with techproducts example and ltr enabled
# step 2: upload one feature (originalScore) and one model using that feature
# step 3: examine cache stats via the Admin UI (all zero to start with)
# step 4: run a query which includes feature extraction e.g. [features] in fl
# step 5: examine cache stats to see lookups but no inserts
# step 6: run a query with feature extraction _and_ re-ranking using the model
# step 7: examine cache stats to see both lookups and inserts
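
For readers who want to repeat steps 4 and 6, the two queries differ only in whether an 'rq' parameter is present. The sketch below just builds the two request URLs. The host, port, handler path, query term and the model name "myModel" are assumptions for illustration and are not taken from this thread; substitute whatever was uploaded in step 2.

```python
from urllib.parse import urlencode

# Assumed local techproducts endpoint; adjust host, port and handler as needed.
BASE_URL = "http://localhost:8983/solr/techproducts/query"
# Hypothetical model name; use the name of the model uploaded in step 2.
MODEL_NAME = "myModel"


def extraction_only_params(user_query):
    """Step 4: feature extraction only, via [features] in fl."""
    return {"q": user_query, "fl": "id,score,[features]"}


def extraction_and_rerank_params(user_query, rerank_docs=100):
    """Step 6: the same feature extraction plus re-ranking via rq."""
    params = extraction_only_params(user_query)
    params["rq"] = f"{{!ltr model={MODEL_NAME} reRankDocs={rerank_docs}}}"
    return params


step4_url = f"{BASE_URL}?{urlencode(extraction_only_params('test'))}"
step6_url = f"{BASE_URL}?{urlencode(extraction_and_rerank_params('test'))}"
```

Requesting step4_url and then step6_url, checking the QUERY_DOC_FV cache stats after each, should correspond to steps 4 to 7 above.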

Looking around the code, the cache insert happens in FeatureLogger.java [1], which is called
by the Rescorer [2]. This allows the 'fl' feature logging to reuse the feature values
calculated as part of the 'rq' re-ranking.

However, if there is no feature value in the cache (because no 'rq' re-ranking happened),
then the feature value is calculated by LTRFeatureLoggerTransformerFactory.java [3] and, based
on code inspection, the results of that calculation are not added to the cache.
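
The behaviour described above can be illustrated with a toy model. This is not the Solr code: the class, the function names and the key layout are all hypothetical. It only mimics the pattern in which the re-ranking path inserts into the cache while the logging path looks up and, on a miss, computes without inserting.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFeatureCache:
    """Stand-in for a Solr user cache; it only counts lookups and inserts."""
    entries: dict = field(default_factory=dict)
    lookups: int = 0
    inserts: int = 0

    def get(self, key):
        self.lookups += 1
        return self.entries.get(key)

    def put(self, key, value):
        self.inserts += 1
        self.entries[key] = value


def extract_features(query, doc_id):
    # Hypothetical feature computation; the values themselves are meaningless.
    return [float(len(query)), float(doc_id)]


def rerank_path(cache, query, doc_id):
    """Models the 'rq' path: compute the vector and insert it into the cache."""
    vector = extract_features(query, doc_id)
    cache.put((query, doc_id), vector)
    return vector


def logging_path(cache, query, doc_id):
    """Models the 'fl' path: look up, and on a miss compute without inserting."""
    vector = cache.get((query, doc_id))
    if vector is None:
        vector = extract_features(query, doc_id)  # result is not cached
    return vector


cache = ToyFeatureCache()

# Feature extraction only: lookups accumulate, inserts stay at zero.
for doc_id in range(3):
    logging_path(cache, "ipod", doc_id)
stats_fl_only = (cache.lookups, cache.inserts)

# Feature extraction with re-ranking: the rq path inserts, the fl path reuses.
for doc_id in range(3):
    rerank_path(cache, "ipod", doc_id)
    logging_path(cache, "ipod", doc_id)
stats_with_rq = (cache.lookups, cache.inserts)
```

In this toy model, stats_fl_only shows lookups with zero inserts, which is the same shape as the cache stats Brian reported.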

It might be interesting to explore if/how that logic [3] could be changed.

--Christine

[1] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/FeatureLogger.java#L51-L60
[2] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRRescorer.java#L185-L205
[3] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.1.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java#L267-L280

----- Original Message -----
From: solr-user@lucene.apache.org
To: solr-user@lucene.apache.org
At: 10/30/17 16:55:14

I'm still having this issue. Does anyone have LTR feature extraction successfully running
and have cache inserts/hits?

--Brian

-----Original Message-----
From: Brian Yee [mailto:byee@wayfair.com] 
Sent: Tuesday, October 24, 2017 12:14 PM
To: solr-user@lucene.apache.org
Subject: RE: LTR feature extraction performance issues

Hi Alessandro,

Unfortunately some of my most important features are query dependent. I think I found an issue
though. I don't think my features are being inserted into the cache. Notice "cumulative_inserts:0".
There are a lot of lookups, but since there appear to be no values in the cache, the hitratio
is 0.

stats:
cumulative_evictions:0
cumulative_hitratio:0
cumulative_hits:0
cumulative_inserts:0
cumulative_lookups:215319
evictions:0
hitratio:0
hits:0
inserts:0
lookups:3303
size:0
warmupTime:0


My configs are as follows:

<cache name="QUERY_DOC_FV" class="solr.search.LRUCache" size="4096" initialSize="2048"
autowarmCount="4096" regenerator="solr.search.NoOpRegenerator" />

  <queryParser name="ltr" class="org.apache.solr.ltr.search.LTRQParserPlugin"/>

  <transformer name="features" class="org.apache.solr.ltr.response.transform.LTRFeatureLoggerTransformerFactory">
    <str name="fvCacheName">QUERY_DOC_FV</str>
    <str name="defaultFormat">sparse</str>
  </transformer>

Would anyone have any idea why my features are not being inserted into the cache? Is there
an additional config setting I need?


--Brian

-----Original Message-----
From: alessandro.benedetti [mailto:a.benedetti@sease.io] 
Sent: Monday, October 23, 2017 10:01 AM
To: solr-user@lucene.apache.org
Subject: Re: LTR feature extraction performance issues

It strictly depends on the kind of features you are using.
At the moment there is just one cache for all the features.
This means that even if you have 1 query-dependent feature and 100 document-dependent features,
a different value for the query-dependent one will invalidate the cache entry for the full
vector [1].
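
A small sketch of why this matters, assuming (hypothetically) that the cache holds one entry per combination of query-side inputs and document. The key layout, class name and feature values below are invented for illustration and do not reproduce how Solr builds its key.

```python
N_DOC_FEATURES = 100  # document-dependent features, as in the example above


class VectorCache:
    """Toy cache holding one whole feature vector per (query inputs, doc)."""

    def __init__(self):
        self.entries = {}
        self.features_computed = 0

    def vector_for(self, query_inputs, doc_id):
        # Assumption: the key covers all query-side inputs plus the document id.
        key = (tuple(sorted(query_inputs.items())), doc_id)
        if key not in self.entries:
            # Miss: every feature is recomputed, including the
            # document-dependent ones that did not change.
            doc_features = [float(doc_id + i) for i in range(N_DOC_FEATURES)]
            query_feature = float(len(query_inputs["user_query"]))
            self.features_computed += N_DOC_FEATURES + 1
            self.entries[key] = doc_features + [query_feature]
        return self.entries[key]


cache = VectorCache()
cache.vector_for({"user_query": "ipod"}, doc_id=7)       # miss: 101 computed
cache.vector_for({"user_query": "ipod"}, doc_id=7)       # hit: nothing computed
after_repeat = cache.features_computed

cache.vector_for({"user_query": "ipod nano"}, doc_id=7)  # miss: all 101 again
after_change = cache.features_computed
```

Changing only the query-dependent input forces all 101 features to be recomputed for the same document, which is the cost that per-feature caching would avoid.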

You may want to optimise your features (where possible).

[1]  https://issues.apache.org/jira/browse/SOLR-10448



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
