lucene-dev mailing list archives

From "Amir Hadadi (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-8213) offload caching to a dedicated threadpool
Date Sun, 18 Mar 2018 13:22:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16404005#comment-16404005 ]

Amir Hadadi edited comment on LUCENE-8213 at 3/18/18 1:21 PM:
--------------------------------------------------------------

"It causes a latency hit regardless"

[~rcmuir] I will clarify the issue. Assume you have these queries:

q1: term query that matches 1 doc

q2: range query that matches 10M docs

And you query for "q1 AND q2"

q2 would be an IndexOrDocValuesQuery, so it would use doc values to run the range check on the
single doc that matches q1, and the BKD tree would not be scanned for 10M documents.

However, when q2 gets cached, a bit set covering all 10M matching docs is built and cached, which
causes "q1 AND q2" to suddenly spike in latency.

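To make the cost difference concrete, here is a toy model in plain Java. It is not Lucene code: the class RangeCostSketch, its methods, and the linear scan standing in for "collect every match of q2" are all assumptions made for illustration, and the index is scaled down to 1M docs. It only counts how many per-document checks each path performs.

```java
import java.util.BitSet;

/** Toy cost model (hypothetical, not Lucene code): counts per-document checks. */
final class RangeCostSketch {

    /** Number of per-document value checks performed by the most recent call. */
    static long checks;

    /** Doc-values style: test the range only on docs produced by the lead iterator (q1). */
    static BitSet verifyLeadDocs(long[] valueByDoc, int[] leadDocs, long min, long max) {
        checks = 0;
        BitSet hits = new BitSet(valueByDoc.length);
        for (int doc : leadDocs) {
            checks++;
            if (valueByDoc[doc] >= min && valueByDoc[doc] <= max) {
                hits.set(doc);
            }
        }
        return hits;
    }

    /** Cache-fill style: materialize every match of the range query (q2) into a bit set. */
    static BitSet buildFullBitSet(long[] valueByDoc, long min, long max) {
        checks = 0;
        BitSet all = new BitSet(valueByDoc.length);
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            checks++;
            if (valueByDoc[doc] >= min && valueByDoc[doc] <= max) {
                all.set(doc);
            }
        }
        return all;
    }

    public static void main(String[] args) {
        int numDocs = 1_000_000; // scaled down from the 10M docs in the example
        long[] valueByDoc = new long[numDocs];
        for (int doc = 0; doc < numDocs; doc++) {
            valueByDoc[doc] = doc;
        }
        int[] leadDocs = {42}; // q1 matches a single doc

        verifyLeadDocs(valueByDoc, leadDocs, 0, numDocs - 1);
        System.out.println("checks when verifying the lead doc only: " + checks);

        buildFullBitSet(valueByDoc, 0, numDocs - 1);
        System.out.println("checks when filling the cache entry:     " + checks);
    }
}
```

The first path does work proportional to the matches of q1, the second proportional to the matches of q2, and that second cost lands on whichever request happens to trigger the cache fill.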
 


was (Author: hermes):
"It causes a latency hit regardless"

I will clarify the issue. Assume you have these queries:

q1: term query that matches 1 doc

q2: range query that matches 10M docs

And you query for "q1 AND q2"

q2 would be an IndexOrDocValuesQuery and would use doc values to do a range check on the single
doc that matches q1, so the BKD tree won't be scanned for 10M documents.

However, when q2 gets cached, the bit set for the entire 10M docs is cached, which causes
"q1 AND q2" to suddenly spike in latency.

 

> offload caching to a dedicated threadpool
> -----------------------------------------
>
>                 Key: LUCENE-8213
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8213
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>    Affects Versions: 7.2.1
>            Reporter: Amir Hadadi
>            Priority: Minor
>              Labels: performance
>
> IndexOrDocValuesQuery allows combining non-selective range queries with a selective
> lead iterator in an optimized way. However, at some point the range query gets cached by
> a querying thread in LRUQueryCache, which negates the optimization of IndexOrDocValuesQuery
> for that specific query.
> It would be nice to see a caching implementation that offloads to a different thread
> pool, so that queries involving IndexOrDocValuesQuery would have consistent performance characteristics.
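
One possible shape for the offloading idea, as a minimal sketch in plain Java. This is not LRUQueryCache and not a proposed patch: OffloadingCacheSketch, getIfReady, and the string query key are hypothetical names introduced here. The assumption is that a querying thread never builds the bit set itself. It schedules the fill on a dedicated pool and keeps using its uncached path until the entry is ready.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: cache fills run on a dedicated pool, never on the querying thread. */
final class OffloadingCacheSketch {

    private final Map<String, Future<BitSet>> fills = new ConcurrentHashMap<>();
    private final ExecutorService cachingPool;

    OffloadingCacheSketch(ExecutorService cachingPool) {
        this.cachingPool = cachingPool;
    }

    /**
     * Returns the cached bit set if its fill has completed. Otherwise makes sure a fill is
     * scheduled (at most once per key) and returns null, so the caller can fall back to its
     * uncached path instead of paying for the fill.
     */
    BitSet getIfReady(String queryKey, Callable<BitSet> expensiveFill) {
        Future<BitSet> fill =
                fills.computeIfAbsent(queryKey, k -> cachingPool.submit(expensiveFill));
        if (!fill.isDone()) {
            return null;
        }
        try {
            return fill.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            // The fill failed; drop the entry so a later lookup can schedule it again.
            fills.remove(queryKey, fill);
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService cachingPool = Executors.newSingleThreadExecutor();
        OffloadingCacheSketch cache = new OffloadingCacheSketch(cachingPool);

        // Stand-in for collecting every match of the non-selective range query.
        Callable<BitSet> fillQ2 = () -> {
            BitSet bits = new BitSet(1_000_000);
            bits.set(0, 1_000_000);
            return bits;
        };

        BitSet first = cache.getIfReady("q2", fillQ2);
        System.out.println("first lookup served from cache: " + (first != null));

        cachingPool.shutdown();
        cachingPool.awaitTermination(10, TimeUnit.SECONDS);

        BitSet later = cache.getIfReady("q2", fillQ2);
        System.out.println("later lookup served from cache: " + (later != null));
    }
}
```

The sketch leaves out eviction, memory accounting, and per-segment keys, all of which a real query cache would need.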



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

