lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3441) Add NRT support to LuceneTaxonomyReader
Date Mon, 19 Nov 2012 14:30:58 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500263#comment-13500263 ]

Shai Erera commented on LUCENE-3441:
------------------------------------

bq. why not have a single instance of the LRUCache for all time, and never call .clear() on
it?

That will help as long as previous TR instances are indeed on their way to die. Otherwise,
if e.g. an app, for some reason, reopens a TR but doesn't close the old one and uses both
(again, for some really unknown reason), then two TR instances might affect each other.

Now, since that's a very stupid thing to do, I'm not sure that I care about this much, as
long as we preserve correctness. Meaning, if one instance reduces the size of the cache
while another increases it, that's the app's problem. If the two instances evict each other's
entries from the LRU cache left, right and center, that's the app's problem too.

The correctness issue that I'm worried about is this (suppose that TR-1 and TR-2 share the same
cache instance):
* TR-1 looks for a category "foo", doesn't find it and adds to the cache the fact that the
category is unknown
* TR-2 looks for the category "foo", which exists in its newer version of the taxonomy, and
receives the ordinal -1, which denotes that the category doesn't exist --- WRONG !!
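
To make the scenario above concrete, here is a minimal sketch. It is not the actual
taxonomy reader code: ToyReader, its snapshot map (standing in for the on-disk taxonomy)
and the plain HashMap used as the shared cache are all hypothetical, and LRU eviction is
left out because it does not matter for this case.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedNegativeCacheDemo {
    static final int MISSING = -1;

    /** Hypothetical stand-in for a taxonomy reader: a fixed snapshot plus a cache. */
    static class ToyReader {
        private final Map<String, Integer> snapshot; // stands in for the on-disk taxonomy
        private final Map<String, Integer> cache;    // may be shared between readers

        ToyReader(Map<String, Integer> snapshot, Map<String, Integer> cache) {
            this.snapshot = snapshot;
            this.cache = cache;
        }

        int getOrdinal(String category) {
            Integer cached = cache.get(category);
            if (cached != null) {
                return cached; // may be a "not found" entry written by another reader
            }
            Integer onDisk = snapshot.get(category);
            int ordinal = (onDisk == null) ? MISSING : onDisk;
            cache.put(category, ordinal); // misses are cached too: this is the problem
            return ordinal;
        }
    }

    /** Returns what TR-1 and then TR-2 see when looking up "foo". */
    static int[] run() {
        Map<String, Integer> sharedCache = new HashMap<>();

        Map<String, Integer> oldSnapshot = new HashMap<>();
        oldSnapshot.put("root", 0);
        Map<String, Integer> newSnapshot = new HashMap<>(oldSnapshot);
        newSnapshot.put("foo", 1); // "foo" exists only in the newer taxonomy

        ToyReader tr1 = new ToyReader(oldSnapshot, sharedCache);
        ToyReader tr2 = new ToyReader(newSnapshot, sharedCache);

        int seenByTr1 = tr1.getOrdinal("foo"); // not found, and the miss is cached
        int seenByTr2 = tr2.getOrdinal("foo"); // served from the cache, not its snapshot
        return new int[] { seenByTr1, seenByTr2 };
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("TR-1 sees " + result[0] + ", TR-2 sees " + result[1]);
    }
}
```

In this toy setup TR-2 gets -1 even though its own snapshot maps "foo" to ordinal 1,
because the cached miss written by TR-1 is consulted first.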

To solve that, we could avoid storing in the cache the fact that a category does not exist.
Really, lookups of missing categories shouldn't happen - apps do not ask the taxonomy for
random categories. So then:

* TR-1 looks for a category "foo", doesn't find it in the cache and DOES NOT update the cache
w/ that info. It goes to disk, doesn't find it there, returns -1.
* TR-2 looks for the category "foo", which exists in its newer version of the taxonomy, fetches
it from disk and stores the ordinal in the cache.
* TR-1 looks for the category "foo" again, now receives an ordinal which is larger than its
taxonomy size --- might be a problem !!
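
The second sequence can be sketched the same way. Again this is only an illustration under
assumptions: ToyReader, its snapshot map (standing in for the on-disk taxonomy) and the
HashMap used as the shared cache are hypothetical, not the real reader or LRU cache. The
only rule modelled is "never cache a miss".

```java
import java.util.HashMap;
import java.util.Map;

public class SharedPositiveCacheDemo {
    static final int MISSING = -1;

    /** Hypothetical stand-in for a taxonomy reader that never caches misses. */
    static class ToyReader {
        private final Map<String, Integer> snapshot; // stands in for the on-disk taxonomy
        private final Map<String, Integer> cache;    // shared between readers

        ToyReader(Map<String, Integer> snapshot, Map<String, Integer> cache) {
            this.snapshot = snapshot;
            this.cache = cache;
        }

        /** Number of categories this reader's snapshot knows about. */
        int size() {
            return snapshot.size();
        }

        int getOrdinal(String category) {
            Integer cached = cache.get(category);
            if (cached != null) {
                return cached; // may come from a reader with a newer snapshot
            }
            Integer onDisk = snapshot.get(category);
            if (onDisk == null) {
                return MISSING; // the miss is NOT written to the cache
            }
            cache.put(category, onDisk);
            return onDisk;
        }
    }

    /** Returns { TR-1 first lookup, TR-2 lookup, TR-1 second lookup, TR-1 size }. */
    static int[] run() {
        Map<String, Integer> sharedCache = new HashMap<>();

        Map<String, Integer> oldSnapshot = new HashMap<>();
        oldSnapshot.put("root", 0);
        Map<String, Integer> newSnapshot = new HashMap<>(oldSnapshot);
        newSnapshot.put("foo", 1); // "foo" exists only in the newer taxonomy

        ToyReader tr1 = new ToyReader(oldSnapshot, sharedCache);
        ToyReader tr2 = new ToyReader(newSnapshot, sharedCache);

        int first = tr1.getOrdinal("foo");  // miss, nothing cached
        int second = tr2.getOrdinal("foo"); // found on "disk", ordinal cached
        int third = tr1.getOrdinal("foo");  // cache hit with an ordinal TR-1 does not have
        return new int[] { first, second, third, tr1.size() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("TR-1: " + r[0] + ", TR-2: " + r[1]
            + ", TR-1 again: " + r[2] + " (TR-1 size " + r[3] + ")");
    }
}
```

Here the third lookup hands TR-1 ordinal 1 while its own snapshot holds a single category
(valid ordinals: 0 only), which is the "ordinal larger than its taxonomy size" case.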

In general, since I don't think that apps access the taxonomy for random ordinals or categories,
the second solution might be good. Never store in the cache the fact that an ordinal/category
is not found + don't clear() the cache, only nullify its reference + hope for the best :)?
                
> Add NRT support to LuceneTaxonomyReader
> ---------------------------------------
>
>                 Key: LUCENE-3441
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3441
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-3441.patch
>
>
> Currently LuceneTaxonomyReader does not support NRT - i.e., on changes to LuceneTaxonomyWriter,
> you cannot have the reader updated, like IndexReader/Writer. In order to do that we need to
> do the following:
> # Add ctor to LuceneTaxonomyReader to allow you to instantiate it with LuceneTaxonomyWriter.
> # Add API to LuceneTaxonomyWriter to expose its internal IndexReader
> # Change LTR.refresh() to return an LTR, rather than void. This is actually not strictly
> related to that issue, but since we'll need to modify refresh() impl, I think it'll be good
> to change its API as well. Since all of facet API is @lucene.experimental, no backwards issues
> here (and the sooner we do it, the better).


