lucene-dev mailing list archives

From "Shai Erera (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3441) Add NRT support to LuceneTaxonomyReader
Date Mon, 19 Nov 2012 13:26:58 GMT


Shai Erera updated LUCENE-3441:

    Attachment: LUCENE-3441.patch

Patch introduces NRT support by doing the following:

* Add a constructor which takes a DirTaxoWriter. DirTaxoReader uses the writer's internal
IndexWriter instance to obtain NRT readers.

* Remove refresh() in favor of a static TaxonomyReader.openIfChanged. Similar to DirectoryReader,
the method returns null if no changes were made to the taxonomy, or a new TR instance otherwise.

* Extracted the logic of creating the ParentArray and ChildrenArrays from DirTaxoReader into
their own classes. As a result:
** DirTaxoReader code is greatly simplified.
** These classes are now immutable, which simplifies the logic of DirTaxoReader even further.

* TaxonomyReader is now an abstract class instead of an interface, and a few methods (e.g. close(),
incRef(), decRef() etc.) were pulled up into it from DirTaxoReader and made final.
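
To make the contract in the list above concrete, here is a minimal, self-contained sketch. It does not use the Lucene classes or the patch code: ToyTaxoWriter, ToyReaderBase and ToyTaxoReader are hypothetical stand-ins, and the "NRT reader" is modeled as an immutable snapshot of an append-only list. It only illustrates three points: the reader is constructed from the writer, openIfChanged returns null when nothing changed, and the lifecycle methods live in an abstract base class and are final.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the reader/writer contract; these are NOT the Lucene classes. */
final class NrtTaxonomySketch {

    /** Hypothetical writer: an append-only list of category paths. */
    static final class ToyTaxoWriter {
        private final List<String> categories = new ArrayList<>();

        /** Adds the category if it is missing and returns its ordinal. */
        synchronized int addCategory(String path) {
            int ordinal = categories.indexOf(path);
            if (ordinal < 0) {
                categories.add(path);
                ordinal = categories.size() - 1;
            }
            return ordinal;
        }

        synchronized int size() { return categories.size(); }

        /** Immutable point-in-time copy, standing in for an NRT reader. */
        synchronized List<String> snapshot() {
            return Collections.unmodifiableList(new ArrayList<>(categories));
        }
    }

    /** Abstract base with final lifecycle methods; subclasses only implement doClose(). */
    abstract static class ToyReaderBase implements AutoCloseable {
        private final AtomicInteger refCount = new AtomicInteger(1);
        private final AtomicBoolean closed = new AtomicBoolean(false);

        public final void incRef() { refCount.incrementAndGet(); }

        public final void decRef() {
            if (refCount.decrementAndGet() == 0) {
                doClose();
            }
        }

        /** Idempotent: releases the reference taken at construction, once. */
        @Override
        public final void close() {
            if (closed.compareAndSet(false, true)) {
                decRef();
            }
        }

        public final int getRefCount() { return refCount.get(); }

        protected abstract void doClose();
    }

    /** Hypothetical reader: sees the taxonomy as it was when the reader was opened. */
    static final class ToyTaxoReader extends ToyReaderBase {
        private final ToyTaxoWriter writer;
        private final List<String> categories;
        private volatile boolean released = false;

        ToyTaxoReader(ToyTaxoWriter writer) {
            this.writer = writer;
            this.categories = writer.snapshot();
        }

        int getSize() { return categories.size(); }
        int getOrdinal(String path) { return categories.indexOf(path); }
        boolean isReleased() { return released; }

        @Override
        protected void doClose() { released = true; }

        /** Returns null if the taxonomy is unchanged, otherwise a new reader. */
        static ToyTaxoReader openIfChanged(ToyTaxoReader old) {
            // The toy writer is append-only, so comparing sizes detects changes.
            if (old.writer.size() == old.categories.size()) {
                return null;
            }
            return new ToyTaxoReader(old.writer);
        }
    }

    public static void main(String[] args) {
        ToyTaxoWriter writer = new ToyTaxoWriter();
        writer.addCategory("Author");
        ToyTaxoReader reader = new ToyTaxoReader(writer);

        writer.addCategory("Author/Mark Twain");
        ToyTaxoReader newer = ToyTaxoReader.openIfChanged(reader);
        if (newer != null) {   // null would mean: keep using the old reader
            reader.close();    // release the old instance
            reader = newer;
        }
        System.out.println("categories visible: " + reader.getSize());
        reader.close();
    }
}
```

The caller-side pattern in main mirrors how DirectoryReader.openIfChanged is typically used: keep the old instance when null comes back, otherwise swap and release the old one.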

Not strictly related, but I could not avoid making these changes as well:

* Removed the excess verbosity in DirTaxoReader. Some of it is no longer necessary b/c DirTaxoReader
is simplified, the rest was just too much IMO.

* Improved the documentation of the different methods, again mostly by shortening it and
keeping it focused.

NOTE: I put a CHANGES entry under the back-compat section of 4.1. I intend to commit this
to 4.x, and it is sort of a back-compat break, albeit a simple one.

There's one nocommit which I'd love for someone to take a look at and perhaps propose a solution for.
I documented it there, but I'll repeat the issue here - DirTaxoReader maintains two LRU caches
which I'd like to share with the new instance returned from openIfChanged. Currently the code
copies them in full, which is not so efficient in an NRT case.

While I could just share the instances, I'm worried that two TR instances would then both be
able to e.g. change the cache size, or add/remove entries from it.
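
The trade-off between sharing and copying can be sketched with a small LRU map. This is only an illustration under assumptions: LruCache below is a hypothetical class built on java.util.LinkedHashMap in access order, not the cache class DirTaxoReader actually uses, and copyOf is a hypothetical helper that fills a fresh instance entry by entry. The sketch is not thread-safe and does not measure the cost of either approach.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy illustration of sharing vs. copying an LRU cache between two readers. */
final class LruCacheSharingSketch {

    /** Hypothetical minimal LRU map; not the cache class used by the patch. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private static final long serialVersionUID = 1L;
        private int maxSize;

        LruCache(int maxSize) {
            super(16, 0.75f, true); // true = iterate in access order (LRU first)
            this.maxSize = maxSize;
        }

        void setMaxSize(int maxSize) { this.maxSize = maxSize; }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least recently used entry once the limit is exceeded.
            return size() > maxSize;
        }
    }

    /** Copies all entries into a fresh instance; O(n) work per reopen. */
    static <K, V> LruCache<K, V> copyOf(LruCache<K, V> source, int maxSize) {
        LruCache<K, V> copy = new LruCache<>(maxSize);
        copy.putAll(source);
        return copy;
    }

    public static void main(String[] args) {
        LruCache<Integer, String> oldReaderCache = new LruCache<>(2);
        oldReaderCache.put(0, "Author");
        oldReaderCache.put(1, "Author/Mark Twain");

        // Option 1: share the instance. A put through the new reader's
        // reference can evict an entry the old reader still relies on.
        LruCache<Integer, String> shared = oldReaderCache;
        shared.put(2, "Author/Jane Austen");
        System.out.println("old reader still has ordinal 0: "
                + oldReaderCache.containsKey(0));

        // Option 2: copy. The two readers are isolated, but every reopen
        // pays for copying the whole cache.
        LruCache<Integer, String> copied = copyOf(oldReaderCache, 2);
        copied.put(3, "Author/Leo Tolstoy");
        System.out.println("old reader size after copy + put: "
                + oldReaderCache.size());
    }
}
```

A middle ground might be to hand the new reader a reference to the old cache but expose it only through methods that cannot resize it; whether that fits DirTaxoReader depends on details not covered in this message.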

Also note the weird behavior I mentioned about cloning the cache, as opposed to adding all of
its entries to a new instance. I still haven't gotten to the bottom of why cloning the cache is
so horribly slow, while adding everything to a fresh new instance is so cheap ...

> Add NRT support to LuceneTaxonomyReader
> ---------------------------------------
>                 Key: LUCENE-3441
>                 URL:
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>         Attachments: LUCENE-3441.patch
> Currently LuceneTaxonomyReader does not support NRT - i.e., on changes to LuceneTaxonomyWriter,
> you cannot have the reader updated, like IndexReader/Writer. In order to do that we need to
> do the following:
> # Add ctor to LuceneTaxonomyReader to allow you to instantiate it with LuceneTaxonomyWriter.
> # Add API to LuceneTaxonomyWriter to expose its internal IndexReader
> # Change LTR.refresh() to return an LTR, rather than void. This is actually not strictly
> related to that issue, but since we'll need to modify refresh() impl, I think it'll be good
> to change its API as well. Since all of facet API is @lucene.experimental, no backwards issues
> here (and the sooner we do it, the better).

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators