jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-4566) Multiplexing store support in Lucene Indexes
Date Tue, 02 Aug 2016 06:14:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403433#comment-15403433 ]

Chetan Mehrotra commented on OAK-4566:

With Lucene's {{MultiReader}}, the query side now works in the presence of multiple readers.
The pending work is to modify the JMX MBeans and, after that, the suggester etc. to support it properly.
Those can be done later, so we can now start the merge work for this feature.

[~alexparvulescu] It would be helpful if you could review the high-level changes before I do the
merge. The changes can be seen [here|https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-4566],
split into multiple commits. Before each key part is modified, a refactoring commit first
moves the code out to a new class without much functional change; only then are the changes
for the current feature made.

The key parts of the changes are:

*Index Side*

{{LuceneIndexEditorContext}} now makes use of a {{LuceneIndexWriterFactory}} to construct
an instance of {{LuceneIndexWriter}}, which takes care of adding the {{Document}} created by {{LuceneIndexEditor}}
to the actual Lucene index. So far all this logic was in {{LuceneIndexEditorContext}}, which was
refactored with [39e4867|https://github.com/chetanmeh/jackrabbit-oak/commit/39e486704bfb77dce85bc90dbbaab7fb42e828d1].

To add support for multiple writers configured per mount, {{MultiplexingIndexWriter}} is introduced.
It determines the {{Mount}} for the path being indexed and then delegates to a {{DefaultIndexWriter}}
(which has a matching :data node configured, like _:oak:mount-private-index-dir_).

One key difference from the approach taken in PropertyIndexes is that instead of having multiple
writers which are bound to different Mounts, we have a single {{MultiplexingIndexWriter}} which determines
the Mount for the path and then picks up the {{DefaultIndexWriter}} configured for that Mount.
This ensures that Mount related calculations are minimized per path (done only once).
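
The delegation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the branch: {{MultiplexingWriterSketch}}, {{DirectoryWriter}}, {{mountFor}}, {{writerFor}} and the prefix-based mount lookup are hypothetical stand-ins, and the directory naming only mirrors the {{:oak:mount1-dir}} example from the issue description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are not the real Oak classes.
class MultiplexingWriterSketch {

    /** Minimal writer contract: accept a document for a repository path. */
    interface IndexWriter {
        void updateDocument(String path, String doc);
    }

    /** Stand-in for a per-mount writer bound to one directory node. */
    static final class DirectoryWriter implements IndexWriter {
        final String dirName;
        final List<String> writtenPaths = new ArrayList<>();

        DirectoryWriter(String dirName) {
            this.dirName = dirName;
        }

        @Override
        public void updateDocument(String path, String doc) {
            writtenPaths.add(path); // a real writer would add the document to Lucene
        }
    }

    /** Resolves the mount once per path, then delegates to that mount's writer. */
    static final class MultiplexingWriter implements IndexWriter {
        private final Map<String, String> mountRoots; // path prefix -> mount name (assumed model)
        private final Map<String, DirectoryWriter> writers = new LinkedHashMap<>();

        MultiplexingWriter(Map<String, String> mountRoots) {
            this.mountRoots = mountRoots;
        }

        /** Returns the mount owning the path, or "default" if no mount root matches. */
        String mountFor(String path) {
            for (Map.Entry<String, String> e : mountRoots.entrySet()) {
                String root = e.getKey();
                if (path.equals(root) || path.startsWith(root + "/")) {
                    return e.getValue();
                }
            }
            return "default";
        }

        /** Lazily creates one writer per mount; the naming scheme is illustrative. */
        DirectoryWriter writerFor(String mount) {
            return writers.computeIfAbsent(mount, m ->
                new DirectoryWriter("default".equals(m) ? ":dir" : ":oak:" + m + "-dir"));
        }

        @Override
        public void updateDocument(String path, String doc) {
            // Single mount lookup per path, then plain delegation.
            writerFor(mountFor(path)).updateDocument(path, doc);
        }
    }

    public static void main(String[] args) {
        MultiplexingWriter writer =
            new MultiplexingWriter(Collections.singletonMap("/libs", "mount1"));
        writer.updateDocument("/libs/foo", "doc-a");
        writer.updateDocument("/content/bar", "doc-b");
        System.out.println(writer.writerFor("mount1").dirName);
        System.out.println(writer.writerFor("default").dirName);
    }
}
```

The point of the sketch is that callers only ever see one writer; the mount is resolved inside {{updateDocument}} exactly once per path, rather than each mount-bound writer checking whether the path belongs to it.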

*Query Side*

{{IndexNode}} is refactored to make use of {{LuceneIndexReaderFactory}} to construct multiple
instances of {{LuceneIndexReader}}, which are then wrapped in a {{MultiReader}} (if there is more than
one; otherwise the single reader is used directly). So far all this logic was present in {{IndexNode}},
and it was moved out with [f73df0d4d428|https://github.com/chetanmeh/jackrabbit-oak/commit/f73df0d4d4288ae90ab29b3ca7e5939b8a14da1c].
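
The "wrap only when there is more than one reader" rule can be illustrated with a small self-contained sketch. The types below ({{Reader}}, {{SimpleReader}}, {{CompositeReader}}, {{combine}}) are hypothetical stand-ins so that the example runs without Lucene on the classpath; in the actual change the composite role is played by Lucene's {{MultiReader}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; not Lucene or Oak classes.
class ReaderCompositionSketch {

    /** Minimal reader contract, reduced to a document count. */
    interface Reader {
        int numDocs();
    }

    /** Stand-in for a reader opened on a single directory node. */
    static final class SimpleReader implements Reader {
        private final int docs;

        SimpleReader(int docs) {
            this.docs = docs;
        }

        @Override
        public int numDocs() {
            return docs;
        }
    }

    /** Presents several sub-readers as one logical reader. */
    static final class CompositeReader implements Reader {
        private final List<Reader> subReaders;

        CompositeReader(List<Reader> subReaders) {
            this.subReaders = new ArrayList<>(subReaders);
        }

        @Override
        public int numDocs() {
            int total = 0;
            for (Reader r : subReaders) {
                total += r.numDocs();
            }
            return total;
        }
    }

    /** Uses a single reader directly; wraps only when there are several. */
    static Reader combine(List<Reader> readers) {
        if (readers.isEmpty()) {
            throw new IllegalArgumentException("at least one reader is required");
        }
        return readers.size() == 1 ? readers.get(0) : new CompositeReader(readers);
    }

    public static void main(String[] args) {
        Reader defaultReader = new SimpleReader(3);
        Reader mountReader = new SimpleReader(4);
        Reader combined = combine(Arrays.asList(defaultReader, mountReader));
        System.out.println(combined.numDocs());
    }
}
```

Returning the single reader unwrapped keeps the common case (no extra mounts) free of any composite overhead, while the query code still deals with one reader object in both cases.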

> Multiplexing store support in Lucene Indexes
> --------------------------------------------
>                 Key: OAK-4566
>                 URL: https://issues.apache.org/jira/browse/OAK-4566
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.6
> Similar to OAK-3403, we need to support the multiplexing store in Lucene indexes. This can be
done by having multiple directories under a given index definition.
> For example, the Lucene index files for an index /oak:index/assetIndex are currently stored in
node /oak:index/assetIndex/:dir. To support multiple indexes which get stored in different
stores, we can have a structure like
> {noformat}
> /oak:index/assetIndex
>      + :oak:mount1-dir
>      + :dir
> {noformat}
> In the above structure, index content for paths which are part of mount1 would be stored in
Lucene files stored under {{:oak:mount1-dir}}, while the rest go in the default location {{:dir}}
> # *Writing* - At the time of indexing, the {{LuceneIndexEditor}} should pick up the correct
writer, i.e. the one which is mapped to the right directory node in the repository
> # *Reading* - For reading we would have one {{IndexSearcher}} per directory node, and
then the query would be executed against both and a joined cursor would be made

This message was sent by Atlassian JIRA
