jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-4566) Multiplexing store support in Lucene Indexes
Date Mon, 01 Aug 2016 12:51:20 GMT

    [ https://issues.apache.org/jira/browse/OAK-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401983#comment-15401983 ]

Chetan Mehrotra commented on OAK-4566:

Supporting multiple IndexReaders on the query side involves 2 things
* Creating an individual iterator of LuceneResultRow for each reader and combining them
* Handling sorting

The sorting aspect makes things tricky. As the query engine (QE) would not be doing the sorting
here, we need to ensure that the iterators are merge sorted, with the comparison done at the
LuceneResultRow level. For that there are 2 options
# O1 - Do the comparison by reading the value from the PropertyState. The query also has an
associated NodeState, which can be used to read the value of the ordered property, and the
comparison can be done based on that. Note that the root NodeState bound to the query would be
more recent than the NodeState at which the index was populated/updated, so the node itself might
no longer exist. In such a case we might need to rely on the NodeState at which the index update
was detected.
# O2 - Make use of the doc values which are stored in the Lucene index and perform the comparison
based on the stored value. This would involve accessing the doc value of the specific property
as the iterator is traversed

Had a discussion with [~teofili] - both approaches are feasible, and performance benchmarks
would be needed to confirm the result.

Note that the actual sorting is still taken care of by Lucene. It's just the merging of the two
iterators that requires a comparison to be performed
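
A minimal sketch of that merge step, assuming each reader already yields its rows in sorted order. The names Row, MergingIterator and MergeDemo are hypothetical and only for illustration; they are not Oak classes. Row.sortValue stands in for the value of the ordered property, whether it is read from a PropertyState (O1) or from doc values (O2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical stand-in for LuceneResultRow: a path plus the ordered property's value.
final class Row {
    final String path;
    final long sortValue;
    Row(String path, long sortValue) { this.path = path; this.sortValue = sortValue; }
}

// Merges several already-sorted iterators into one sorted iterator.
final class MergingIterator<T> implements Iterator<T> {
    // One heap entry per source: the source's current head and the remainder of the source.
    private static final class Head<E> {
        final E value;
        final Iterator<E> rest;
        Head(E value, Iterator<E> rest) { this.value = value; this.rest = rest; }
    }

    private final PriorityQueue<Head<T>> heap;

    MergingIterator(List<Iterator<T>> sources, Comparator<? super T> order) {
        Comparator<Head<T>> byValue = (a, b) -> order.compare(a.value, b.value);
        heap = new PriorityQueue<>(byValue);
        for (Iterator<T> it : sources) {
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !heap.isEmpty();
    }

    @Override
    public T next() {
        Head<T> head = heap.poll();
        if (head == null) {
            throw new NoSuchElementException();
        }
        // Refill the heap from the source that just gave up its head.
        if (head.rest.hasNext()) {
            heap.add(new Head<>(head.rest.next(), head.rest));
        }
        return head.value;
    }
}

final class MergeDemo {
    // Merges per-reader row lists (each assumed sorted by sortValue) and returns the paths.
    static List<String> mergedPaths(List<List<Row>> perReader) {
        List<Iterator<Row>> sources = new ArrayList<>();
        for (List<Row> rows : perReader) {
            sources.add(rows.iterator());
        }
        Comparator<Row> order = Comparator.comparingLong(r -> r.sortValue);
        Iterator<Row> merged = new MergingIterator<>(sources, order);
        List<String> paths = new ArrayList<>();
        while (merged.hasNext()) {
            paths.add(merged.next().path);
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Row> defaultDir = Arrays.asList(new Row("/content/a", 1), new Row("/content/c", 5));
        List<Row> mount1Dir = Arrays.asList(new Row("/libs/b", 3), new Row("/libs/d", 7));
        System.out.println(mergedPaths(Arrays.asList(defaultDir, mount1Dir)));
    }
}
```

Only the head of each iterator is compared at any time, so the per-row cost of the merge is dominated by how the sort value is obtained, which is where O1 and O2 would differ in a benchmark.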

/cc  [~tmueller] [~alex.parvulescu] [~catholicon]

> Multiplexing store support in Lucene Indexes
> --------------------------------------------
>                 Key: OAK-4566
>                 URL: https://issues.apache.org/jira/browse/OAK-4566
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.6
> Similar to OAK-3403, we need to support a multiplexing store in Lucene indexes. This can be
done by having multiple directories under a given index definition.
> For example, currently the Lucene index files for an index /oak:index/assetIndex are stored in
the node /oak:index/assetIndex/:dir. To support multiple indexes which get stored in different
stores we can have a structure like
> {noformat}
> /oak:index/assetIndex
>      + :oak:mount1-dir
>      + :dir
> {noformat}
> In the above structure, index content for paths which are part of mount1 would be stored in
Lucene files stored under {{:oak:mount1-dir}} while the rest go in the default location {{:dir}}
> # *Writing* - At the time of indexing the {{LuceneIndexEditor}} should pick up the correct
writer, i.e. the one which is mapped to the right directory node in the repository
> # *Reading* - For reading we would have one {{IndexSearcher}} per directory node; the
query would then be executed against both and a joined cursor would be made
