jackrabbit-oak-issues mailing list archives

From "Vikas Saurabh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8166) Index definition with orderable property definitions with and without functions breaks index
Date Thu, 04 Apr 2019 01:14:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809416#comment-16809416

Vikas Saurabh commented on OAK-8166:

[~nitigup], I think that if we post-process the document to filter out duplicate properties,
we leave ourselves open to bugs where we might (unintentionally) add a duplicate property. In
the current case the value is the same coming from both property definitions, but in a buggy
case that might not hold, and we might actually index an incorrect value.
Also, btw, note that, duplicate field or not, your analysis shows that the function index is
indexing the values of its parameters too. So there's an unnecessary addition to the index
(for the case where a non-function property definition doesn't exist).
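
To illustrate the concern about post-processing, here is a minimal sketch. It is a hypothetical model, not Oak or Lucene code: {{DedupSketch}}, {{Field}}, {{dedupKeepFirst}} and {{failOnDuplicate}} are invented names, and the field values are made up. It shows that a keep-first filter silently picks one of two differing values, while rejecting duplicates surfaces the problem.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Oak or Lucene code: a document is a list of (name, value) fields.
class DedupSketch {
    static final class Field {
        final String name;
        final String value;

        Field(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Post-processing filter: keeps the first value per field name and silently drops the rest.
    static List<Field> dedupKeepFirst(List<Field> fields) {
        Map<String, Field> byName = new LinkedHashMap<>();
        for (Field f : fields) {
            byName.putIfAbsent(f.name, f);
        }
        return new ArrayList<>(byName.values());
    }

    // Stricter alternative: reject duplicates instead of hiding them.
    static void failOnDuplicate(List<Field> fields) {
        Map<String, Field> byName = new LinkedHashMap<>();
        for (Field f : fields) {
            if (byName.putIfAbsent(f.name, f) != null) {
                throw new IllegalArgumentException("Field \"" + f.name + "\" appears more than once");
            }
        }
    }

    public static void main(String[] args) {
        List<Field> doc = new ArrayList<>();
        // Made-up values: two definitions disagree about the same field.
        doc.add(new Field(":dvjcr:content/metadata/dc:title", "value from definition A"));
        doc.add(new Field(":dvjcr:content/metadata/dc:title", "different value from definition B"));
        for (Field f : dedupKeepFirst(doc)) {
            // Only the first value survives; the disagreement is no longer visible.
            System.out.println(f.name + " = " + f.value);
        }
    }
}
```

With the filter in place, which value ends up in the index depends only on the order in which the fields were added.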

A few things that might be useful (this is from my memory of the last time I was working in
this area... so some details might be slightly off... the broad points should be accurate
though):
* the matchers are used for 2 cases:
*# figuring out whether a repository change can affect indexed data (at least checked at {{LuceneIndexEditor#propertyUpdated}})
*# actually indexing relative properties
For a function index, the first case needs matchers for the parameters of the function in
question; there can be a list of them, for example with the coalesce function that we support.
If any of the parameter properties changes in the repository, then the function value would
change, and hence the change should get flagged as "can make a change in index".
The second case, though, isn't relevant from the function point of view, as pushing the
indexed value is handled at {{FulltextDocumentMaker#indexFunctionRestrictions}}.
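
The first case can be sketched as follows. This is a simplified model under stated assumptions, not Oak's API: {{FunctionChangeSketch}}, {{FunctionDef}} and {{canAffectIndex}} are hypothetical names, the coalesce expression string and the {{jcr:content/jcr:title}} property are illustrative only, and real matching also deals with node state and relative paths.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Oak's API: a function definition plus the properties it reads.
class FunctionChangeSketch {
    static final class FunctionDef {
        final String expression;
        final List<String> parameterProperties;

        FunctionDef(String expression, String... parameterProperties) {
            this.expression = expression;
            this.parameterProperties = Arrays.asList(parameterProperties);
        }

        // Case 1 only: can a change to this property alter the function's value?
        boolean canAffectIndex(String changedProperty) {
            return parameterProperties.contains(changedProperty);
        }
    }

    public static void main(String[] args) {
        // A single-parameter function, taken from the index definition in the issue.
        FunctionDef lower = new FunctionDef(
                "fn:lower-case(jcr:content/metadata/@dc:title)",
                "jcr:content/metadata/dc:title");
        // A multi-parameter function; the expression text and second property are illustrative.
        FunctionDef coalesce = new FunctionDef(
                "coalesce of dc:title and jcr:title",
                "jcr:content/metadata/dc:title",
                "jcr:content/jcr:title");

        System.out.println(lower.canAffectIndex("jcr:content/metadata/dc:title"));
        System.out.println(coalesce.canAffectIndex("jcr:content/jcr:title"));
        System.out.println(lower.canAffectIndex("jcr:content/jcr:title"));
    }
}
```

The model deliberately has no method that produces indexed fields: it only answers whether a change is relevant, which is all the function case needs from the matchers.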

Here's an idea that should work:
diff --git a/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/Aggregate.java
index 0dad9ae6c5..385a6bfc9f 100644
--- a/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/Aggregate.java
+++ b/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/Aggregate.java
@@ -427,6 +427,17 @@ public class Aggregate {
+    public static class FunctionInclude extends PropertyInclude {
+        public FunctionInclude(PropertyDefinition pd) {
+            super(pd);
+        }
+        @Override
+        public void collectResults(String nodePath, NodeState nodeState, ResultCollector results) {
+            // Function includes aren't indexed using aggregate of property parameters of the function itself
+        }
+    }
     public interface ResultCollector {
         void onResult(NodeIncludeResult result);
diff --git a/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexDefinition.java
index f203585037..98d73c8161 100644
--- a/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexDefinition.java
+++ b/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/IndexDefinition.java
@@ -1182,7 +1182,7 @@ public class IndexDefinition implements Aggregate.AggregateMapper {
                         for (String p : properties) {
                             if (PathUtils.getDepth(p) > 1) {
                                 PropertyDefinition pd2 = new PropertyDefinition(this, p,
-                                propAggregate.add(new Aggregate.PropertyInclude(pd2));
+                                propAggregate.add(new Aggregate.FunctionInclude(pd2));
                         // a function index has no other options

I'm not completely confident, though, as I might be missing some edge case. We can discuss
this further, but there are a few cases I think we should check: that ordering works, and that
changing a property gets indexed, both for normal and relative properties. We can also discuss
this on a call if you want.
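
The idea in the diff above can be modelled in isolation. The sketch below is a simplified stand-in, not the real {{Aggregate}} classes: the {{matches}} method, the list-based collector and the helper names are assumptions made for illustration. It shows an include that still takes part in change detection but contributes nothing when results are collected.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the include classes; not the real Oak types.
class FunctionIncludeSketch {
    static class PropertyInclude {
        final String propertyPath;

        PropertyInclude(String propertyPath) {
            this.propertyPath = propertyPath;
        }

        // Change detection: does a change to this path concern the include?
        boolean matches(String changedPath) {
            return propertyPath.equals(changedPath);
        }

        // Indexing: a normal include contributes its property to the document.
        void collectResults(List<String> results) {
            results.add(propertyPath);
        }
    }

    static class FunctionInclude extends PropertyInclude {
        FunctionInclude(String propertyPath) {
            super(propertyPath);
        }

        @Override
        void collectResults(List<String> results) {
            // Intentionally empty: the function value is indexed elsewhere.
        }
    }

    static List<String> collect(List<PropertyInclude> includes) {
        List<String> results = new ArrayList<>();
        for (PropertyInclude include : includes) {
            include.collectResults(results);
        }
        return results;
    }

    static boolean anyMatch(List<PropertyInclude> includes, String changedPath) {
        for (PropertyInclude include : includes) {
            if (include.matches(changedPath)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "jcr:content/metadata/dc:title";
        List<PropertyInclude> includes = new ArrayList<>();
        includes.add(new PropertyInclude(title));
        includes.add(new FunctionInclude(title));
        // The property is collected once, although two includes refer to it.
        System.out.println(collect(includes));
        System.out.println(anyMatch(includes, title));
    }
}
```

In this model a function-only definition still reports a match for its parameter, so a change to that property is still flagged, while no field for the raw parameter value is collected.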

> Index definition with orderable property definitions with and without functions breaks index
> --------------------------------------------------------------------------------------------
>                 Key: OAK-8166
>                 URL: https://issues.apache.org/jira/browse/OAK-8166
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.8.12
>            Reporter: Tom Blackford
>            Priority: Major
>         Attachments: OAK-8166_1.patch
> If an index definition contains the same orderable property with and without functions, it will fail to index any node which contains that property. The failure will be logged as
> Steps to reproduce:
> * Configure index with the two property definitions shown at [2].
> * Refresh the index definition
> * Modify a node that falls under the definition - it will fail with the exception shown at [1]
> * Modify the 'non-function' index definition to not be orderable (orderable=false)
> * Refresh the index definition
> * Modify the same node - note there is no exception.
> Thanks to [~catholicon] for assistance identifying root cause.
> [1]
> {code}
> 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to index the node [/content/dam/Unknown-2.png]
> java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title" appears more than once in this document (only one value is allowed per field)
> 	at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62)
> 	at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125)
> 	at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
> 	at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236)
> 	at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
> 	at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455)
> 	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507) [org.apache.jackrabbit.oak-lucene:1.8.9]
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateDocument(DefaultIndexWriter.java:86)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:258)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:140)
> 	at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74)
> {code}
> [2] 
> {code}
> "dcTitle": {
>     "jcr:primaryType": "nt:unstructured",
>     "nodeScopeIndex": "true",
>     "useInSuggest": "true",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "useInSpellcheck": "true",
>     "name": "jcr:content/metadata/dc:title",
>     "boost": "2.0"
>     },
>   "dcTitleLowercase": {
>     "jcr:primaryType": "nt:unstructured",
>     "ordered": "true",
>     "propertyIndex": "true",
>     "function": "fn:lower-case(jcr:content/metadata/@dc:title)"
>     }
> {code}

This message was sent by Atlassian JIRA
