Subject: svn commit: r1701956 - /jackrabbit/site/live/oak/docs/query/lucene.html
Date: Wed, 09 Sep 2015 09:35:49 -0000
To: commits@jackrabbit.apache.org
From: chetanm@apache.org

Author: chetanm
Date: Wed Sep 9 09:35:49 2015
New Revision: 1701956

URL: http://svn.apache.org/r1701956
Log:
OAK-3367 - Boosting fields not working as expected

Publish the updated doc

Modified:
    jackrabbit/site/live/oak/docs/query/lucene.html

Modified: jackrabbit/site/live/oak/docs/query/lucene.html
URL: http://svn.apache.org/viewvc/jackrabbit/site/live/oak/docs/query/lucene.html?rev=1701956&r1=1701955&r2=1701956&view=diff
==============================================================================
--- jackrabbit/site/live/oak/docs/query/lucene.html (original)
+++ jackrabbit/site/live/oak/docs/query/lucene.html Wed Sep 9 09:35:49 2015
@@ -1,13 +1,13 @@
-
+
 Jackrabbit Oak - Lucene Index
@@ -210,7 +210,7 @@
boost
-
If the property is included in nodeScopeIndex then it defines the boost done for the index value against the given property name. Boost currently does not work as expected due to OAK-3367
+
If the property is included in nodeScopeIndex then it defines the boost done for the index value against the given property name. See Boost and Search Relevancy for more details
index
Determines if this property should be indexed. Mostly useful for fulltext index where some properties need to be excluded from getting indexed.
useInExcerpt
@@ -999,6 +999,42 @@
    - codec = "Lucene46"

Refer to OAK-2853 for details. Enabling the Lucene46 codec would lead to smaller and compact indexes.

+

+
+

Boost and Search Relevancy

+

@since Oak 1.2.5

+

When fulltext indexing is enabled, Oak internally creates a fulltext field consisting of text extracted from the other fields, i.e. those for which nodeScopeIndex is true. This allows a query like //*[jcr:contains(., 'foo')] to search across any indexable field containing foo (see contains function for details).

+

In certain cases it is desirable that nodes where the searched term is present in a specific property are ranked higher (come earlier in the search result) than nodes where the searched term is found in some other property.

+

In such cases it should be possible to boost the text contributed by an individual property. For example, if the title field is boosted more than description, then nodes where the searched term is found in the title field come earlier in the search result.

+

For that to work, ensure that for each such property (one that needs to be preferred) both nodeScopeIndex and analyzed are set to true. In addition, you can specify the boost property to give higher weight to values found in a specific property.

+

Note that even without an explicit boost, just setting nodeScopeIndex and analyzed to true improves the search result due to the way Lucene does scoring. Internally Oak creates separate Lucene fields for those JCR properties and performs a search across all such fields. For more details refer to OAK-3367.

+ +
+
  + indexRules
+    - jcr:primaryType = "nt:unstructured"
+    + app:Asset
+      + properties
+        - jcr:primaryType = "nt:unstructured"
+        + description
+          - nodeScopeIndex = true
+          - analyzed = true
+          - name = "jcr:content/metadata/jcr:description"
+        + title
+          - analyzed = true
+          - nodeScopeIndex = true
+          - name = "jcr:content/metadata/jcr:title"
+          - boost = 2.0
+
+

With the above index config, a search like

+ +
+
SELECT
+  *
+FROM [app:Asset] 
+WHERE 
+  CONTAINS(., 'Batman')
+
+

would return those nodes (of type app:Asset) first where Batman is found in jcr:title, while nodes where the search text is found in another field, such as aggregated content, would come later.
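
To make the effect of a per-property boost concrete, the following is a minimal sketch of a toy scoring model. It is not Oak or Lucene code and does not reproduce Lucene's actual scoring formula: the class BoostSketch, its methods, the node names assetA and assetB, and the sample texts are all hypothetical. It assumes a score equal to the sum, over fields, of boost times term frequency, with a default boost of 1.0, and ignores other factors Lucene takes into account (such as field length).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of per-field boosting; NOT Lucene's actual scoring formula. */
public class BoostSketch {

    /** Counts case-insensitive whole-word occurrences of term in text. */
    public static int termFrequency(String text, String term) {
        int count = 0;
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.equals(term.toLowerCase())) {
                count++;
            }
        }
        return count;
    }

    /** Assumed score: sum over fields of (boost * term frequency); boost defaults to 1.0. */
    public static double score(Map<String, String> fields, Map<String, Double> boosts, String term) {
        double total = 0.0;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            double boost = boosts.getOrDefault(field.getKey(), 1.0);
            total += boost * termFrequency(field.getValue(), term);
        }
        return total;
    }

    /** Returns node names ordered by descending score. */
    public static List<String> rank(Map<String, Map<String, String>> nodes,
                                    Map<String, Double> boosts, String term) {
        List<String> names = new ArrayList<>(nodes.keySet());
        names.sort(Comparator.comparingDouble(
                (String name) -> score(nodes.get(name), boosts, term)).reversed());
        return names;
    }

    /** Hypothetical sample data: assetB matches in the description, assetA in the title. */
    public static Map<String, Map<String, String>> sampleNodes() {
        Map<String, String> assetB = new LinkedHashMap<>();
        assetB.put("jcr:title", "Gotham stories");
        assetB.put("jcr:description", "An article that mentions Batman once");

        Map<String, String> assetA = new LinkedHashMap<>();
        assetA.put("jcr:title", "Batman Begins");
        assetA.put("jcr:description", "A film about a masked vigilante");

        Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        nodes.put("assetB", assetB); // inserted first, so ranking is not just insertion order
        nodes.put("assetA", assetA);
        return nodes;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("jcr:title", 2.0); // mirrors "boost = 2.0" on the title property
        System.out.println(rank(sampleNodes(), boosts, "Batman")); // prints [assetA, assetB]
    }
}
```

In this toy model assetA scores 2.0 (one title match times the boost of 2.0) and assetB scores 1.0 (one description match at the default boost), so the title match ranks first, which is the ordering the index config above is meant to produce.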

LuceneIndexProvider Configuration