jackrabbit-dev mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] Created: (JCR-106) Minimize use of fields in lucene index
Date Wed, 13 Apr 2005 15:01:16 GMT
Minimize use of fields in lucene index
--------------------------------------

         Key: JCR-106
         URL: http://issues.apache.org/jira/browse/JCR-106
     Project: Jackrabbit
        Type: Improvement
  Components: query  
 Environment: svn revision: 161184
    Reporter: Marcel Reutegger
    Priority: Minor


Currently every property name creates a field in the lucene index, bloating the size of the
index because of the norm files created for each field.

When values are indexed as is (not tokenized for fulltext indexing), the property name
may be made part of the term text. That way lucene only has to maintain one field for all
property names. With this approach the search terms are always a combination of property name
and literal value, e.g. instead of using TermQuery(new Term("prop", "foo")) the query must be
TermQuery(new Term("common-field", "prop:foo")). This works for the general comparison / value
comparison operators and also for the like function. The contains function uses the fulltext
index, which uses a different field anyway.
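
The term composition described above could be sketched roughly as follows. This is only
an illustration, not the actual Jackrabbit code: the class name NamedValue, its methods and
the ':' separator (taken from the "prop:foo" example) are assumptions. Since JCR property
names may themselves contain a colon (e.g. "jcr:title"), a real implementation would
probably need a separator that cannot occur in a name.

```java
// Hypothetical helper, not part of Jackrabbit or lucene.
final class NamedValue {

    // Assumed separator, mirroring the "prop:foo" example from the issue text.
    static final char SEPARATOR = ':';

    private NamedValue() {
    }

    // Builds the text of a term in the single common field,
    // e.g. ("prop", "foo") becomes "prop:foo".
    static String termText(String propertyName, String value) {
        return propertyName + SEPARATOR + value;
    }

    // Prefix shared by all terms of one property. A like pattern such as
    // 'fo%' on "prop" would then match terms starting with prefix + "fo".
    static String prefix(String propertyName) {
        return propertyName + String.valueOf(SEPARATOR);
    }
}
```

With such a helper the query term would be built as
new Term("common-field", NamedValue.termText("prop", "foo")).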

Using the property name as part of the indexed term text requires a custom SortComparator
that is aware of the property name.
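
What "aware of the property name" might mean for sorting is sketched below in plain Java,
without any lucene classes. The class PropertyAwareSortKey and the ':' separator are
assumptions made for illustration. The idea is that a comparator reading terms from the
common field has to skip terms of other properties and strip the property name before
comparing values.

```java
import java.util.Optional;

// Hypothetical sketch, not the lucene SortComparator API.
final class PropertyAwareSortKey {

    private final String prefix;

    PropertyAwareSortKey(String propertyName) {
        // Assumed layout of the term text: property name, ':', then the value.
        this.prefix = propertyName + ":";
    }

    // Returns the value part of the term text if the term belongs to this
    // property, or an empty Optional if it belongs to another property.
    Optional<String> sortValue(String termText) {
        if (termText.startsWith(prefix)) {
            return Optional.of(termText.substring(prefix.length()));
        }
        return Optional.empty();
    }
}
```

A custom comparator could then order documents by the extracted value rather than by the
full term text.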

This change will not be backward compatible with earlier indexes created by jackrabbit.

