jackrabbit-dev mailing list archives

From James Hang <jh...@bea.com>
Subject Lucene index
Date Wed, 18 Apr 2007 23:23:41 GMT

After spending some time running Jackrabbit in debug mode, I noticed some
peculiar behavior in the Lucene SearchIndex implementation.  

When indexing of a Node occurs via the AbstractIndex.addDocument() method,
the Lucene Document object being indexed seems to contain all the indexed
fields, i.e. all the properties of the node, the extracted fulltext terms,
etc.  

However, during a search operation, in the call to
SearchIndex.executeQuery(), the Document objects returned from the
search contain only some of the indexed fields.  In fact, for all of the
Document objects, only these 5 fields are present:

_:UUID
_:PARENT
_:PROPERTIES[0] "3:versionHistory"
_:PROPERTIES[1] "3:baseVersion"
_:PROPERTIES[2] "3:predecessor"

I know that Jackrabbit only really needs the _:UUID field so that it can
look up the Node, so is it stripping out the other fields at some point? 
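
One possible explanation, offered as an assumption rather than something
confirmed against the Jackrabbit source: Lucene distinguishes between fields
that are indexed (searchable) and fields that are stored (retrievable from a
hit). A Document read back from a search carries only the stored fields, so
nothing would need to be stripped afterwards. The sketch below is a toy model
in plain Java, not the Lucene API; the class name, the "title" field, and the
UUID value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model (NOT the Lucene API): every field is searchable,
// but only fields marked as stored can be read back from a hit.
public class StoredVsIndexedSketch {
    // Inverted index: "field=value" -> ids of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Stored fields per document id; this is all a search hit can return.
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    /** Indexes all fields, keeps only those named in storedNames. */
    public int addDocument(Map<String, String> fields, Set<String> storedNames) {
        int docId = storedFields.size();
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String term = f.getKey() + "=" + f.getValue();
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            if (storedNames.contains(f.getKey())) {
                kept.put(f.getKey(), f.getValue());
            }
        }
        storedFields.add(kept);
        return docId;
    }

    /** Returns the stored fields of every document matching field=value. */
    public List<Map<String, String>> search(String field, String value) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (int docId : postings.getOrDefault(field + "=" + value, Set.of())) {
            hits.add(storedFields.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        StoredVsIndexedSketch index = new StoredVsIndexedSketch();
        Map<String, String> node = new LinkedHashMap<>();
        node.put("_:UUID", "hypothetical-uuid-1"); // stored and indexed
        node.put("title", "hello");                // indexed only
        index.addDocument(node, Set.of("_:UUID"));
        // The query matches on "title", yet the hit exposes only "_:UUID".
        System.out.println(index.search("title", "hello"));
    }
}
```

If this is what is happening, the document passed to addDocument() would show
every field, while a hit from executeQuery() would show only the stored ones,
which matches the behavior described above.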

We've noticed that for large result sets (1000+ nodes), performance
degrades because each Node lookup requires at least one database query.  Since
we are only interested in data contained in the Lucene index, it would be
nice if we could read that data directly from the index and not have to go
through the Jackrabbit PM (persistence manager) at all.

Does anyone know if this is possible?

