lucene-java-user mailing list archives

From <>
Subject RE: Error: there are more terms than documents...
Date Thu, 23 Apr 2009 20:31:04 GMT
Doron, thanks for the reply.

> Is it possible that, for at least one document, multiple "objectId"
> were created?
> This would also create this problem.

I read that online as well.  I don't think so.  We do have an update
process that updates the index.  During the update process we have the
following:

// create new doc object ... objectId will always be the same as before,
// but other fields may change
Document doc = getDocument(s);  
// replace old doc for objectId w/ new
indexWriter.updateDocument(new Term("objectId", s.getObjectId()), doc);

However, the getDocument() method is the same method that we use to
create a brand new document when building the index from scratch.  And
I'm sure we only create the "objectId" field once in that method.  

Unless maybe I'm misunderstanding something about the
IndexWriter.updateDocument() method.  I thought it would delete all
documents that matched the Term passed and add a new one.
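
To make that reading of updateDocument() concrete, here is a minimal
sketch of delete-then-add semantics.  It is a toy in-memory model, not
Lucene code: the class UpdateByTermSketch and all of its methods are
hypothetical names invented for illustration, and each "document" is just
a map from field name to one exact term value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model, NOT Lucene: each "document" maps a field name to one exact term value.
class UpdateByTermSketch {
    final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    // Delete every document whose field holds exactly this term text, then add the new one.
    void updateDocument(String field, String termText, Map<String, String> doc) {
        docs.removeIf(d -> termText.equals(d.get(field)));
        addDocument(doc);
    }

    // Count the documents whose field holds exactly this term text.
    int countMatching(String field, String termText) {
        int n = 0;
        for (Map<String, String> d : docs) {
            if (termText.equals(d.get(field))) {
                n++;
            }
        }
        return n;
    }

    // Convenience builder for a two-field document (hypothetical field names).
    static Map<String, String> doc(String objectId, String title) {
        Map<String, String> d = new HashMap<>();
        d.put("objectId", objectId);
        d.put("title", title);
        return d;
    }
}
```

Under this model, updating with a term that matches leaves exactly one
document for that objectId, no matter how many matched before the call.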

Unless maybe there is an issue with my Term argument passed to
updateDocument and it's really not matching the way I think it is and so
we end up with two different documents with the same value in the
"objectId" field.  Could this situation cause the exception?
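
The mismatch scenario can be sketched the same way.  This is again a toy
model rather than Lucene code, and it rests on a stated assumption: the
index holds a transformed form of the id (here a hypothetical analyze()
step that lowercases it), while the term used for deletion is compared as
raw text.  TermMismatchSketch and its methods are made-up names.  The
duplicates() helper shows one way to check whether any id ended up on
more than one document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model, NOT Lucene: shows how a term mismatch on update leaves duplicates behind.
class TermMismatchSketch {
    // One entry per document: the term text held in the index for its "objectId".
    final List<String> indexedIds = new ArrayList<>();

    // Hypothetical stand-in for analysis: the index keeps a lowercased form.
    static String analyze(String raw) {
        return raw.toLowerCase(Locale.ROOT);
    }

    void add(String rawId) {
        indexedIds.add(analyze(rawId));
    }

    // Deletes by exact term text (the term itself is not analyzed), then adds.
    void update(String termText, String rawId) {
        indexedIds.removeIf(termText::equals);
        add(rawId);
    }

    // Report every indexed id that occurs on more than one document.
    Map<String, Integer> duplicates() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String id : indexedIds) {
            counts.merge(id, 1, Integer::sum);
        }
        counts.values().removeIf(c -> c < 2);
        return counts;
    }
}
```

In this model, add("ABC-1") followed by update("ABC-1", "ABC-1") deletes
nothing, because the indexed text is "abc-1", so two documents share the
id afterwards.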

> The printed document object is the same document object that was
> for indexing. But when a document is read from the index (via
> API)
> it will only contain the stored fields. For instance, assume that at
> search time you would like to get the URL of a result document and display it
> to the user.
> For this
> you can at indexing time add the URL to a stored field.
> Doron

Ah I think I understand.  I was printing those docs out as I stored
them.  That makes total sense now.  Thanks for the help.
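
For the record, the stored-fields point can be sketched as below.  This
is a toy model, not Lucene code: StoredFieldsSketch, its Field class and
the boolean "stored" flag are hypothetical stand-ins for whatever store
option the real field type offers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model, NOT Lucene: a document read back from the index keeps only stored fields.
class StoredFieldsSketch {
    static final class Field {
        final String name;
        final String value;
        final boolean stored; // hypothetical flag standing in for a "store this value" option

        Field(String name, String value, boolean stored) {
            this.name = name;
            this.value = value;
            this.stored = stored;
        }
    }

    private final List<List<Field>> docs = new ArrayList<>();

    // Adds a document and returns its id in this toy index.
    int add(List<Field> doc) {
        docs.add(new ArrayList<>(doc));
        return docs.size() - 1;
    }

    // Rebuild the document as a reader would see it: stored fields only.
    Map<String, String> read(int docId) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Field f : docs.get(docId)) {
            if (f.stored) {
                out.put(f.name, f.value);
            }
        }
        return out;
    }
}
```

A document added with a stored "url" field and a non-stored "body" field
comes back from read() with only "url", which is why printing the object
at indexing time shows more than a later read does.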


