lucene-java-user mailing list archives

From <>
Subject RE: Error: there are more terms than documents...
Date Thu, 23 Apr 2009 19:39:56 GMT
Sorry for that terrible formatting.  Let me try again.

I'm getting a strange error when I make a Lucene (2.2.0) query:

java.lang.RuntimeException: there are more terms than documents in field
"objectId", but it's impossible to sort on tokenized fields
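
To make the message concrete, here is a minimal sketch of the kind of check I believe produces it: the number of distinct terms in the sort field is compared with the number of documents. This is a simplified model in plain Java, not Lucene's source. The class SortFieldCheck, its methods, the second object ID, and the split on '.' are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SortFieldCheck {

    // Simplified model (NOT Lucene's code): each inner list holds the terms
    // indexed for one document in the sort field.
    static int countDistinctTerms(List<List<String>> termsPerDoc) {
        Set<String> distinct = new TreeSet<String>();
        for (List<String> terms : termsPerDoc) {
            distinct.addAll(terms);
        }
        return distinct.size();
    }

    // Throws when the field holds more distinct terms than there are documents,
    // reusing the wording of the error quoted above.
    static void checkSortable(String field, List<List<String>> termsPerDoc) {
        int docs = termsPerDoc.size();
        int terms = countDistinctTerms(termsPerDoc);
        if (terms > docs) {
            throw new RuntimeException("there are more terms than documents in field \""
                    + field + "\", but it's impossible to sort on tokenized fields");
        }
    }

    public static void main(String[] args) {
        // Untokenized: exactly one term per document (second ID is hypothetical).
        List<List<String>> untokenized = Arrays.asList(
                Arrays.asList("DF.SES.AA.derek.Public_01"),
                Arrays.asList("DF.SES.AA.derek.Public_02"));
        checkSortable("objectId", untokenized);
        System.out.println("untokenized: ok");

        // Tokenized: the same two values split on '.' (assumed analyzer behaviour).
        List<List<String>> tokenized = Arrays.asList(
                Arrays.asList("df", "ses", "aa", "derek", "public_01"),
                Arrays.asList("df", "ses", "aa", "derek", "public_02"));
        try {
            checkSortable("objectId", tokenized);
        } catch (RuntimeException e) {
            System.out.println("tokenized: " + e.getMessage());
        }
    }
}
```

In this model the untokenized field passes because it never has more than one term per document, while the tokenized one fails because two documents produce six distinct terms.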

The strange thing is that I've read the javadoc for the Sort object
where it says:

The fields used to determine sort order must be carefully chosen.
Documents must contain a single term in such a field, and the value of
the term should indicate the document's relative position in a given
sort order. The field must be indexed, but should not be tokenized, and
does not need to be stored (unless you happen to want it back with the
rest of your document data). In other words: 

document.add (new Field ("byNumber", Integer.toString(x),
Field.Store.NO, Field.Index.UN_TOKENIZED));

Therefore when I create my "objectId" field in my document I use the

doc.add(new Field("objectId", s.getObjectId(), Field.Store.NO,

Note: s.getObjectId() returns a String.

After the index is created and I print out a typical document (using the
Document.toString() method) I get this:

<id:1146513> stored/uncompressed,indexed
<_hibernate_class:com.mycompany.metadb.orm.Series> indexed
<RestrictionLevel:1> indexed,
b4e> indexed,
mycompany.metadbsync.index.CharacteristicTokenStream@daa825> indexed
<objectId:DF.SES.AA.derek.Public_01> indexed
<Name:Public 01> indexed
<UserID:derek> indexed
<Data Class:Defined Formulas> indexed
<Location:AA> indexed
<Client:SES> indexed
<DIM1:DF> indexed
<DIM2:SES> indexed
<DIM3:AA> indexed
<DIM4:derek> indexed
<DIM5:Public_01> indexed

So it looks like it got created correctly.

For what it's worth the query call looks like this:

Hits hits =, new Sort("objectId"));

The actual query is a Boolean query with lots of TermQuery clauses and
sub-clauses.  The term queries run against several of the other fields
shown above, including some of the tokenized fields.

Any help appreciated.


Bill Chesky

PS. Just as an aside, what does it mean for a field to be stored or not
stored?  Looking at the output above, the 'id' field is stored and the
'objectId' field is not, yet both are displayed by the
Document.toString() method.  So even the objectId field got "stored", at
least in the sense that I understand the term (otherwise how did it get
displayed?).  I'm obviously missing something about what "stored" means
in the Lucene context.

