lucene-java-user mailing list archives

From "Alex McManus" <alex.mcma...@fdgroup.com>
Subject RE: multivalue fields
Date Mon, 17 May 2004 09:34:45 GMT

> Maybe your fields are too long so that only part of it gets indexed (look
> at IndexWriter.maxFieldLength).

This is interesting. I've had a look at the JavaDoc and I think I
understand. As I read it, the maximum field length describes the maximum
number of unique terms, not the maximum number of words/tokens. Therefore,
even if I have a 4 GB field, I could set maxFieldLength to, say, 100k,
which should comfortably cover the number of unique words, rather than the
800 million that would be needed to cover every token.

Is this correct? 
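
To make the distinction I'm asking about concrete, here is a minimal sketch
in plain Java. It is not Lucene code and makes no claim about which of the
two numbers maxFieldLength actually limits. The class and method names are
made up for illustration, and a crude lowercase-and-split-on-whitespace step
stands in for a real analyzer.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: contrasts the total number of
// tokens in a text with the number of unique terms in it.
public class TokenVsTermCount {

    // Crude stand-in for an analyzer (assumption): lowercase the text and
    // split it on runs of whitespace.
    public static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Every occurrence counts, including repeats.
    public static int countTokens(String text) {
        return tokenize(text).length;
    }

    // Each distinct term counts once, however often it occurs.
    public static int countUniqueTerms(String text) {
        Set<String> unique = new HashSet<>();
        for (String token : tokenize(text)) {
            unique.add(token);
        }
        return unique.size();
    }

    public static void main(String[] args) {
        String text = "the cat sat on the mat the end";
        System.out.println("tokens: " + countTokens(text));            // prints "tokens: 8"
        System.out.println("unique terms: " + countUniqueTerms(text)); // prints "unique terms: 6"
    }
}
```

If the limit applies to the first number, a long field would be cut off
after that many words; if it applies to the second, only the vocabulary
size would matter.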

Is 100k a worrying maxFieldLength, in terms of how much memory this would
consume?

Does Lucene issue a warning if this limit is exceeded during indexing? It
would be quite worrying if it were silently discarding terms.

Thanks in advance,

Alex.

