lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: TermFrequencies vector limits?
Date Mon, 21 Nov 2005 15:18:58 GMT

On 21 Nov 2005, at 08:37, Michael Curtin wrote:
> That's probably because there is a limit built into Lucene where it  
> ignores any tokens in a field past the first 10,000.  There is a  
> property you can set to increase this limit.  I don't have the  
> source in front of me right now, but if you go into the index  
> subdirectory of the Lucene source and grep for 10000, you should  
> find it.  Let's say for purpose of argument that the name of the  
> property is "maxTokens".  Then you could just do this:
> java -Dorg.apache.lucene.maxTokens=100000 yourapp ...
> To get a higher limit.  Of course, you could also change the Lucene  
> source file and recompile it.  Note that you CANNOT just set the  
> property in your code, in general, as the Lucene class puts it into  
> a static final int, meaning it examines the value of the property  
> (once) at class load time.

Just for the record, that last paragraph is incorrect.  The  
IndexWriter.maxFieldLength variable in Lucene 1.4.3 is controllable  
at runtime, no problem.
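
To illustrate the distinction, here is a minimal sketch. It does not use Lucene at all: SketchWriter, its index() method, and the "sketch.maxFieldLength" property name are hypothetical stand-ins, written only to show a default that is read once at class load time sitting next to a per-instance limit that can be reassigned later. Assuming maxFieldLength is exposed on IndexWriter in 1.4.3 as described above, the real equivalent would be assigning it on your writer instance before adding documents.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxFieldLengthSketch {

    /** Hypothetical stand-in for an index writer; this is NOT Lucene's IndexWriter. */
    public static class SketchWriter {
        // Read once, when the class is loaded. Calling System.setProperty
        // afterwards does not change this value.
        // "sketch.maxFieldLength" is an invented property name.
        public static final int DEFAULT_MAX_FIELD_LENGTH =
            Integer.getInteger("sketch.maxFieldLength", 10000);

        // Per-instance limit, initialised from the default but freely
        // reassignable at runtime.
        public int maxFieldLength = DEFAULT_MAX_FIELD_LENGTH;

        /** Returns the tokens that would be kept; tokens past the limit are ignored. */
        public List<String> index(String text) {
            List<String> kept = new ArrayList<>();
            for (String token : text.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip the empty token produced by blank input
                }
                if (kept.size() >= maxFieldLength) {
                    break; // silently drop everything past the limit
                }
                kept.add(token);
            }
            return kept;
        }
    }

    /** Builds a string of n copies of word separated by single spaces. */
    public static String repeat(String word, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = repeat("tok", 12000);
        SketchWriter writer = new SketchWriter();

        System.out.println("kept with initial limit: " + writer.index(text).size());

        // Raise the limit at runtime; no -D flag and no recompile needed.
        writer.maxFieldLength = 100000;
        System.out.println("kept after raising limit: " + writer.index(text).size());
    }
}
```

The instance field is the part that matters here: only the default is fixed at class load, so each writer can still be given its own limit afterwards.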

