lucene-java-user mailing list archives

From "Materna, Wolf-Dietrich (empolis B)"
Subject RE: Size limit for indexing ?
Date Wed, 09 Oct 2002 08:33:35 GMT
> I use Lucene 1.2 and I index a text document whose size is about 500 KB.
> (I use the Field.UnStored method.)
> It seems that only the beginning of this document is indexed!
> If I search for a term that is at the end of this document, I don't find it
> (but I do find terms at the beginning).
> So I split my document into 2 parts and indexed them, and now it works fine.
> Is there a size limit for indexing a document?

You are right. There is a limit on the number of terms indexed for each field,
but you can change it. Look at maxFieldLength in
org.apache.lucene.index.IndexWriter. The default limit is 10000 terms. A 500 KB
document can easily contain more terms than that, depending on stopwords and the
amount of white space. That is why the end of your document was ignored.
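
To illustrate the effect, here is a minimal sketch that does not use Lucene at all. The class FieldLimitSketch and its method indexedTerms are hypothetical names invented for this example; they only mimic a per-field term limit by keeping the first N whitespace-separated tokens. With the real library, the fix would be to raise the limit on your writer before adding the document. As far as I know, in Lucene 1.2 that is a plain public int field, so it would look something like the assignment shown in the comment, but please check the IndexWriter source for your version.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an indexer with a per-field term limit.
// This is NOT Lucene code; it only mimics the truncation behaviour.
//
// Assumed real-world equivalent (Lucene 1.2, verify against your version):
//     writer.maxFieldLength = 100000;   // default is 10000
class FieldLimitSketch {

    // Collects at most maxFieldLength whitespace-separated terms from text.
    // Every term past the limit is silently dropped, like the tail of a
    // large document when the limit is too low.
    static Set<String> indexedTerms(String text, int maxFieldLength) {
        Set<String> terms = new HashSet<>();
        int count = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty token
            }
            if (count >= maxFieldLength) {
                break; // limit reached: the rest of the text is ignored
            }
            terms.add(token.toLowerCase());
            count++;
        }
        return terms;
    }

    public static void main(String[] args) {
        // Build a document with 20002 terms: "first", 20000 fillers, "last".
        StringBuilder doc = new StringBuilder("first");
        for (int i = 0; i < 20000; i++) {
            doc.append(" filler").append(i);
        }
        doc.append(" last");

        Set<String> limited = indexedTerms(doc.toString(), 10000);
        Set<String> raised = indexedTerms(doc.toString(), 100000);

        System.out.println(limited.contains("first")); // true
        System.out.println(limited.contains("last"));  // false
        System.out.println(raised.contains("last"));   // true
    }
}
```

With the limit at 10000, the term at the end of the text is never seen, which matches what you observed. Splitting the document worked because each half stayed under the limit.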

Wolf-Dietrich Materna
empolis GmbH -  arvato knowledge management 
Kekuléstr. 7
12489 Berlin, Germany
phone :  +49-30-6780-6510
fax :    +49-30-6780-6549