lucene-java-user mailing list archives

From "Materna, Wolf-Dietrich (empolis B)" <Wolf-Dietrich.Mate...@empolis.com>
Subject RE: Size limit for indexing ?
Date Wed, 09 Oct 2002 08:33:35 GMT
Hello,
> I use Lucene 1.2 and I index a text document whose size is close to 500 KB
> (I use the Field.UnStored method).
> It seems that only the beginning of this document is indexed!
> If I search for a term that is at the end of this document, I don't find it
> (but I do find terms at the beginning).
> So I split my document into 2 parts and indexed them, and now it works fine.
>
> Is there a size limit for indexing a document?
You are right. There is a limit on the number of terms indexed for each
field, but you can change it. Look at org.apache.lucene.index.IndexWriter
for maxFieldLength. The default limit is 10000 terms. A 500 KB document can
contain more terms than that, depending on stopwords and the amount of
whitespace. That is why the end of your document was ignored.
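
To illustrate the effect, here is a minimal sketch in plain Java that does not
call Lucene at all. It only simulates a per-field term limit with a simple
whitespace tokenizer, which is an assumption and not what a real analyzer
does. The class and method names (TermLimitSketch, indexTerms,
sampleDocument) are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class TermLimitSketch {

    // Simulates a per-field term limit: only the first maxFieldLength
    // whitespace-separated terms are kept, the rest are silently dropped.
    static Set<String> indexTerms(String text, int maxFieldLength) {
        Set<String> indexed = new HashSet<>();
        int count = 0;
        for (String term : text.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // empty input yields one empty token
            }
            if (count >= maxFieldLength) {
                break; // limit reached, the tail of the document is ignored
            }
            indexed.add(term.toLowerCase());
            count++;
        }
        return indexed;
    }

    // Builds a document of the form "start filler ... filler end".
    static String sampleDocument(int fillerTerms) {
        StringBuilder sb = new StringBuilder("start ");
        for (int i = 0; i < fillerTerms; i++) {
            sb.append("filler ");
        }
        sb.append("end");
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = sampleDocument(20000); // 20002 terms in total

        Set<String> withDefault = indexTerms(doc, 10000);
        Set<String> withRaised = indexTerms(doc, 100000);

        System.out.println("default limit finds 'start': " + withDefault.contains("start"));
        System.out.println("default limit finds 'end':   " + withDefault.contains("end"));
        System.out.println("raised limit finds 'end':    " + withRaised.contains("end"));

        // In Lucene itself the corresponding change would be made on the
        // IndexWriter before adding the document. Assumption: in 1.2 the
        // limit is a public int field, e.g. writer.maxFieldLength = 100000;
    }
}
```

With the limit at 10000 the last term is never indexed, which matches the
behaviour described in the question. Raising the limit costs more memory
during indexing, so it is probably best set only as high as the largest
documents require.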
Regards,

-- 
Wolf-Dietrich Materna
Development
 
empolis GmbH -  arvato knowledge management 
Kekuléstr. 7
12489 Berlin, Germany
 
phone :  +49-30-6780-6510
fax :    +49-30-6780-6549
 
<mailto:Wolf-Dietrich.Materna@empolis.com> <http://www.empolis.com>

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

