lucene-java-user mailing list archives

From "Christophe GOGUYER DESSAGNES" <...@arcadsoftware.com>
Subject Re: Size limit for indexing ?
Date Wed, 09 Oct 2002 09:24:25 GMT
Thank you for your help, it solved my problem.

-----
Christophe

----- Original Message -----
From: "Materna, Wolf-Dietrich (empolis B)"
<Wolf-Dietrich.Materna@empolis.com>
To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
Sent: Wednesday, 9 October 2002 10:33
Subject: RE: Size limit for indexing ?


Hello,
> I use Lucene 1.2 and I index a text document whose size is near 500 KB.
> (I use the Field.UnStored method.)
> It seems that only the beginning of this document is indexed!
> If I search for a term that is at the end of this document, I
> don't find it (but
> I do find terms at the beginning).
> So, I split my document into 2 parts and indexed them, and now it
> works fine.
>
> Is there a size limit for indexing a document?
You are right. There is a limit on the number of terms for each field, but
you can change it. Look at org.apache.lucene.index.IndexWriter for
maxFieldLength. The default limit is set to 10000 terms. A 500k document
contains more terms than that, depending on stopwords and the amount of
white space. That is why the end of your document was ignored.
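
To illustrate the effect described above, here is a minimal sketch that does
not use Lucene itself. It assumes a hypothetical helper, indexTerms, that
splits on white space and keeps only the first maxFieldLength terms, which is
a simplification of what a real analyzer does. With the real library you
would raise maxFieldLength on your IndexWriter instance before adding the
document; how exactly it is set may differ between versions, so please check
the Javadoc.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for a per-field term limit; not Lucene code.
class FieldLimitSketch {

    // Hypothetical helper: collects at most maxFieldLength
    // whitespace-separated terms, then silently ignores the rest.
    static Set<String> indexTerms(String text, int maxFieldLength) {
        Set<String> terms = new HashSet<>();
        int count = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty token
            }
            if (count >= maxFieldLength) {
                break; // limit reached: remaining terms are dropped
            }
            terms.add(token.toLowerCase());
            count++;
        }
        return terms;
    }

    // Builds a document of fillerCount filler terms followed by one
    // distinctive term at the very end.
    static String buildDocument(int fillerCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fillerCount; i++) {
            sb.append("filler ");
        }
        sb.append("needle");
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = buildDocument(12000);
        // With a limit of 10000 terms the last term is never seen.
        System.out.println(indexTerms(doc, 10000).contains("needle"));
        // With a higher limit the whole document is indexed.
        System.out.println(indexTerms(doc, 20000).contains("needle"));
    }
}
```

In this sketch the first lookup fails and the second succeeds, which matches
the behaviour reported in the question: terms near the beginning are found,
terms past the limit are not.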
Regards,

--
Wolf-Dietrich Materna
Development

empolis GmbH -  arvato knowledge management
Kekuléstr. 7
12489 Berlin, Germany

phone :  +49-30-6780-6510
fax :    +49-30-6780-6549

<mailto:Wolf-Dietrich.Materna@empolis.com> <http://www.empolis.com>

--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>

