lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: TermFrequencies vector limits?
Date Mon, 21 Nov 2005 08:04:17 GMT
On Monday 21 November 2005 02:16, marigoldcc@yahoo.com wrote:
> Hi.  I was wondering if anyone else has seen this
> before.  I'm using  lucene 1.4.3 and have indexed
> about 3000 text documents using the statement:
> 
> doc.add(Field.Text("contents", new FileReader(f),
> true));
> 
> When I go and retrieve the term frequency vectors, for
> any document under about 90k, everything looks as
> expected.  However for larger documents (I haven't
> found the exact point, but I know that those over 128k
> qualify) the sum of the term frequencies in the vector
> seems to max out at 10001.  
...

That's correct. The cap comes from IndexWriter.maxFieldLength, which limits
how many terms are indexed per field (10000 by default). See the FAQ entry:
http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71
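
To illustrate why the frequencies stop adding up past a certain document size, here is a minimal, library-free sketch of a per-field term cap. It is not Lucene's code: the class and method names are hypothetical, the default of 10000 is assumed, and the choice to check the limit after counting each token is an assumption made only to reproduce the reported total of 10001.

```java
import java.util.HashMap;
import java.util.Map;

public class MaxFieldLengthSketch {
    // Hypothetical stand-in for the indexer's per-field term limit (assumed default).
    static final int DEFAULT_MAX_FIELD_LENGTH = 10000;

    // Counts term frequencies, stopping once the cap has been exceeded.
    static Map<String, Integer> termFrequencies(String[] tokens, int maxFieldLength) {
        Map<String, Integer> freqs = new HashMap<>();
        int length = 0;
        for (String token : tokens) {
            freqs.merge(token, 1, Integer::sum);
            // Assumption: the limit is tested after the token is counted,
            // so maxFieldLength + 1 tokens are recorded before the loop stops.
            if (++length > maxFieldLength) break;
        }
        return freqs;
    }

    // Sums all frequencies in a term vector.
    static int sum(Map<String, Integer> freqs) {
        int total = 0;
        for (int f : freqs.values()) total += f;
        return total;
    }

    public static void main(String[] args) {
        // A synthetic "large document": 25000 tokens drawn from 500 distinct terms.
        String[] tokens = new String[25000];
        for (int i = 0; i < tokens.length; i++) tokens[i] = "term" + (i % 500);

        System.out.println(sum(termFrequencies(tokens, DEFAULT_MAX_FIELD_LENGTH))); // capped
        System.out.println(sum(termFrequencies(tokens, Integer.MAX_VALUE)));        // uncapped
    }
}
```

In Lucene 1.4.x the limit is, as far as I know, a public field on the writer, so assigning a larger value to writer.maxFieldLength before adding documents should lift the cap. Check the javadoc for your version, since later releases expose it differently.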

Regards,
Paul Elschot



