lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: TermFrequencies vector limits?
Date Mon, 21 Nov 2005 18:31:11 GMT
On Monday 21 November 2005 14:28, marigoldcc@yahoo.com wrote:
> Just to make sure that I understand this correctly,
> the docs say: 
> 
> "By default, no more than 10,000 terms will be
> indexed for a field."
> 
> Given your note, then the docs do not mean that no
> more than 10,000 terms will be indexed, but that some
> smaller number of terms will be indexed and only the
> first 10,000 occurrences will be tallied.

I'm sorry, but I don't know a good meaning for tally here.

Kind regards,
Paul Elschot

> 
> Is that correct?
> 
> Thanks
> -MG
> 
> ------ Original Message ------
> Received: Mon, 21 Nov 2005 03:04:42 AM EST
> From: Paul Elschot <paul.elschot@xs4all.nl>
> To: java-user@lucene.apache.org
> Subject: Re: TermFrequencies vector limits?
> 
> > On Monday 21 November 2005 02:16, marigoldcc@yahoo.com wrote:
> > > Hi.  I was wondering if anyone else has seen this
> > > before.  I'm using Lucene 1.4.3 and have indexed
> > > about 3000 text documents using the statement:
> > > 
> > > doc.add(Field.Text("contents", new FileReader(f),
> > > true));
> > > 
> > > When I go and retrieve the term frequency vectors, for
> > > any document under about 90k, everything looks as
> > > expected.  However for larger documents (I haven't
> > > found the exact point, but I know that those over 128k
> > > qualify) the sum of the term frequencies in the vector
> > > seems to max out at 10001.
> > ..
> > 
> > That's correct, have a look here for IndexWriter.maxFieldLength:
> > http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-3558e5121806fb4fce80fc022d889484a9248b71
> > 
> > Regards,
> > Paul Elschot
> > 
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
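
To make the distinction in the question concrete, here is a minimal sketch in plain Java (no Lucene on the classpath). It illustrates a cap on the number of tokens consumed per field, as opposed to a cap on the number of distinct terms. The class TermTallySketch and its methods tally and total are hypothetical names invented for this illustration, not Lucene APIs, and the whitespace tokenizer is a stand-in for a real analyzer. Checking the cap after the token is recorded is an assumption chosen because it reproduces the observed sum of 10001; the thread itself does not confirm how Lucene 1.4.3 implements the limit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TermTallySketch {

    // Hypothetical helper, not a Lucene API: counts term frequencies for one
    // field while stopping once a maxFieldLength-style token cap is exceeded.
    static Map<String, Integer> tally(String text, int maxFieldLength) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        int length = 0;
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            freqs.merge(token.toLowerCase(), 1, Integer::sum);
            // Assumption: the cap is tested after the token is recorded,
            // so maxFieldLength + 1 tokens are counted before stopping.
            if (++length > maxFieldLength) {
                break;
            }
        }
        return freqs;
    }

    // Sum of all term frequencies, i.e. the number of tokens that were counted.
    static int total(Map<String, Integer> freqs) {
        int sum = 0;
        for (int f : freqs.values()) {
            sum += f;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 50,000 tokens but only two distinct terms.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            sb.append(i % 2 == 0 ? "alpha " : "beta ");
        }
        Map<String, Integer> freqs = tally(sb.toString(), 10000);
        System.out.println(freqs + " total=" + total(freqs));
    }
}
```

In this sketch the number of distinct terms stays at two while the summed frequencies stop at 10001, so the limit acts on token occurrences rather than on unique terms. If Lucene behaves the same way, raising IndexWriter.maxFieldLength before adding documents should lift the ceiling for large files.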



