lucene-java-user mailing list archives

From Mike Klaas <mike.kl...@gmail.com>
Subject Re: documents with large numbers of fields
Date Fri, 18 May 2007 21:39:40 GMT
On 18-May-07, at 1:01 PM, charlie w wrote:
> So now I have the idea to invert the field name and value thusly:
> foo=tag     ^2
> bar=tag     ^1.2
> foobar=tag    ^1.8
> and search "foo:tag".
>
> Intuitively, I would expect Lucene to be optimized for searching the values
> of fields, and not really the names of fields.  In a somewhat large index,
> say 10 million documents, will Lucene search performance continue to be
> acceptable if I load up documents with many fields like this?

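Perhaps not.  Storing a field with norms occupies O(N) space, where N is
the number of documents in the index, regardless of the number of
documents with non-zero norms.  That cost is paid once per distinct field
name, so with many fields there might be too much data for the OS to
cache and for Lucene to process efficiently.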
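
To put rough numbers on that, here is a back-of-the-envelope sketch.  It
assumes one norm byte per document for every field that has norms
enabled; the class and method names are made up for illustration and are
not part of Lucene.

```java
public class NormsCost {
    // Assumption: each field with norms costs one byte per document,
    // whether or not that document actually contains the field.
    static long normsBytes(long numDocs, long fieldsWithNorms) {
        // multiplyExact throws instead of silently overflowing
        return Math.multiplyExact(numDocs, fieldsWithNorms);
    }

    public static void main(String[] args) {
        long docs = 10_000_000L; // index size from the question
        for (long fields : new long[] {10, 100, 1_000}) {
            long bytes = normsBytes(docs, fields);
            System.out.println(fields + " fields with norms -> "
                    + (bytes / 1_000_000L) + " MB");
        }
    }
}
```

Under that assumption, the cost grows linearly with the number of distinct
field names, which is why inverting field names and values gets expensive
on a 10-million-document index.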

> Is there an upper limit on the number of fields comprising a document, and
> if so what is it?

There is not.  Fields are relatively costless if omitNorms=true.

> Or, is there some way to make my original approach work after all?

The experimental Payloads feature allows an optional boost to be stored
along with each term position.  This is its intended use case.
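
As a rough sketch of the idea (not the Lucene API): a payload is just a
small byte array attached to a term position, so a per-term boost can be
packed into four bytes and unpacked at scoring time.  The class and
method names below are hypothetical, and I am assuming the original
approach was a single field whose terms (foo, bar, foobar) each carry
their own boost.

```java
import java.nio.ByteBuffer;

public class BoostPayload {
    // Hypothetical helper: pack a float boost into a 4-byte payload.
    static byte[] encodeBoost(float boost) {
        return ByteBuffer.allocate(4).putFloat(boost).array();
    }

    // Hypothetical helper: read the boost back out of the payload bytes.
    static float decodeBoost(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    public static void main(String[] args) {
        // Terms and boosts taken from the example in the question.
        String[] terms = {"foo", "bar", "foobar"};
        float[] boosts = {2.0f, 1.2f, 1.8f};
        for (int i = 0; i < terms.length; i++) {
            byte[] payload = encodeBoost(boosts[i]);
            System.out.println(terms[i] + " -> " + payload.length
                    + " payload bytes, boost " + decodeBoost(payload));
        }
    }
}
```

This way the index keeps one field instead of one field per tag, and the
boost travels with each term occurrence rather than with the field's
norms.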

-Mike
