lucene-java-user mailing list archives

From Michael Sokolov <soko...@ifactory.com>
Subject Re: new to lucene, non standard index
Date Fri, 06 May 2011 10:33:35 GMT
I believe creating a large number of fields is not a good match for the
underlying architecture. You'd be better off with a large number of
documents and a small number of fields, where the same field occurs in
every document. There is some discussion here:
http://markmail.org/message/hcmt5syca7zdeac6
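
To make the layout concrete, here is a minimal sketch of the "many small documents" shape. It deliberately avoids the Lucene API: plain maps stand in for Lucene Documents, and the field names (docId, token, weight, freq) and the Posting class are assumptions made up for illustration, not anything from the thread. The point is only that every record carries the same small, fixed set of fields, with one record per doc/token pair.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: plain maps stand in for Lucene Document objects, and the
// field names (docId, token, weight, freq) are hypothetical.
public class FlatLayoutSketch {

    /** One entry from the source data: a token seen in a source document. */
    public static final class Posting {
        final String docId;
        final String token;
        final float weight;
        final int freq;

        public Posting(String docId, String token, float weight, int freq) {
            this.docId = docId;
            this.token = token;
            this.weight = weight;
            this.freq = freq;
        }
    }

    /** Emit one small record per doc/token pair; every record has the same four fields. */
    public static List<Map<String, String>> toFlatRecords(List<Posting> postings) {
        List<Map<String, String>> records = new ArrayList<>();
        for (Posting p : postings) {
            Map<String, String> record = new LinkedHashMap<>();
            record.put("docId", p.docId);
            record.put("token", p.token);
            record.put("weight", Float.toString(p.weight));
            record.put("freq", Integer.toString(p.freq));
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<Posting> postings = new ArrayList<>();
        postings.add(new Posting("doc1", "foo", 0.5f, 3));
        postings.add(new Posting("doc1", "bar", 1.25f, 1));
        postings.add(new Posting("doc2", "foo", 2.0f, 7));

        // Three doc/token pairs become three records with an identical field set.
        for (Map<String, String> record : toFlatRecords(postings)) {
            System.out.println(record);
        }
    }
}
```

In a real index each record would presumably become one Lucene Document added through an IndexWriter; how the weight and frequency values are stored or indexed is left open here.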

-Mike

On 5/5/2011 7:00 PM, Chris Schilling wrote:
> Hey Mike,
>
> My only concern is that I am replacing a large number of fields inside of a Document
> with a (very large ~50e6) number of Documents.  Will I not run into the same memory issues?
> Or do I create only one doc object and reuse it?  With so many Doc/Token pairs, won't searching
> the index take a lot more time?
>
> Thanks for your help,
> Chris
>
> On May 5, 2011, at 3:11 PM, Mike Sokolov wrote:
>
>> I think the solution I gave you will work.  The only problem is if a token appears
>> twice in the same doc:
>>
>> doc1 has foo with two different sets of weights and frequencies...
>>
>> but I think you're saying that doesn't happen
>>
>> On 05/05/2011 06:09 PM, Chris Schilling wrote:
>>> Hey Mike,

