lucene-java-user mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: large term vectors
Date Mon, 11 Feb 2008 15:05:03 GMT

http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/document/Field.Index.html#NO_NORMS

?
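For scale: the Javadoc linked above says norms take up one byte per indexed field for every document in the index, and that NO_NORMS also indexes the value without an Analyzer and disables index-time boosting and length normalization. A back-of-the-envelope sketch of what that means for heap use follows. The class name, method name, and the document and field counts are all made up for illustration; nothing here comes from Marc's actual index.

```java
// Rough estimate only: assumes one norm byte per document for every
// indexed field that keeps norms, as the NO_NORMS Javadoc describes.
final class NormsEstimate {

    /** Bytes of heap used once the norms of every such field have been loaded. */
    static long normsBytes(long maxDoc, int fieldsWithNorms) {
        if (maxDoc < 0 || fieldsWithNorms < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // multiplyExact throws on overflow instead of silently wrapping
        return Math.multiplyExact(maxDoc, (long) fieldsWithNorms);
    }

    public static void main(String[] args) {
        long maxDoc = 50_000_000L; // hypothetical document count
        int fields = 20;           // hypothetical number of indexed fields with norms
        long bytes = normsBytes(maxDoc, fields);
        System.out.printf("%d docs x %d fields = %d MB of norms%n",
                maxDoc, fields, bytes / (1024 * 1024));
    }
}
```

If the norms are loaded per field the first time a field is searched, the total would creep up toward this figure as more fields get queried, which would match the on-demand growth described below. Fields indexed with NO_NORMS, and fields dropped from the index altogether, would not count toward it.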


On 11 Feb 2008, at 15:55, <marc.dumontier@thomson.com> wrote:

> Hi Grant,
>
> Lucene 2.2.0
>
> I'm not actually explicitly storing term vectors. It seems the huge
> number of byte arrays is actually coming from SegmentReader.norms.
> Maybe that cache grows constantly, as I read somewhere that it's
> loaded on demand. I'm not using any field or document boosting. Is
> there some way to optimize around this?
>
> Marc
>
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org]
> Sent: Monday, February 11, 2008 7:46 AM
> To: java-user@lucene.apache.org
> Subject: Re: large term vectors
>
> Hi Marc,
>
> Can you give more info about what your field properties are?  Your
> subject line implies you are storing term vectors, is that the case?
>
> Also, what version of Lucene are you using?
>
> Cheers,
> Grant
>
> On Feb 8, 2008, at 10:51 AM, <marc.dumontier@thomson.com> wrote:
>
>> Hi,
>>
>>
>>
>> I have a large index, around 275GB. As I search different parts of
>> the index, the memory footprint grows as large byte arrays are
>> stored. They never seem to get unloaded or GC'ed. Is there any way to
>> control this behavior so that I can periodically unload cached
>> information?
>>
>>
>>
>> The nature of the data being indexed doesn't allow me to reduce the
>> number of terms per field, although I might be able to reduce the
>> overall number of fields (some of them aren't currently searched).
>>
>>
>>
>> I've just begun investigating and profiling the problem, so I don't
>> have a lot of details at this time. Any support would be extremely
>> welcome.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Marc Dumontier
>> Manager, Software Development
>> Thomson Scientific (Canada)
>> 1 Yonge Street, Suite 1801
>> Toronto, Ontario M5E 1W7
>>
>>
>>
>> Direct +1 416 214 3448
>> Mobile +1 416 454 3147
>>
>>
>>
>
> --------------------------
> Grant Ingersoll
> http://lucene.grantingersoll.com
> http://www.lucenebootcamp.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>



