lucene-java-user mailing list archives

From "Cedric Ho" <cedric...@gmail.com>
Subject Re: large term vectors
Date Mon, 11 Feb 2008 02:18:47 GMT
Is it a single index? My index is also in the 200 GB range, but I have
never managed to get acceptable performance (in both searching and
updating) from a single index larger than 20 GB, so I split my index
into chunks of less than 10 GB each.
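
To illustrate the chunked layout, here is a minimal sketch of fanning a
query out to several small chunks and merging the hits. It uses plain Java
only; ChunkedSearch, ChunkSearcher, Hit, and searchAll are hypothetical
names invented for this example and are not Lucene classes. A real version
would wrap one searcher per chunk behind the ChunkSearcher interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: query several small index chunks and merge the hits. */
class ChunkedSearch {

    /** A scored hit. chunkId and docId are illustrative fields, not Lucene types. */
    static final class Hit {
        final int chunkId;
        final int docId;
        final float score;

        Hit(int chunkId, int docId, float score) {
            this.chunkId = chunkId;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Stand-in for "search one chunk and return its best topK hits". */
    interface ChunkSearcher {
        List<Hit> search(String query, int topK);
    }

    /** Ask every chunk for its own top K, then keep the global top K by score. */
    static List<Hit> searchAll(List<ChunkSearcher> chunks, String query, int topK) {
        List<Hit> merged = new ArrayList<>();
        for (ChunkSearcher chunk : chunks) {
            merged.addAll(chunk.search(query, topK));
        }
        // Highest score first.
        merged.sort((a, b) -> Float.compare(b.score, a.score));
        return new ArrayList<>(merged.subList(0, Math.min(topK, merged.size())));
    }
}
```

One caveat with this kind of merge: scores from separately built chunks may
not be directly comparable, because term statistics differ from chunk to
chunk, so the merged ranking is only approximate unless that is accounted
for.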

I am curious as to how you manage such a single large index.

Cedric



On Feb 8, 2008 11:51 PM,  <marc.dumontier@thomson.com> wrote:
> Hi,
>
>
>
> I have a large index which is around 275GB. As I search different parts
> of the index, the memory footprint grows with large byte arrays being
> stored. They never seem to get unloaded or GC'ed. Is there any way to
> control this behavior so that I can periodically unload cached
> information?
>
>
>
> The nature of the data being indexed doesn't allow me to reduce the
> number of terms per field, although I might be able to reduce the number
> of overall fields (I have some that aren't currently being searched).
>
>
>
> I've just begun investigating and profiling the problem, so I don't have
> a lot of details at this time. Any support would be extremely welcome.
>
>
>
> Thanks,
>
>
>
> Marc Dumontier
> Manager, Software Development
> Thomson Scientific (Canada)
> 1 Yonge Street, Suite 1801
> Toronto, Ontario M5E 1W7
>
>
>
> Direct +1 416 214 3448
> Mobile +1 416 454 3147
