lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: high memory usage by indexreader
Date Thu, 21 Mar 2013 10:43:54 GMT
That number of docs is far more than I've ever worked with but I'm
still surprised it takes 4 minutes to initialize an index reader.

What exactly do you mean by initialization?  Show us the code that
takes 4 minutes.

What version of lucene?  What OS?  What disks?
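
A minimal sketch of how "initialization" could be broken down before posting code, so the slow step is visible. Everything here is an assumption for illustration: the StepTimer class, the step labels, and the placeholder lambdas are hypothetical and stand in for the real Lucene calls, which the thread does not show (on Lucene 4.x the reader step would presumably be a DirectoryReader.open call).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: records wall-clock time per named startup step. */
final class StepTimer {

    /** A step that may throw IOException, as index-opening calls typically do. */
    interface IoStep<T> {
        T run() throws IOException;
    }

    private final Map<String, Long> millis = new LinkedHashMap<>();

    /** Runs the step, stores its elapsed time under the label, returns its result. */
    <T> T timed(String label, IoStep<T> step) {
        long start = System.nanoTime();
        try {
            return step.run();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Recorded even when the step fails, so a slow failure is still visible.
            millis.put(label, (System.nanoTime() - start) / 1_000_000L);
        }
    }

    /** Elapsed milliseconds per label, in the order the steps ran. */
    Map<String, Long> millis() {
        return Collections.unmodifiableMap(millis);
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Placeholders only: swap each lambda for the real call being measured,
        // e.g. opening the directory, opening the reader, and running the
        // first query.
        String dir = timer.timed("open directory", () -> "directory placeholder");
        String reader = timer.timed("open reader", () -> dir + " -> reader placeholder");
        timer.timed("first query", () -> reader.length());
        timer.millis().forEach((step, ms) -> System.out.println(step + ": " + ms + " ms"));
    }
}
```

Whichever label dominates the output is the part worth posting to the list, together with the Lucene version, OS, and disk details.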


--
Ian.


On Wed, Mar 20, 2013 at 6:21 PM, ash nix <nixdash@gmail.com> wrote:
> Thanks Ian.
>
> Number of documents in index is 381,153,828.
> The data set size is 1.9TB.
> The index for this dataset is 290G. It is a single index.
> The following fields are indexed for each document.
>
> 1. Document id : It is a StoredField and is generally around 128 chars or more.
> 2. Text field : It is a TextField and not stored.
> 3. Title : It is a TextField and not stored.
> 4. Anchor : It is a TextField and not stored.
> 5. Timestamp : DoubleDocValue field and not stored. Actually this
> should be a DoubleField and I need to fix it.
>
> Initialization of the index reader at the start of search takes approximately 4 min.
> After initialization, I execute a series of Boolean AND queries
> of 2-3 terms each. Each search result is dumped, with some information on
> score and doc id, to an output file.
>
> The resident size (RES) of the process is 1.7 Gigs.
> The total virtual memory (VIRT) is 307 Gigs.
>
> Do you think this is okay?
> Do you think I should use Solr instead of using lucene core?
>
> I have timestamps for the documents, so I can split them into multiple
> indexes ordered chronologically.
>
> Thanks,
> Ashwin
>
> On Wed, Mar 20, 2013 at 1:43 PM, Ian Lea <ian.lea@gmail.com> wrote:
>> Searching doesn't usually use that much memory, even on large indexes.
>>
>> What version of lucene are you on?  How many docs in the index?  What
>> does a slow query look like (q.toString()) and what search method are
>> you calling?  Anything else relevant you forgot to tell us?
>>
>>
>> Or google "lucene sharding" if you are determined to split the index.
>>
>>
>> --
>> Ian.
>>
>>
>> On Wed, Mar 20, 2013 at 5:12 PM, ash nix <nixdash@gmail.com> wrote:
>>> Hi Everybody,
>>>
>>> I have created a single compound index which is 250 Gigs in size.
>>> I open a single index reader to run simple boolean queries.
>>> The process is consuming a lot of memory and search is painfully slow.
>>>
>>> It seems that I will have to create multiple indexes and have multiple
>>> index readers.
>>> Can anyone suggest a good blog or documentation on creating multiple
>>> indexes and performing parallel search?
>>>
>>> --
>>> Thanks,
>>> A
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>
>
>
> --
> Thanks,
> A
>
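
A rough sketch of the chronological split raised in the quoted message, under stated assumptions: documents are routed to one index per fixed-width time window, each index is searched separately, and the per-index hits are merged into one global top-k. The ShardMerge class, the Hit type, and the window parameters are hypothetical and not from the thread. Scores from separately built indexes may not be directly comparable, since term statistics differ per index; Lucene's MultiReader, which presents several indexes to a single searcher, may be the simpler route.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: time-window sharding plus a top-k merge of hits. */
final class ShardMerge {

    /** One search hit: the shard it came from, its document id, and its score. */
    static final class Hit {
        final int shard;
        final String docId;
        final float score;

        Hit(int shard, String docId, float score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Assumed routing rule: one shard per window of windowSeconds, counted
     * from epochStart. Timestamps before epochStart give negative numbers.
     */
    static int shardFor(double timestampSeconds, double epochStart, double windowSeconds) {
        return (int) Math.floor((timestampSeconds - epochStart) / windowSeconds);
    }

    /** Merges per-shard hit lists into a global top-k, highest score first. */
    static List<Hit> mergeTopK(List<List<Hit>> perShard, int k) {
        Comparator<Hit> byScore = Comparator.comparingDouble(h -> h.score);
        // Min-heap capped at k entries: the weakest kept hit sits on top
        // and is evicted whenever the heap grows past k.
        PriorityQueue<Hit> heap = new PriorityQueue<>(byScore);
        for (List<Hit> hits : perShard) {
            for (Hit hit : hits) {
                heap.offer(hit);
                if (heap.size() > k) {
                    heap.poll();
                }
            }
        }
        List<Hit> merged = new ArrayList<>(heap);
        merged.sort(byScore.reversed());
        return merged;
    }
}
```

Each shard could be searched on its own thread before the merge, which is where the parallelism asked about in the original post would come from.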


