lucene-java-user mailing list archives

From Shruthi <>
Subject RE: NewBie To Lucene || Perfect configuration on a 64 bit server
Date Tue, 20 May 2014 08:40:06 GMT

-----Original Message-----
From: Toke Eskildsen []
Sent: Tuesday, May 20, 2014 12:57 PM
Subject: Re: NewBie To Lucene || Perfect configuration on a 64 bit server

On Mon, 2014-05-19 at 12:40 +0200, Shruthi wrote:

> 1. Client makes a request with a search phrase. Lucene
> application indexes a list of 500 documents (at max.) and searches the
> phrase on the index constructed.

Fetching from NAS + indexing sounds like something that would take a
second or two. Have you tried this?

Shruthi : We haven’t tried from NAS yet, but we kept 500 documents in local storage (all
are RTFs, so we used Aspose to convert them to text before indexing) on a 4GB machine, with
a RAMDirectory implementation.

Just the indexing took 20 seconds ☹

We are yet to try on a 64 bit server to check whether that would change things drastically.

> We have decided to use MMapDirectory for above requirement.

As your index data are extremely transient and the datasets small,
RAMDirectory seems a better choice.

Shruthi : But RAMDirectory has bad concurrency in multithreaded environments.

You state that you delete the index when the search has finished.
Wouldn't it be better to keep it a couple of minutes? That way further
searches from the same client would be fast.
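
The "keep it a couple of minutes" idea above could be sketched roughly as follows. This is only an illustration using the Java standard library, not anything from the thread: the class name ExpiringIndexCache, the string key, and the injected clock are all hypothetical, and the cached value type is left generic so that it could hold whatever object represents a built index (for example a Lucene Directory).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: keeps one built index per key for a limited time. */
final class ExpiringIndexCache<V> {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // assumption: injected so tests can control time

    ExpiringIndexCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /**
     * Returns the cached value for the key if it has not expired yet;
     * otherwise runs the builder once and caches its result.
     * The builder must not touch this cache itself.
     */
    V getOrBuild(String key, Supplier<V> builder) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) -> {
            if (old != null && old.expiresAtMillis > now) {
                return old; // still fresh: reuse the existing index
            }
            return new Entry<V>(builder.get(), now + ttlMillis);
        });
        return entry.value;
    }

    /** Drops expired entries; the returned count is approximate under concurrent use. */
    int evictExpired() {
        long now = clock.getAsLong();
        int before = entries.size();
        entries.values().removeIf(e -> e.expiresAtMillis <= now);
        return before - entries.size();
    }
}
```

A caller would construct it with something like new ExpiringIndexCache<>(120_000L, System::currentTimeMillis), key it by whatever identifies the client's document set, and call evictExpired() periodically so that stale indexes are released.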

Shruthi : The same user from the same client will not search for the same phrase again unless
he has amnesia. This was already discussed with our architects. I did not have any selling
point for Lucene in this respect.

Overall, I worry about your architecture. It scales badly with the
number of documents/client. You might not have any clients with more
than 500 documents right now, but can you be sure that this will not


Shruthi : Actually, we have a DB query that runs prior to indexing and fetches at most 500 docs
from the 10 million+ docs on the NAS share. We then have to apply the search phrase only to the
resulting set, so the set is limited to 500-1000 documents.

Thanks a lot for taking interest. Wish to hear more from you.

- Toke Eskildsen, State and University Library, Denmark

