Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (nike.apache.org: domain of ian.lea@gmail.com designates
 209.85.214.176 as permitted sender)
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:from:date:message-id:subject:to
         :content-type;
        b=RXRN7B1jvFAEpT7ktACJVdCamsaFYGMCcxIIcmDMxOwX+brJ3UcGMW4/EZpUjsd3Vo
         lnoutXAWNxkOw287Zi+VbsqwPSa9/drL5gtWG4VlN+3kLEfRKp0A9LV5yi10ITWTNlmj
         2ncgJ+0TEq9avGOlOVp3k24xSEQ7yJhT8Ce7I=
MIME-Version: 1.0
In-Reply-To: <BANLkTim_4tWtUXpahOjaRGg6L73CcWjjhw@mail.gmail.com>
References: <BANLkTimuytxpYnSRXR0qx6-H1VHxSExzKg@mail.gmail.com>
 <BANLkTiniK+3CJJpmK-ir3rtQmhf_W9MfhQ@mail.gmail.com>
 <BANLkTim_4tWtUXpahOjaRGg6L73CcWjjhw@mail.gmail.com>
From: Ian Lea <ian.lea@gmail.com>
Date: Mon, 9 May 2011 16:51:28 +0100
Message-ID: <BANLkTi=X+PiFhZnWc-yLu5EXD=DbvSH2Cw@mail.gmail.com>
Subject: Re: Sharding Techniques
To: java-user@lucene.apache.org
Content-Type: text/plain; charset=ISO-8859-1

> ...
> 1. I've not tested my application with single index as initially (a few
> years back) we thought smaller the index size (7 indexes for default 80%
> searches) the faster the search time would be ...

Possibly.  Maybe it will be acceptable to make some searches a bit
slower in order to make the slow ones faster.  But my guess would
still be that a single index with filtered searches would be quicker
for most searches.

> 2. For sharing/caching we create index readers once the server starts and
> use these throughout the server's life (1 day). At the time of searches,
> number of indexes to be read are decided by analyzing the search parameters.
> IndexSearchers are created on persistent IndexReaders and finally a
> ParallelMultiSearcher is created from these IndexSearchers (I hope this is
> not a problem, or is it???)

Sounds fine, except that in latest versions of lucene,
ParallelMultiSearcher is deprecated.

> 3. I had gone through the link you provided and some of the things are
> already implemented (e.g. readOnly=true, NIOFSDirectory, optmizing, etc.).
> We are using filters for some of the fields and caching those filters in the
> memory, through hashtable.

Sounds good.

> Will reducing number of tokens in a particular field in index reduce the
> search time (or CPU, memory etc)?

Yes, in theory, but unless the numbers are really high I doubt you'll notice it.

> E.g. I have 11 documents and tokens in field (fld1) are
> 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 and 2.0.
>
> The query is - fld1:[ 1.0 TO 2.0 ]
>
> Would it make any difference if the tokens in documents (in the same field)
> would be
> 1,1,1,1,1,1,1,1,1,2
> ??

Unlikely.  But if those are all always numbers it would be worth
looking at NumericField and NumericRangeQuery.


--
Ian.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org