lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From eks dev <eks...@yahoo.co.uk>
Subject Re: How to improve document retrieval speed.
Date Sat, 04 Nov 2006 19:14:39 GMT
I would strongly suggest not storing these fields in lucene, just keep them as files and store
some kind of url to get them latter. that will boost your speed heavily. If you really, really
need to store documents in lucene, try some compression
Also, so many fields hurt performance, any chance you could pack some of them together?

Did you have  a chance to read Lucene in Action, could help you greatly. 

good luck

----- Original Message ----
From: Sunil Kumar PK <pksunilpk@gmail.com>
To: java-user@lucene.apache.org
Sent: Saturday, 4 November, 2006 4:12:57 PM
Subject: Re: How to improve document retrieval speed.

Hi,

I am using Lucene's Remote Parallel Multisearcher with 10 nodes in my search
cluster having 200+ distributed index fragments (1 Index fragment = 4GB). I
have 30+ fields in my index, and I am storing a master XML file (contains 5
to 30 pages of information) in one field. I also  have two web servers
behind a load-balancer.

My system works as follows.

I) When the user search for a document, the search result page will display
3 stored fields from the index (Document Number,Title and Date).

II) On clicking a particular result I am searching again(Since I am using a
load-balancer I can't use the same RemoteSearchable object) for that
particular 'document number' and retrieve the stored master XML from the
index.

The first step is ok with me, but I want to improve the speed in the second
part. Is it necessary to do the scoring and weight calculation in the second
step?

Thanks,
Sunil

On 11/4/06, Grant Ingersoll <gsingers@apache.org> wrote:
>
> You probably can skip the QueryParser part and just construct a
> TermQuery with your term and field.  That will save you a few ticks.
> I'm betting you have just included the code below for example, so
> this may not apply, however, you want to make sure you aren't
> creating the IndexSearcher every time you want to run the query.  See
> the list archives for info on warming/caching the searcher.
>
> What kind of runtimes are you experiencing?
>
> On Nov 4, 2006, at 6:46 AM, Sunil Kumar PK wrote:
>
> > Hi,
> >
> > In my index there is a unique field, "MY_DOCNO".
> >
> > If I want get a document from the index with MY_DOCNO=1000,  I am
> > using
> > following code,
> >
> > IndexSearcher isearcher = new IndexSearcher("myindex1");
> > QueryParser qp = new QueryParser("MY_DOCNO", new StandardAnalyzer());
> >
> > Query query = qp.parse("MY_DOCNO:1000");
> > Hits hits = isearcher.search(query);
> >
> > Since I have a unique field in my index, is there any other method,
> > that can
> > retrieve the document faster than this?
> >
> > Thanks,
> > Sunil
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>





		
___________________________________________________________ 
All New Yahoo! Mail – Tired of Vi@gr@! come-ons? Let our SpamGuard protect you. http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message