lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Askar Zaidi" <askar.za...@gmail.com>
Subject Re: Fine Tuning Lucene implementation
Date Tue, 24 Jul 2007 21:44:42 GMT
Can someone please tell me how to cache results in Lucene ? I know the
classes, but I don't know how to go about it.

thanks,
Askar

On 7/24/07, Askar Zaidi <askar.zaidi@gmail.com> wrote:
>
> Thanks for the reply.
>
> I am timing the entire search process with a stop watch, a bit ghetto
> style. My getXXX methods are:
>
> Document doc = hits.doc(i);
> String str = doc.get("item");
>
> So you can see that I am retrieving the entire document in a search query.
> Ideally , I'd like to just retrieve the Field object that I want to run the
> search on. I know this will give me a boost as one of my Fields is really
> huge.
>
> My query is selecting the entire user data-set in the database. I'd like
> to do some SQL based search in the query too so that I pick only those items
> where the phrase matches.
>
> Index contains about 650MB of data. Index file size is 14478869 bytes.
>
> thanks,
> AZ
>
>
> On 7/24/07, Grant Ingersoll <gsingers@apache.org> wrote:
> >
> > Where are you getting your numbers from?  That is, where are your
> > timers?  Are you timing the rs.next() loop, or the individual calls
> > to Lucene?  What do the getXXXXX methods look like?  How big are your
> > queries?  How big is your index?
> >
> > Essentially, we need more info to really help you.  From what I can
> > tell, you are generating 3 different Lucene queries for each record
> > in the database.  Frankly, I surprised your slowdown is only linear.
> >
> > On Jul 24, 2007, at 4:31 PM, Askar Zaidi wrote:
> >
> > > I have 512MB RAM allocated to JVM Heap. If I double my system RAM
> > > from 768MB
> > > to say 2GB or so, and give JVM 1.5GB Heap space, will I get quicker
> > > results
> > > ?
> > >
> > > Can I expect results which take 1 minute to be returned in 30
> > > seconds with
> > > more RAM ? Should I also get a more powerful CPU ? A real server class
> > > machine ?
> > >
> > > I have also done some of the optimizations that are mentioned on
> > > the Lucene
> > > website.
> > >
> > > thanks,
> > > AZ
> > >
> > > On 7/24/07, Askar Zaidi < askar.zaidi@gmail.com> wrote:
> > >>
> > >> Hey Guys,
> > >>
> > >> I just finished up using Lucene in my application. I have data in a
> > >> database , so while indexing I extract this data from the database
> > >> and pump
> > >> it into the index. Specifically , I have the following data in the
> > >> index:
> > >>
> > >> <itemID> <tags> <title> <summary> <contents>
> > >>
> > >> where itemID is just a number (primary key in the DB)
> > >> tags : text
> > >> titie: text
> > >> summary: text
> > >> contents: Huge text (text extracted from files: pdfs, docs etc).
> > >>
> > >> Now while running a search query I realized that the response time
> > >> increases in a linear fashion as the number of <itemID> increase
> > >> in the DB.
> > >>
> > >> If I have 50 items, its 8 seconds
> > >> 100 items, its 17 seconds.
> > >> 300+ items, its 60 seconds and maybe more.
> > >>
> > >> In a perfect world, I'd like to search on 300+ items within 10-15
> > >> seconds.
> > >> Can anyone give me tips to fine tune lucene ?
> > >>
> > >> Heres a code snippet:
> > >>
> > >> sql query = "SELECT itemID from items where creator = 'askar' ;
> > >>
> > >> --execute query--
> > >>
> > >> while(rs.next ()){
> > >>
> > >> score = doTagSearch(askar,text,itemID);
> > >> scoreTitle = doTitleSearch(askar,text,itemID);
> > >> scoreSummary = doSummarySearch(askar,text,itemID);
> > >>
> > >> ----
> > >>
> > >> }
> > >>
> > >> So this code asks Lucene to search for the "text" in the itemID
> > >> passed.
> > >> itemID is already indexed. The while loop will run 300 times if
> > >> there are
> > >> 300 items....that gets slow...what can I do here ??
> > >>
> > >> thanks for the replies,
> > >>
> > >> AZ
> > >>
> >
> > --------------------------
> > Grant Ingersoll
> > Center for Natural Language Processing
> > http://www.cnlp.org/tech/lucene.asp
> >
> > Read the Lucene Java FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message