lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anshum <ansh...@gmail.com>
Subject Re: Indexing and Searching fields that have unique values
Date Fri, 23 Apr 2010 04:16:34 GMT
Hi Ravi,

Adding to what Erick said, you could do index the numbers as numeric fields
instead of strings. This should improve things for you by a considerable
amount.
P.S: I'm talking with my knowledge on Java Lucene.
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Fri, Apr 23, 2010 at 1:43 AM, Erick Erickson <erickerickson@gmail.com>wrote:

> You have to provide more info, especially the search code you're
> using. How many documents in your index? What are you measuring?
>
> Anything else you can think of that might help people diagnose
> your issue.
>
> Also, consider asking on the .Net user's list.
>
> Known things to look for (in Java).
> 1> Are you re-opening an index reader each time? Don't
> 2> Are you sorting? If so, the first querie(s) will fill internal
>   caches, this takes time. Time subsequent searches.
>
> HTH
> Erick
>
> On Thu, Apr 22, 2010 at 3:58 PM, Ravi Patel <rpatel4@live.com> wrote:
>
> >
> >
> >
> > Using Lucene.Net
> >
> >
> >
> > I've built an index of documents.
> >
> >
> >
> > The documents also have a unique identifier (my identifier, not the
> lucene
> > index's id).
> >
> > The unique identifers are also a sort order of new-ness (higher id values
> > are newer)
> >
> >
> >
> > string my_id ="1234"
> >
> > doc.Add(new Field("id", my_id, Field.Store.YES,
> Field.Index.UN_TOKENIZED));
> >
> >
> >
> > Searching for a particular id, or range searches are incredibly slow
> >
> >
> >
> >
> >
> > TermQuery query = new TermQuery(new Term("id", "1234"));
> >
> > searcher.Search(query)
> >
> >
> >
> >
> >
> > Any tips on how to speed up such an search?
> >
> >
> >
> > I'm also doing RangeSearches on lower / upper ids, and those are slow too
> >
> > _________________________________________________________________
> > The New Busy is not the too busy. Combine all your e-mail accounts with
> > Hotmail.
> >
> >
> http://www.windowslive.com/campaign/thenewbusy?tile=multiaccount&ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_4
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message