From: Ivan Liu
To: java-user@lucene.apache.org
Subject: Re: Indexing and Searching fields that have unique values
Date: Sat, 24 Apr 2010 22:11:16 +0800

I think Anshum is right. Also, maybe your range is too big and you are
sorting.

2010/4/23 Anshum

> Hi Ravi,
>
> Adding to what Erick said, you could index the numbers as numeric fields
> instead of strings. This should improve things for you by a considerable
> amount.
> P.S.: I'm speaking from my knowledge of Java Lucene.
> --
> Anshum Gupta
> Naukri Labs!
> http://ai-cafe.blogspot.com
>
> The facts expressed here belong to everybody, the opinions to me. The
> distinction is yours to draw............
>
>
> On Fri, Apr 23, 2010 at 1:43 AM, Erick Erickson wrote:
>
> > You have to provide more info, especially the search code you're
> > using. How many documents are in your index? What are you measuring?
> >
> > Anything else you can think of that might help people diagnose
> > your issue.
> >
> > Also, consider asking on the .Net user's list.
> >
> > Known things to look for (in Java):
> > 1> Are you re-opening an index reader each time? Don't.
> > 2> Are you sorting? If so, the first queries will fill internal
> > caches, and this takes time. Time subsequent searches.
> >
> > HTH
> > Erick
> >
> > On Thu, Apr 22, 2010 at 3:58 PM, Ravi Patel wrote:
> >
> > > Using Lucene.Net
> > >
> > > I've built an index of documents.
> > >
> > > The documents also have a unique identifier (my identifier, not the
> > > lucene index's id).
> > >
> > > The unique identifiers are also a sort order of new-ness (higher id
> > > values are newer)
> > >
> > > string my_id = "1234";
> > >
> > > doc.Add(new Field("id", my_id, Field.Store.YES,
> > > Field.Index.UN_TOKENIZED));
> > >
> > > Searching for a particular id, or range searches, are incredibly slow
> > >
> > > TermQuery query = new TermQuery(new Term("id", "1234"));
> > >
> > > searcher.Search(query);
> > >
> > > Any tips on how to speed up such a search?
> > >
> > > I'm also doing RangeSearches on lower / upper ids, and those are slow
> > > too

-- 
冲浪板
my blog: 冲浪板
my site: Keji Technology
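
A note on the range searches over ids indexed as strings, as in Ravi's snippet: as far as I know, terms in a string field are compared as text, not as numbers, so a range over unpadded ids can match ids that are numerically outside it. Below is a minimal sketch in plain Java (no Lucene calls at all) that illustrates the ordering problem and one common workaround, zero-padding the ids to a fixed width. This is a different technique from the numeric fields Anshum suggests, and it addresses correctness of ranges rather than speed. The class `PaddedIds`, the helpers `pad` and `inTermRange`, and the width of 10 are all hypothetical names and values chosen for illustration.

```java
final class PaddedIds {

    // Hypothetical helper: left-pad a non-negative id with zeros to a fixed width,
    // so that text order of the padded strings matches numeric order of the ids.
    public static String pad(long id, int width) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        String digits = Long.toString(id);
        if (digits.length() > width) {
            throw new IllegalArgumentException("id does not fit in width " + width);
        }
        StringBuilder sb = new StringBuilder(width);
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // Hypothetical helper: inclusive range test using plain String comparison,
    // standing in for how string terms are ordered (an assumption, not a Lucene call).
    public static boolean inTermRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        int width = 10; // assumed maximum number of digits in an id

        // Unpadded: "15" falls between "1000" and "2000" when compared as text.
        System.out.println(inTermRange("15", "1000", "2000")); // prints true

        // Padded: the same id is correctly outside the range.
        System.out.println(
            inTermRange(pad(15, width), pad(1000, width), pad(2000, width))); // prints false

        // Padded: an id that really is in the range still matches.
        System.out.println(
            inTermRange(pad(1234, width), pad(1000, width), pad(2000, width))); // prints true
    }
}
```

If padding is used, the same `pad` call has to be applied both when indexing the id and when building the term or range bounds at search time, otherwise exact-match lookups on the id will stop finding anything.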