Date: Thu, 19 Aug 2010 09:48:25 -0400
Subject: Re: Sorting a Lucene index
From: Erick Erickson
To: java-user@lucene.apache.org

You haven't yet told us how many documents you're talking about here, so
it's hard to have a good idea of what the solutions are.

That said, I'd just try sorting first. The sorting cache size will be
something like (sizeof(int or long)) * (number of documents). Measure
(remember to measure the response after query warmups; the first few
queries will be slower because they fill up the cache), THEN fix iff
there's a problem.

And just forget the idea of inserting your documents in the correct order.
You stated that the documents come in in random order. Document IDs are
assigned internally by Lucene and are monotonically increasing. I sure
don't see how you can reconcile those two things...

But again, just try it with sorting on the numeric field and only fix
things if you have a problem. Lots of work has been put into making Lucene
fast, by very bright people. See if they've already solved your problem
for you...

Best
Erick

On Thu, Aug 19, 2010 at 1:51 AM, Shelly_Singh wrote:

> Hi Anshum,
>
> I require sorted results for all my queries and the field on which I need
> sorting is fixed; so this led me to the idea of storing in sorted order to
> avoid sorting cost with every query.
>
> Thanks and Regards,
>
> Shelly Singh
> Center For KNowledge Driven Information Systems, Infosys
> Email: shelly_singh@infosys.com
> Phone: (M) 91 992 369 7200, (VoIP)2022978622
>
> -----Original Message-----
> From: Anshum [mailto:anshumg@gmail.com]
> Sent: Wednesday, August 18, 2010 5:21 PM
> To: java-user@lucene.apache.org
> Subject: Re: Sorting a Lucene index
>
> Hi Shelly,
> The search results so returned are sorted either by relevance, index order,
> stored field, or custom order.
> As you are saying that you would not be able to maintain the index order,
> you would have to do the sort at run time.
> Sorting on a stored field is not costly and you may use it comfortably.
> btw, are you facing any issues in sort time or is it a presumption?
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh wrote:
>
> > Hi,
> >
> > I have a Lucene index that contains a numeric field along with certain
> > other fields. The order of incoming documents is random and
> > un-predictable. As a result, while creating an index, I end up adding
> > docs in random order with respect to the numeric field value.
> >
> > For example, documents may be added in following order:
> > 12,y,d
> > 100,o,p
> > 1,x,y
> > 23,u,i
> > 31,v,m
> > 22,b,m
> > 109,k,l
> >
> > My requirement is that at search time, I want the documents in order of
> > the numeric field.
> > One, option is to do a score/sort on the numeric field.
> > But, this may be a costly operation.
> >
> > Hence, I am trying to find if there is some way, such that, my stored
> > index is sorted by itself.
> >
> > Please help.
> >
> > Thanks and Regards,
> >
> > Shelly Singh
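
To make the cache-size estimate in the reply concrete, here is a rough sketch in plain Java. It is not Lucene code and does not use the Lucene API. It assumes the sort cache behaves like one primitive value per document, indexed by document ID, which is how the reply's (sizeof(int or long)) * (number of documents) estimate reads. The class and method names (SortCacheSketch, estimateCacheBytes, docIdsSortedByValue) are made up for illustration, and the values are the seven sample documents from the original question, in the order they were added.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative sketch only; hypothetical names, no Lucene classes involved. */
final class SortCacheSketch {

    /** Rough estimate from the thread: one primitive value per document. */
    static long estimateCacheBytes(long numDocs, int bytesPerValue) {
        return numDocs * bytesPerValue;
    }

    /**
     * Returns document IDs ordered by ascending field value.
     * valueByDocId[i] is the numeric field value of the document with ID i.
     * The sort is stable, so equal values keep their document-ID order.
     */
    static int[] docIdsSortedByValue(int[] valueByDocId) {
        Integer[] ids = new Integer[valueByDocId.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        Arrays.sort(ids, Comparator.comparingInt(id -> valueByDocId[id]));

        int[] sorted = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            sorted[i] = ids[i];
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Sample values from the original question, in insertion (doc ID) order.
        int[] values = {12, 100, 1, 23, 31, 22, 109};

        System.out.println("cache estimate: "
                + estimateCacheBytes(values.length, Integer.BYTES) + " bytes");

        for (int docId : docIdsSortedByValue(values)) {
            System.out.println("doc " + docId + " -> " + values[docId]);
        }
    }
}
```

For the seven sample documents with int values the estimate is 28 bytes. With a hypothetical 10 million documents and long values it would be about 80 MB. The thread never states the real document count, so that figure is only an order-of-magnitude guide; measuring after warmup, as suggested in the reply, is still the real test.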