From: Mike Klaas
Subject: Re: Getting only the Ids, not the whole documents.
Date: Fri, 3 Aug 2007 20:13:41 -0700
To: java-user@lucene.apache.org

You still have a disk seek per doc if the index can't fit in memory
(usually more costly than reading the fields). Why not use FieldCache?

-Mike

On 2-Aug-07, at 5:41 PM, Mark Miller wrote:

> If you are just retrieving your custom id and you have more stored
> fields (and they are not tiny) you certainly do want to use a field
> selector. I would suggest SetBasedFieldSelector.
>
> - Mark
>
> testn wrote:
>> Hi,
>>
>> Why don't you consider using FieldSelector? LoadFirstFieldSelector
>> can load only the first field in the document without loading all
>> the fields. After that, you can keep the whole document if you
>> like. It should help improve performance.
>>
>> is_maximum wrote:
>>> Yes, it decreases performance, but it is the only solution.
>>> I've spent many weeks trying to find the best way to retrieve my
>>> own IDs, but found this way as the last one.
>>>
>>> Now I am storing the IDs in a BitSet structure and it's fast enough:
>>>
>>> public void collect(...){
>>>     idBitSet.set(Integer.valueOf(searcher.doc(id).get("MyOwnID")));
>>> }
>>>
>>> On 8/2/07, makkhar wrote:
>>>> Hi,
>>>>
>>>> The solution you suggested will definitely work but will
>>>> definitely slow down my search by an order of magnitude. The
>>>> problem I am trying to solve is performance; that's why I need the
>>>> collection of IDs and not the whole documents.
>>>>
>>>> - thanks for the prompt reply.
>>>>
>>>> is_maximum wrote:
>>>>> Yes, if you extend your class from HitCollector and override the
>>>>> collect() method with the following signature, you can get IDs:
>>>>>
>>>>> public void collect(int id, float score)
>>>>>
>>>>> On 8/2/07, makkhar wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Can I get just a list of document Ids given a search criterion?
>>>>>> To elaborate, here is my situation:
>>>>>>
>>>>>> I store 20000 contracts in the file system index, each with some
>>>>>> parameterName and Value. Given a search criterion
>>>>>> (paramValue='draft'), I need to get just an ArrayList of Strings
>>>>>> containing contract Ids. I don't need the lucene documents, just
>>>>>> the Ids.
>>>>>>
>>>>>> Can this be done?
>>>>>>
>>>>>> -thanks
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/Getting-only-the-Ids%2C-not-the-whole-documents.-tf4204907.html#a11960750
>>>>>> Sent from the Lucene - Java Users mailing list archive at
>>>>>> Nabble.com.
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>> --
>>>>> Regards,
>>>>> Mohammad
>>>>> --------------------------
>>>>> see my blog: http://brainable.blogspot.com/
>>>>> another in Persian: http://fekre-motefavet.blogspot.com/
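
A minimal sketch of the FieldCache idea suggested at the top of this message, written without any Lucene dependency so that it runs on its own. The class name IdCollectorSketch and the sample data are hypothetical. The int[] stands in for a per-document array of custom IDs; in Lucene 2.x such an array could presumably be loaded once per reader with FieldCache.DEFAULT.getInts(reader, "MyOwnID"), assuming the field is indexed as a single untokenized term per document. collect() has the same shape as the HitCollector method quoted in the thread, but does an in-memory array lookup instead of calling searcher.doc(id).

```java
import java.util.BitSet;

class IdCollectorSketch {
    // Hypothetical stand-in for a field-cache array:
    // idsByDoc[doc] is the application's own ID for Lucene document number doc.
    private final int[] idsByDoc;

    // Collected custom IDs, stored the same way as in the quoted thread.
    private final BitSet idBitSet = new BitSet();

    IdCollectorSketch(int[] idsByDoc) {
        this.idsByDoc = idsByDoc;
    }

    // Same parameters as HitCollector.collect(int, float).
    // One array lookup per hit; no stored fields are read from disk.
    public void collect(int doc, float score) {
        idBitSet.set(idsByDoc[doc]);
    }

    public BitSet ids() {
        return idBitSet;
    }

    public static void main(String[] args) {
        // Assumed sample data: four documents with custom IDs 1001..1004.
        int[] idsByDoc = {1001, 1002, 1003, 1004};
        IdCollectorSketch collector = new IdCollectorSketch(idsByDoc);

        // Pretend the search matched documents 0 and 2.
        collector.collect(0, 1.0f);
        collector.collect(2, 0.5f);

        System.out.println(collector.ids());
    }
}
```

The trade-off is memory: the array holds one entry per document in the index, which is small for the 20000 contracts mentioned in the original question.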