From: TimF
To: java-user@lucene.apache.org
Date: Thu, 26 Jul 2007 19:50:12 -0700 (PDT)
Subject: Lucene equivalent of SQL DISTINCT for a specific field's "stored values"
Message-ID: <11822265.post@talk.nabble.com>

I have a field called "category".
Sample data for "category":

  Hello World
  Goodbye World
  Foo Bar
  Mad Mad Mad Mad World

The field is tokenized and stored in the index. I tokenize it because I may want to search on one or more specific words in a category, not necessarily the entire category. However, I would also like to offer a select box in my web application that gives the end user the distinct list of stored values for the category field, one of which they could choose to search on.

I have tried what most people recommend in this forum: use IndexReader.terms("category") and enumerate that list. However, this obviously returns the list of distinct terms,

  Hello, World, Goodbye, Foo, Bar, Mad

not the list of distinct stored values,

  Hello World, Goodbye World, Foo Bar, Mad Mad Mad Mad World

I could add another field to the index that is not tokenized and then enumerate the terms for that new field, but this seems like a hack, and it would also add size to the index, since I would be duplicating the category data for each document.

Any other ideas?

Thanks,
Tim
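To make the difference between distinct terms and distinct stored values concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: the class name DistinctCategories and both method names are hypothetical, splitting on whitespace merely stands in for whatever analyzer tokenizes the field, and the input list is assumed to hold the stored "category" value of each document (with one value repeated to show the de-duplication).

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class DistinctCategories {

    // Distinct whole values, analogous to SELECT DISTINCT category in SQL.
    public static SortedSet<String> distinctValues(List<String> storedValues) {
        return new TreeSet<>(storedValues);
    }

    // Distinct terms. Whitespace splitting is an assumption that stands in
    // for a real analyzer; it is not what Lucene does internally.
    public static SortedSet<String> distinctTerms(List<String> storedValues) {
        SortedSet<String> terms = new TreeSet<>();
        for (String value : storedValues) {
            terms.addAll(Arrays.asList(value.trim().split("\\s+")));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Sample data from the message, with "Hello World" repeated.
        List<String> categories = Arrays.asList(
                "Hello World", "Goodbye World", "Foo Bar",
                "Mad Mad Mad Mad World", "Hello World");

        // prints [Bar, Foo, Goodbye, Hello, Mad, World]
        System.out.println(distinctTerms(categories));

        // prints [Foo Bar, Goodbye World, Hello World, Mad Mad Mad Mad World]
        System.out.println(distinctValues(categories));
    }
}
```

The second method is the result the select box needs. Building it this way means visiting every document's stored value once, so for a large index the list would presumably be computed once and cached rather than rebuilt per request.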