Date: Fri, 23 Feb 2007 16:39:43 -0500
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: filtering by first letter

See TermEnum (I don't think you need TermDocs for this). If you call
IndexReader.terms(new Term("firstletterfield", "")), the TermEnum you get
back will enumerate all the terms in your 'firstletter' field and you can
just collect them and go. Stop as soon as the enumerated term's field
changes, since the enumeration carries on into whatever fields sort after
it.

For that matter, and assuming that your names are UN_TOKENIZED, you could
do something like this without a special field by iterating over your
personName field. This might be reasonable if your index is fairly static
and you could create this list at IndexReader open time, especially since
you can use TermEnum.skipTo(new Term("personName", "a")) etc. to jump
straight to each letter.

Best
Erick

On 2/23/07, Paul Sundling (Webdaddy) wrote:
>
> I have a requirement to support filtering search results by first letter.
>
> This is relatively simple by adding a field to each index that
> represents the first letter for that relevant index and then adding a
> filter to the search.
>
> The hard part is that I need to list all the letters you can filter BY.
> So if there are no names that start with S, it shouldn't appear as an
> option.
>
> Is there a simple and performant way to get a set of all the unique
> values for a Field in the Hits returned? There would probably only be
> low number of unique values.
>
> So let's say I have the following in my index:
>
> letter, personName
> m, mike smith
> p, paul smith
> g, george smith
> g, glenda smith
>
> I need to be able to display to the user that they can filter based on
> M, P or G within their search for George.
>
> I could do a compromise and for search results above a certain level,
> show all letters and numbers, but it won't always give correct values.
> Imagine this edge case: A search for george has 50,000 results, but only
> a couple people had george as their last name. Not many of the letters
> would be valid filters.
>
> Thanks for any ideas or approaches I overlooked.
>
> Paul Sundling
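
A rough sketch of the skip-ahead idea from the reply, in case it helps. It
does not touch Lucene at all: a java.util.TreeSet stands in for the sorted
terms of an untokenized personName field, and ceiling() plays the role of
seeking to the first term at or after a target. The class name FirstLetters
and the method firstLetters are made up for illustration. Against a real
index the same loop would presumably go through the TermEnum and would also
have to check that each term still belongs to the field. Note that it walks
every term in the field, not just the terms in a particular set of Hits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in: the TreeSet models the sorted term list of one field.
class FirstLetters {

    // Collects the distinct first characters of the terms, jumping past each
    // letter's block of terms instead of visiting every term.
    static List<Character> firstLetters(TreeSet<String> sortedTerms) {
        List<Character> letters = new ArrayList<>();
        String term = sortedTerms.isEmpty() ? null : sortedTerms.first();
        while (term != null) {
            if (term.isEmpty()) {
                // An empty term has no first letter; move to the next term.
                term = sortedTerms.higher(term);
                continue;
            }
            char c = term.charAt(0);
            letters.add(c);
            if (c == Character.MAX_VALUE) {
                break; // no later character to seek to
            }
            // Seek to the first term >= the next character (the skip-ahead step).
            term = sortedTerms.ceiling(String.valueOf((char) (c + 1)));
        }
        return letters;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "mike smith", "paul smith", "george smith", "glenda smith"));
        System.out.println(firstLetters(terms)); // prints [g, m, p]
    }
}
```

With the four sample names from the question this does one seek per letter
that actually occurs, which is why it stays cheap even when a field holds a
lot of names.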