lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Noll <dan...@nuix.com>
Subject Re: Lucene equivalent of SQL DISTINCT for a specific field's "stored values"
Date Fri, 27 Jul 2007 02:55:00 GMT
On Friday 27 July 2007 12:50:12 TimF wrote:
> However, obviously this returns the list of distinct terms,
>    Hello , World , Goodbye , Foo , Bar , Mad
>
> not the list of distinct stored values,
>    Hello World , Goodbye World , Foo Bar , Mad Mad Mad Mad World
>
> I could add another field to the index that is not tokenized and then
> enumerate the terms for that new field, but this seems like a hack, and it
> would also add size to the index in that I would be duplicating data for
> the category for each document.

That is certainly how I would do it.  This sort of thing only works with 
untokenised fields, unless you have somewhere else you can store the 
untokenised version which is quicker to iterate over.

Daniel


-- 
Daniel Noll
Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://nuix.com/                               Fax: +61 2 9212 6902

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message