lucene-java-user mailing list archives

From TimF <...@timflanders.com>
Subject Lucene equivalent of SQL DISTINCT for a specific field's "stored values"
Date Fri, 27 Jul 2007 02:50:12 GMT

I have a field called "category".
Sample data for "category":
   Hello World
   Goodbye World
   Foo Bar
   Mad Mad Mad Mad World

It is tokenized and stored in the index. I tokenize the field because I may
want to search on a specific word(s) in a category but not necessarily the
entire category.

However, I also would like to offer a select box in my web application that
gives the end user the distinct list of stored values for the category
field, which they could choose one of to search on.

I have tried what most people on this list recommend: call
IndexReader.terms() for the "category" field and enumerate the result.

However, this obviously returns the list of distinct terms:
   Hello, World, Goodbye, Foo, Bar, Mad

not the list of distinct stored values:
   Hello World, Goodbye World, Foo Bar, Mad Mad Mad Mad World
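
To make the difference concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes simple whitespace tokenization (a real analyzer may also lowercase or drop words), and the class and method names are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: contrasts distinct tokens
// with distinct whole values for the sample "category" data.
final class DistinctCategoryDemo {

    // Distinct whitespace-separated tokens. This approximates what
    // enumerating the terms of a tokenized field gives back.
    static SortedSet<String> distinctTerms(List<String> categories) {
        SortedSet<String> terms = new TreeSet<>();
        for (String category : categories) {
            for (String token : category.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    terms.add(token);
                }
            }
        }
        return terms;
    }

    // Distinct whole values. This is what the select box needs.
    static SortedSet<String> distinctValues(List<String> categories) {
        return new TreeSet<>(categories);
    }

    public static void main(String[] args) {
        List<String> categories = Arrays.asList(
                "Hello World", "Goodbye World", "Foo Bar", "Mad Mad Mad Mad World");
        System.out.println(distinctTerms(categories));
        System.out.println(distinctValues(categories));
    }
}
```

The first set contains six single words, the second the four full category names.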

I could add another field to the index that is not tokenized and then
enumerate the terms for that new field, but this seems like a hack, and it
would also add size to the index in that I would be duplicating data for the
category for each document.
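
As a rough model of that second-field idea, the following plain-Java sketch (again no Lucene calls; the class and its methods are hypothetical) keeps the whole category string as a single key that maps to the documents containing it. Enumerating the keys then yields the distinct list. In this toy model each distinct value is held once as a key, with one document id added per document; how much a real index grows is something I would have to measure.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model of an untokenized companion field: the whole value is one term.
final class ExactFieldModel {

    // whole category value -> ids of the documents that carry it
    private final SortedMap<String, List<Integer>> exactTerms = new TreeMap<>();
    private int nextDocId = 0;

    // Registers a document under its full category and returns its id.
    int addDocument(String category) {
        int docId = nextDocId++;
        exactTerms.computeIfAbsent(category, k -> new ArrayList<>()).add(docId);
        return docId;
    }

    // Sorted distinct categories, suitable for filling a select box.
    List<String> distinctCategories() {
        return new ArrayList<>(exactTerms.keySet());
    }

    // Ids of the documents whose category equals the given value exactly.
    List<Integer> docsFor(String category) {
        List<Integer> docs = exactTerms.get(category);
        return docs == null ? Collections.<Integer>emptyList() : docs;
    }
}
```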

Any other ideas?
Thanks,
Tim