lucene-java-user mailing list archives

From "JMA" <mrj...@comcast.net>
Subject Frustrated with tokenized listing terms
Date Mon, 24 Oct 2005 08:46:01 GMT

Greetings...
Quick question, perhaps I am missing something.

I have a bunch of documents where one of the indexed fields is "author". For
example:

book1, by "John Smith"
book2, by "Steve Smith"
book3, by "John Smith"

I would like to find all distinct authors in my index. I want to support
searches for author:smith, so I tokenize the author field during indexing.
However, getTerms() then returns the individual tokens:

John (x2)
Smith (x3)
Steve (x1)

I would like to see:
John Smith (x2)
Steve Smith (x1)

I've solved this by indexing the field twice: once as author (searchable,
not stored, tokenized) and once as author_phrased (not searchable, stored,
not tokenized).

Then I query using the 'author' field while listing terms using the
'author_phrased' field.
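
To make the difference between the two fields concrete, here is a minimal
plain-Java sketch. It does not call Lucene at all: the class AuthorTerms and
the helper termCounts are hypothetical names, and whitespace splitting merely
stands in for whatever analyzer is really used. It only illustrates why a
tokenized field lists single words while an untokenized copy lists whole names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AuthorTerms {
    // Hypothetical helper (not a Lucene API): counts the terms a field
    // would contribute, either split on whitespace ("tokenized") or kept
    // as one whole value ("not tokenized").
    static Map<String, Integer> termCounts(List<String> values, boolean tokenized) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : values) {
            String[] terms = tokenized
                    ? value.trim().split("\\s+")
                    : new String[] { value };
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The three example books from above, reduced to their author values.
        List<String> authors = Arrays.asList("John Smith", "Steve Smith", "John Smith");

        // Like the tokenized 'author' field: one term per word.
        System.out.println(termCounts(authors, true));   // {John=2, Smith=3, Steve=1}

        // Like the untokenized 'author_phrased' field: one term per full name.
        System.out.println(termCounts(authors, false));  // {John Smith=2, Steve Smith=1}
    }
}
```

In this sketch the first map corresponds to what I get from the 'author'
field, and the second to what I list from 'author_phrased'.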

This works, but is it the proper way to do it?

Thanks in advance,

JMA


