lucene-java-user mailing list archives

From "Ananth T. Sarathy" <>
Subject Re: Lucene Searches VS. Relational database Queries
Date Thu, 13 Apr 2006 17:39:43 GMT
How would that work if someone on a form types Assistant Producer? How
would that match the indexed Field if the value added is Assistant_Producer?

On 4/13/06, Erick Erickson <> wrote:
> Warning: I'm quite new to Lucene, so this may not be very accurate....
> What analyzers are you using for indexing and searching? StandardAnalyzer
> (like in most of the examples)? Because it looks like you're having a
> tokenizer problem. That is, when you index "Assistant producer", you
> actually index two tokens, "assistant" and "producer". This is the default
> behavior for the stock analyzers, see page 119 of "Lucene in Action". So,
> when you search your index, you hit on "producer" in "assistant producer".
> A quick test of this would be to munge your input and search terms to
> include an underscore for terms like "assistant producer" (that is, index
> and search for "assistant_producer"), AND use WhitespaceAnalyzer for BOTH
> indexing and searching. This should (if I understand your problem
> correctly) fix the example above. If you use any of the other stock
> analyzers (Simple, Stop or Standard), they'll split your terms at the
> underscore. And if you use one analyzer for indexing and a different one
> for searching, the results are, er, interesting and confusing.
> Now, your query select count(distinct Crew_ID) from Crew_TItles where
> Title="Producer" should produce the same results as searching your index
> for producer and the Hits object should contain 3 docs.
> P.S. Watch capitalization. The StandardAnalyzer and StopAnalyzer both
> lowercase automatically, but WhitespaceAnalyzer does NOT.
> If that works, then you will probably want to create your own Analyzer
> that recognizes your special terms and deals with them as you see fit.
> This works well with PerFieldAnalyzerWrapper to allow you to use your own
> special analyzer for one field in your index and other analyzers for other
> fields, as appropriate.
> Like I say, I'm new here, but this is a possibility I thought I'd mention.
> Erick

Ananth T Sarathy
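
One possible way to handle the form-input question above is to apply the same munging to the user's text that was applied to the value at index time, before the text reaches the searcher. The sketch below is plain Java, not Lucene API: whitespaceTokens and letterTokens only mimic the splitting behavior described in the quoted message (whitespace-only versus splitting at non-letters with lowercasing), and mungeTitle is a hypothetical helper name introduced for illustration. It lowercases as well, on the assumption that the same helper runs on both the indexing side and the search side, since WhitespaceAnalyzer does not lowercase for you.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; these methods imitate tokenization behavior
// and are NOT the Lucene analyzer classes themselves.
class TokenSketch {

    // Mimics whitespace-only splitting: case and underscores are preserved.
    static List<String> whitespaceTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Mimics splitting at every non-letter, with lowercasing,
    // so an underscore acts as a token boundary.
    static List<String> letterTokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Hypothetical helper: turn a multi-word title into a single term.
    // Assumed to be applied both when indexing and when reading form input.
    static String mungeTitle(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
    }

    public static void main(String[] args) {
        String formInput = "Assistant Producer";
        String term = mungeTitle(formInput);

        System.out.println(term);
        System.out.println(whitespaceTokens(term)); // stays one token
        System.out.println(letterTokens(term));     // splits at the underscore
    }
}
```

With this approach the person filling in the form still types "Assistant Producer"; the underscore form exists only inside the index and the generated query. A search for plain "Producer" would be munged to "producer" and would no longer match the single term "assistant_producer".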
