lucene-java-user mailing list archives

From Erik Hatcher <e...@hatcher.net>
Subject Re: Keyword query confusion
Date Sat, 25 Sep 2004 09:59:26 GMT
On Sep 24, 2004, at 12:26 PM, Fred Toth wrote:
> I'm trying to understand what's going on with the query parser
> and keyword fields.

It's a confusing situation, for sure.

> I've got a large subset of my documents which are "publications".
> So as to be able to query these, I've got this in the indexer:
>
> doc.add(Field.Keyword("is_pub", "1"));
>
> However, if I run a query:
>
> 	is_pub:1
>
> I get no hits. If I find a document by other means and dump the
> fields, the "is_pub" keyword is there, with value of "1".

As already stated - it is the analyzer eating the "1".  QueryParser 
analyzes the query text for every field, but during indexing 
Field.Keyword fields are indexed as-is, without being analyzed.

Search the archives for discussion on a KeywordAnalyzer and how to use 
it with PerFieldAnalyzerWrapper.  Also, the info here is valuable:

	http://wiki.apache.org/jakarta-lucene/AnalysisParalysis

Visualizing what an analyzer does and using Query.toString are both 
techniques to clearly point out what is happening.
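
A minimal sketch of the Query.toString technique, assuming the Lucene 
1.4 API (the static QueryParser.parse method).  The class name, the 
"contents" default field, and the choice of SimpleAnalyzer are 
assumptions for illustration only; pass whichever analyzer your 
application actually hands to QueryParser.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

// Hypothetical helper: shows what QueryParser built after analysis.
class ShowParsedQuery {
    static String show(String queryText, Analyzer analyzer) throws ParseException {
        // "contents" is an assumed default field name.
        Query q = QueryParser.parse(queryText, "contents", analyzer);
        // Guard against a null result in case every term was analyzed away.
        return q == null ? "" : q.toString("contents");
    }

    public static void main(String[] args) throws ParseException {
        // SimpleAnalyzer is a stand-in; it keeps only runs of letters.
        Analyzer analyzer = new SimpleAnalyzer();
        System.out.println("[" + show("is_pub:1", analyzer) + "]");
        System.out.println("[" + show("is_pub:true", analyzer) + "]");
    }
}
```

If the analyzer dropped the value, the printed query will not contain 
it, which makes the cause of the missing hits visible at a glance.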

> Now, I've learned that if I change the field to contain the value
> "true" instead of the string "1", this query:
>
> 	is_pub:true
>
> works just fine.
>
> So, I'm pretty sure I'm running afoul of the analyzer, right? The doc
> says specifically that I should add keyword query clauses
> programmatically, and I'm guessing that's what's wrong.

It really depends on your needs.  I personally wouldn't want end-users 
knowing to type "is_pub:true" into a search box.  Designing the most 
appropriate search interface for your situation is highly recommended.  
In this case, a checkbox for "Is published?" could translate into a 
TermQuery behind the scenes (likely combined with other pieces, perhaps 
a QueryParser parsed piece, using BooleanQuery).  TermQuery text is not 
analyzed, so you'd be safe there.
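
A rough sketch of the checkbox idea, assuming the Lucene 1.4 API, 
where BooleanQuery.add takes (query, required, prohibited).  The class 
name, the method name, and the "contents" default field are 
hypothetical; only the "is_pub" field and its value "1" come from the 
original question.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Hypothetical helper: combines free-text input with a checkbox filter.
class PublicationSearch {
    static Query build(String userText, boolean publicationsOnly, Analyzer analyzer)
            throws ParseException {
        // The free-text part goes through QueryParser and is analyzed.
        Query userQuery = QueryParser.parse(userText, "contents", analyzer);
        if (!publicationsOnly) {
            return userQuery;
        }
        BooleanQuery combined = new BooleanQuery();
        // required = true, prohibited = false: both clauses must match.
        combined.add(userQuery, true, false);
        // TermQuery text is used exactly as given, so "1" is never analyzed.
        combined.add(new TermQuery(new Term("is_pub", "1")), true, false);
        return combined;
    }
}
```

The user only ever types the free-text part; the keyword clause is 
added in code from the state of the checkbox.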

> But can someone explain this? It sure is useful to be able to test this
> sort of thing with the query parser. What is going on with the standard
> analyzer that makes "true" work and "1" not work?

Numbers get axed, that is what happens.
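
To illustrate the effect without Lucene, here is a plain-Java 
simulation of a letters-only tokenizer.  It is not Lucene's code, and 
it assumes the analyzer in use keeps only runs of letters; the class 
and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: keeps lowercase runs of letters, drops everything else.
class LetterOnlyTokens {
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A non-letter ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("1"));     // prints []
        System.out.println(tokenize("true"));  // prints [true]
        System.out.println(tokenize("2004"));  // prints []
    }
}
```

With no tokens left for "1", there is nothing for the query to match 
against the indexed keyword.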

> Is there a way around this other than by writing code to create the
> query? This also applies to other types of query, like "pub_date:2004".

A PerFieldAnalyzerWrapper using WhitespaceAnalyzer for the "is_pub" 
field would do the trick in this case.
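
A sketch of that wrapper, assuming the Lucene 1.4 API.  The class 
name, the "contents" default field, and SimpleAnalyzer as the default 
analyzer are assumptions; substitute the analyzer you already use for 
your other fields.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

// Hypothetical helper: parses queries with per-field analysis.
class KeywordFieldParsing {
    static Analyzer buildAnalyzer() {
        // SimpleAnalyzer stands in for whatever handles the other fields.
        PerFieldAnalyzerWrapper wrapper =
                new PerFieldAnalyzerWrapper(new SimpleAnalyzer());
        // WhitespaceAnalyzer only splits on whitespace, so "1" survives.
        wrapper.addAnalyzer("is_pub", new WhitespaceAnalyzer());
        wrapper.addAnalyzer("pub_date", new WhitespaceAnalyzer());
        return wrapper;
    }

    static Query parse(String queryText) throws ParseException {
        // "contents" is an assumed default field name.
        return QueryParser.parse(queryText, "contents", buildAnalyzer());
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parse("is_pub:1").toString("contents"));
        System.out.println(parse("pub_date:2004").toString("contents"));
    }
}
```

Note that WhitespaceAnalyzer does not lowercase, so the query text has 
to match the indexed keyword value exactly, including case.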

Again, users typing "pub_date:2004" seems awkward to me - make a year 
drop-down box if they need to select a year.

> Hoping for enlightenment...

Now that's a tall order... or is it?!  It's surrounding us all - we 
simply have to breathe it in.

	Erik



