lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: literal search in quotes on non-tokenized field
Date Tue, 30 Nov 2004 21:51:26 GMT
On Nov 30, 2004, at 4:42 PM, Allen Atamer wrote:
> A search in quotes on a field named Build with the query "\"origi\""
> does not work but the query "origi" yields 62 hits
> I have run indexing on the field with the following method
>                     doc.add(Field.Keyword(data.getColumnName(j),
>                             fieldValue.toString().toLowerCase()));
> so even though the original data has "ORIGI" in the "Build" field,
> lowercase is not the problem
> Here's a log of the parsed query before going to the searcher:
> Parsed query: (Build:"origi") for the first search
> Parsed query: (Build:origi) for the second search

What do you mean by "parsed", since below you say you're not using a
query parser?

> Right now we're not using a query parser / analyzer system to build the
> query. We're building the query up.
> The query mentioned above is a TermQuery object

Let me try to restate what you've said: you've indexed (I'm not 
using quotes on purpose) origi, but you're doing a TermQuery on "origi" 
(with the quotes) and expecting it to match?

It doesn't work that way.  A TermQuery must match *exactly* what was 
indexed (either directly as a Keyword, or as tokens emitted from the 
analyzer).  Since you're building the query up yourself from, I'm 
assuming, user input, you may need to pre-process what the user entered 
to get the right term to query on.  Only the term origi would match.
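
As a rough illustration of that pre-processing step, here is a minimal 
sketch. It assumes the field was indexed as in the quoted code above 
(Field.Keyword with a lowercased value) and that the only clean-up needed 
is trimming, removing one pair of surrounding double quotes, and 
lowercasing. The class and method names are hypothetical and not part of 
Lucene; real user input may need more handling than this.

```java
import java.util.Locale;

// Hypothetical helper (not a Lucene class): turns raw user input into the
// term text that would have been stored by
// Field.Keyword(name, value.toLowerCase()).
final class KeywordTermNormalizer {
    private KeywordTermNormalizer() {}

    static String normalize(String userInput) {
        String s = userInput.trim();
        // Drop one pair of surrounding double quotes, if present, so that
        // "origi" (typed with quotes) becomes origi.
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);
        }
        // Mirror the lowercasing applied at index time. Assumption:
        // Locale.ROOT is close enough to the default-locale toLowerCase()
        // used when indexing.
        return s.toLowerCase(Locale.ROOT);
    }
}
```

With Lucene on the classpath, the normalized text would then be used to 
build the query, for example 
new TermQuery(new Term("Build", KeywordTermNormalizer.normalize(input))), 
so that the term handed to the searcher is origi without the quotes.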

I'm in the process of building a search system for some library 
archives and the single hardest thing about my work is building the 
user interface and making it work like users would like.  Humanities 
academics are by far the most particular crowd ever.  My suggestion of 
making a single text box that they type words into to query which I 
would lowercase and tokenize into a Boolean OR query was shot down 
quickly as being insufficient.  :)  Both case-sensitive and case-insensitive 
searches are desired, along with all sorts of other accommodations.  In fact, 
they suggested I write an article on my experience of building a search 
interface.  The underlying indexing and searching is trivial compared 
to the interface!

