lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: how to search for starts with multiple words in lucene
Date Wed, 26 Nov 2008 13:46:42 GMT
Your problem here is probably tokenization at query time.

Queries like 110_a:library a* would search field 110_a for
library and your default field for a*. You might try
+110a_:library +110a_:a*, but I doubt that's really
what you want since there's no guarantee that the terms
will be next to each other.

Note that phrase queries l don't go through the wildcard
parsers, so searching for "library a*" in quotes) won't do
what you want.

You might want to look at the SpanQuery family. It's unclear
whether you'd expect a hit on something that started in the
middle, but you can get around this by adding a synthetic
token at the start of each field you index then adding that to
each query. Something like:
doc.add("field", "$ <original text>", blah blah)
at index time and then add the "$" (or whatever) at query time
if you require that the match never hit in the middle.

Another possibility would be to index using something like
KeywordAnalyzer, but this assumes that you never want
to search for anything in that field that starts in the middle.

Best
Erick

On Wed, Nov 26, 2008 at 4:48 AM, naveen.a <naveen.verus@gmail.com> wrote:

>
> Hi,
>
> Below is a document in lucene
> ---------------------------------------------
> Field   Value
> ---------------------------------------------
> ID:1
> 110_a:library and information
> ---------------------------------------------
> I need to search for starts with logic, below are the search cases for the
> above document
>
>
> ------------------------------------------------------------------------------
> Query                                             Result
>
> ------------------------------------------------------------------------------
> 110_a:l*                                           ID - 1
> 110_a:library*                                   ID - 1
> 110_a:library *                                  No Results
> 110_a:library a*                                No Results
> 110_a:"library a*"                              No Results
>
> ------------------------------------------------------------------------------
> here, if i apply single word for starts with search, it is found,
> but if i add any space after the first word, it is not found
>
> so, how to apply the query to search for starts with multiple words
> --
> View this message in context:
> http://www.nabble.com/how-to-search-for-starts-with-multiple-words-in-lucene-tp20697741p20697741.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message