lucene-java-user mailing list archives

From Erick Erickson
Subject Re: Using Lucene for user query parsing
Date Fri, 06 Mar 2009 13:45:37 GMT
Whatever you do will be wrong <G>. What you're saying is
that you have structured data that the user wants to search
in an unstructured way, and you want to try to create a
system that intuits what the user meant. Good luck <G>.

Can you back up a bit and talk about the problem you're
trying to solve? If, for instance, you're trying to find the
best match for a particular business, one approach would
be to create one index where each business had


where the field bagowords contained a copy of the data
from the other three fields, then search bagowords
for your query terms. It sounds simplistic, but it might be
surprisingly good.
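
To make the bagowords idea concrete, here is a rough sketch in plain Python rather than actual Lucene calls, so it runs on its own. The sample records, the field names (business, street, area) and the scoring (a simple count of matching query terms) are all my assumptions for illustration. Lucene would do its own analysis and relevance scoring, and there you would add bagowords as one more field on each document.

```python
from collections import Counter

# Hypothetical sample records; the field names are assumptions for illustration.
BUSINESSES = [
    {"business": "mc donalds", "street": "woodward street", "area": "kingston precinct"},
    {"business": "marks and spencers", "street": "woodward street", "area": "riverside"},
    {"business": "mc donalds", "street": "high street", "area": "riverside"},
]

FIELDS = ("business", "street", "area")


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def build_index(records):
    """Pair each record with a catch-all 'bagowords' bag of its field tokens."""
    index = []
    for rec in records:
        bag = Counter()
        for field in FIELDS:
            bag.update(tokenize(rec[field]))
        index.append((rec, bag))
    return index


def search(index, query):
    """Rank records by how many query terms occur in their bagowords."""
    terms = tokenize(query)
    scored = []
    for rec, bag in index:
        score = sum(1 for term in terms if term in bag)
        if score:
            scored.append((score, rec))
    # Stable sort: ties keep their original index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    idx = build_index(BUSINESSES)
    for score, rec in search(idx, "kingston precinct mc donalds woodward street"):
        print(score, rec["business"], "/", rec["street"], "/", rec["area"])
```

Because the query terms are matched against one combined field, the order in which the user types business, street and area does not matter. Misspellings and road/street synonyms are not handled here; those would need to be dealt with separately.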

And if this is out in left field, a higher level statement
of the problem would help get better answers.


On Fri, Mar 6, 2009 at 1:25 AM, Srinivas Bharghav wrote:

> I am trying to evaluate whether Lucene is the right candidate for the
> problem at hand.
> Say I have 3 indexes:
> Index 1 has street names.
> Index 2 has business names.
> Index 3 has area names.
> All these names can be single words or a combination of words like woodward
> street or marks and spencers street etc etc.
> Now the user enters a query saying "mc donalds woodward street kingston
> precinct".
> I have to parse this query and come up with the best match possible. The
> problem is, in the query I do not know which part is the business name or
> area name or street name. Also the user may give the query in any order for
> example he may give it as "kingston precinct mc donalds woodward street".
> There might be spelling mistakes in the query entered by the user. Also he
> might use road for street or lane for street and such things. I know that
> Lucene is the right candidate for the synonym and spelling mistakes part
> but am a bit hazy regarding the user query parsing part, as to which index
> to search for what. Any help is greatly appreciated.
> Thanks,
> Srini.
