lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Postcode/zipcode search
Date Tue, 06 May 2008 16:39:49 GMT
Have you looked at PrefixQuery? If that doesn't work for you, could you give
a few
more examples of expected inputs and outputs?

Best
Erick

On Tue, May 6, 2008 at 12:28 PM, Chris Mannion <chris.mannion@nonstopgov.com>
wrote:

> Hi all
>
> I've got a bit of a niggling problem with how one of my searches is
> working
> as opposed to how my users would like it too work.  We're indexing on UK
> postcodes, which are in the format of a 3 or 4 character area code
> followed
> by a 3 or 4 character street specific code, e.g. "NW10 7NY" or "M11 1LQ".
> We originally had the values being indexed as tokenized and used a very
> simple search string in the format "postcode:xxx xxx", with no grouping or
> boosting or fuzzy searching, just an straight search on whatever the user
> answered.  This had the benefit of finding exact matches to searches and
> allowing us to search just on the area part of the code to return all
> records with that area code, eg a search on "NW2" returning anything
> starting NW2, like "NW2 6TB", "NW2 1ER" etc etc.
>
> However, the downside to that was that searches could also return records
> only tenuously related to what was searched for, eg. a search for "NW10
> 7NY"
> would also return a record with a postcode "SE9 6NY" because of the slight
> match of the "NY".  Obviously this was technically correct but users
> complained because their searches were returning records from completely
> different areas.  Our first step to put this right was to take off the
> tokenization of the field, which we also weren't happy with so have
> continued to fiddle.
>
> The current status is as follows - we index the values by stripping out
> spaces and tokeniing them and use a keywordAnalyzer.  In searching we also
> strip spaces from the search term entered and search with a
> keywordAnalyzer.  Searches for full postcodes, e.g. "NW10 7NY" find all
> exact matches but also any full values that are partial matches (e.g. some
> records just have "NW10" as their postcode field and the "NW10 7NY" search
> pulls them back too), but searches for partial postcodes e.g. "NW10" still
> only finds exact matches, e.g. it only pulls back those record that have
> just "NW10" as their postcode, rather than anything *starting* with NW10
> as
> we'd like it to do.
>
> Can anyone help me get this working in the way we need it too please?
>
> --
> Chris Mannion
> iCasework and LocalAlert implementation team
> 0208 144 4416
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message