lucene-java-user mailing list archives

From "Ian Holsman (Lists)" <li...@holsman.net>
Subject Re: Postcode/zipcode search
Date Sat, 24 May 2008 09:15:38 GMT
Have you had a look at WOEIDs?
https://developer.yahoo.com/geo/


http://where.yahooapis.com/v1/places.q('NW10%207NY')
gives you details about the postcode, as well as a lat/long bounding box
and its 'real' name (Willesden, in this case).

http://where.yahooapis.com/v1/place/26556102/neighbors

gives you its neighbors,
http://where.yahooapis.com/v1/place/26556102/siblings
gives you its siblings,
and
http://where.yahooapis.com/v1/place/26556102/parent?select=long
gives you one level up (NW2 4, apparently).


So I'm guessing you could use two calls: one to get the WOEID of what the
user has entered, and a second to get the siblings. Using that, you can
construct a query to get all the entries in NW10 7NY.
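
A minimal sketch of the two-call idea above, in Java. It only builds the two request URLs shown in this message and a query string over the `postcode` field from the original question; it sends no request and parses no response. The class and method names are hypothetical, and it assumes the sibling names come back as postcodes that can be normalised the same way the index normalises them (spaces stripped).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: builds URLs and a query string only, no network access.
class WoeidLookupSketch {
    static final String BASE = "http://where.yahooapis.com/v1";

    // Call 1: look up the place (and so its WOEID) for the text the user entered.
    static String placeLookupUrl(String postcode) {
        // URLEncoder emits "+" for a space; the URL in the message uses "%20".
        String encoded = URLEncoder.encode(postcode.trim(), StandardCharsets.UTF_8)
                .replace("+", "%20");
        return BASE + "/places.q('" + encoded + "')";
    }

    // Call 2: ask for the siblings of the WOEID returned by call 1.
    static String siblingsUrl(long woeid) {
        return BASE + "/place/" + woeid + "/siblings";
    }

    // Assumption: the index stores postcodes upper-cased with spaces stripped.
    static String postcodeQuery(List<String> postcodes) {
        List<String> terms = new ArrayList<>();
        for (String p : postcodes) {
            terms.add(p.replaceAll("\\s+", "").toUpperCase(Locale.ROOT));
        }
        return "postcode:(" + String.join(" OR ", terms) + ")";
    }
}
```

For example, `WoeidLookupSketch.placeLookupUrl("NW10 7NY")` produces the first URL quoted above, and `WoeidLookupSketch.siblingsUrl(26556102L)` produces the siblings URL.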


(Note: I don't work for Yahoo, but I work with people who used to.)

mark harwood wrote:
> Can you not convert all postcodes to coordinates and do actual distance-based matching?
> 
> You will have to pay Royal Mail or 3rd party suppliers to get hold of the PAF data required
> for this geocoding (despite having funded this already as a UK taxpayer - grrrr)
> 
> Cheers
> Mark
> 
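
As a rough illustration of the distance-based matching Mark suggests: once each postcode has a latitude and longitude, the great-circle distance between two points can be computed with the haversine formula. This is a sketch only; the class and method names are hypothetical, the Earth is treated as a sphere of radius 6371 km, and obtaining the coordinates (the PAF geocoding step) is not covered.

```java
// Hypothetical helper: haversine great-circle distance on a spherical Earth.
class HaversineSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    // Inputs are in decimal degrees; the result is in kilometres.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * sinLon * sinLon;
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // True when the second point lies within radiusKm of the first.
    static boolean withinKm(double lat1, double lon1, double lat2, double lon2,
                            double radiusKm) {
        return distanceKm(lat1, lon1, lat2, lon2) <= radiusKm;
    }
}
```

A search would then keep only the records for which `withinKm` holds for the searched postcode's coordinates and a chosen radius.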
> ----- Original Message ----
> From: Chris Mannion <chris.mannion@nonstopgov.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, 6 May, 2008 5:28:25 PM
> Subject: Postcode/zipcode search
> 
> Hi all
> 
> I've got a bit of a niggling problem with how one of my searches is working
> as opposed to how my users would like it to work.  We're indexing on UK
> postcodes, which are in the format of a 3 or 4 character area code followed
> by a 3 or 4 character street specific code, e.g. "NW10 7NY" or "M11 1LQ".
> We originally had the values being indexed as tokenized and used a very
> simple search string in the format "postcode:xxx xxx", with no grouping or
> boosting or fuzzy searching, just a straight search on whatever the user
> entered.  This had the benefit of finding exact matches to searches and
> allowing us to search just on the area part of the code to return all
> records with that area code, e.g. a search on "NW2" returning anything
> starting NW2, like "NW2 6TB", "NW2 1ER" etc.
> 
> However, the downside to that was that searches could also return records
> only tenuously related to what was searched for, e.g. a search for "NW10 7NY"
> would also return a record with a postcode "SE9 6NY" because of the slight
> match of the "NY".  Obviously this was technically correct but users
> complained because their searches were returning records from completely
> different areas.  Our first step to put this right was to take off the
> tokenization of the field, which we also weren't happy with so have
> continued to fiddle.
> 
> The current status is as follows - we index the values by stripping out
> spaces and tokenizing them and use a KeywordAnalyzer.  In searching we also
> strip spaces from the search term entered and search with a
> KeywordAnalyzer.  Searches for full postcodes, e.g. "NW10 7NY" find all
> exact matches but also any full values that are partial matches (e.g. some
> records just have "NW10" as their postcode field and the "NW10 7NY" search
> pulls them back too), but searches for partial postcodes e.g. "NW10" still
> only find exact matches, e.g. they only pull back those records that have
> just "NW10" as their postcode, rather than anything *starting* with NW10 as
> we'd like it to do.
> 
> Can anyone help me get this working in the way we need it to, please?
> 

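One way to sketch the matching behaviour described in the question, in plain Java with no Lucene calls: normalise the postcode, split off the area (outward) part, and match a partial search against the area part only. The names are hypothetical, and the split assumes the street-specific part of a full postcode is always a digit followed by two letters. In index terms this would correspond to storing the area part in its own untokenized field next to the full postcode. A plain prefix match on the space-stripped value is a different technique and would also let "NW2" match an area such as NW26.

```java
import java.util.Locale;

// Hypothetical helper illustrating area-code versus full-postcode matching.
class PostcodeMatchSketch {
    // Upper-case and strip all whitespace, as the question describes.
    static String normalize(String raw) {
        return raw.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
    }

    // Assumption: a full postcode ends in digit + two letters (e.g. "7NY").
    static boolean isFull(String normalized) {
        return normalized.matches("[A-Z0-9]{2,4}[0-9][A-Z]{2}");
    }

    // Area part: everything before the last three characters of a full code.
    static String outward(String normalized) {
        return isFull(normalized)
                ? normalized.substring(0, normalized.length() - 3)
                : normalized;
    }

    // Full search term: exact match. Partial term: match on the area part.
    static boolean matches(String query, String stored) {
        String q = normalize(query);
        String s = normalize(stored);
        if (isFull(q)) {
            return q.equals(s);
        }
        return q.equals(outward(s));
    }
}
```

With this sketch, `matches("NW2", "NW2 6TB")` is true and `matches("NW10 7NY", "SE9 6NY")` is false. A full search such as "NW10 7NY" does not return records that hold only "NW10"; whether those should come back is a separate decision.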


