opennlp-users mailing list archives

From "Jim - FooBar();" <>
Subject Re: Post Address Parsing and OpenNLP
Date Fri, 20 Apr 2012 13:29:57 GMT
On 20/04/12 14:16, mauro fraboni wrote:
> I am investigating whether it is possible to use OpenNLP to parse Italian
> postal addresses.
> I do not want to validate the input address against an official address
> database; I just need to divide a single address string into its individual
> component parts, and I thought of using NameFinder.
> My idea was to train NameFinder on some Italian addresses, marking in the
> training data the parts such as Street, Town, Province, Post Code, Country.
> Do you think that can work? Does anyone have experience with this?
> Thanks and ciao.

Hmm, that sounds like it should work. However, you probably don't want to
split your entities into Street, Town, Province, Post Code, Country etc.,
because then how are you going to join them back together to get your 'real'
entity (the address)? I would keep the whole address as one entity and
produce training data that marks the whole thing. Of course, if you already
have some training data, so much the better; otherwise you will spend a bit
of time creating your annotated corpus.
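
To make the two options concrete, here is a minimal sketch of what each
annotation style would look like in the name finder's training format (one
sentence per line, tokens separated by spaces, each entity wrapped in
<START:type> ... <END>). The entity type names, the sample address and the
helper annotate() are hypothetical, picked only for illustration; real data
would also need punctuation split off as separate tokens.

```python
# Sketch only: render one address as an OpenNLP name finder training line.
# The type names (street, town, ...) and the sample address are assumptions
# made for this example; they do not come from the thread.

def annotate(parts, whole=False):
    """Build one training line from (entity_type, text) pairs in reading order.

    whole=False tags every component as its own entity.
    whole=True wraps the complete address in a single 'address' entity.
    """
    if whole:
        text = " ".join(text for _, text in parts)
        return f"<START:address> {text} <END>"
    return " ".join(f"<START:{kind}> {text} <END>" for kind, text in parts)


sample = [
    ("street", "Via Roma 12"),
    ("postcode", "00184"),
    ("town", "Roma"),
    ("province", "RM"),
    ("country", "Italia"),
]

per_component = annotate(sample)              # one entity per address part
single_entity = annotate(sample, whole=True)  # the whole address as one entity

print(per_component)
print(single_entity)
```

With per-component tags the model returns one span per part, so the parts
have to be grouped back into an address afterwards; with the single tag the
address comes back as one span, but it still has to be split into its parts
by some other means.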

My logic says that this is the way to go, but maybe I'm wrong in some way.
Does anyone have a different opinion?


PS: In your first sentence, did you by any chance mean to say "recognise"
instead of "parse"?
