lucy-user mailing list archives

From Aleksandar Radovanovic <Aleksan...@Radovanovic.com>
Subject Re: [lucy-user] Dictionary based NER with Lucy
Date Fri, 12 Oct 2012 13:27:14 GMT
Thank you Nick. Could you possibly give me some more specific guidelines?

At the moment, all indexed words are "flat", with no semantics, which is
great for general purposes. However, if one focuses on, let's say,
biomedical literature, one would like to distinguish which words
represent gene names, drug names, etc. A user would be able to compose
a search like "[drug_dictionary_ID] AND headache" to get documents
that mention any drug name together with headache. Also, one could group
documents by dictionary, e.g. documents related to genetics (high
frequency of gene/protein names), to diseases (mostly disease names),
etc.
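
A minimal sketch of the grouping idea, in plain Python rather than Lucy.
The dictionary contents, the gene_dictionary_ID name and both function
names are hypothetical, and real dictionaries would come from curated
term lists. It counts dictionary hits per document and picks the
dictionary with the most hits:

```python
from collections import Counter

# Hypothetical toy dictionaries: dictionary ID -> set of lowercase terms.
DICTIONARIES = {
    "drug_dictionary_ID": {"aspirin", "ibuprofen"},
    "gene_dictionary_ID": {"brca1", "tp53"},
}


def dictionary_counts(text):
    """Count how many tokens of the text fall into each dictionary."""
    counts = Counter()
    for token in text.lower().split():
        word = token.strip(".,;:()")  # crude punctuation stripping
        for dict_id, terms in DICTIONARIES.items():
            if word in terms:
                counts[dict_id] += 1
    return counts


def dominant_dictionary(text):
    """Return the dictionary ID with the most hits, or None if there are none."""
    counts = dictionary_counts(text)
    return counts.most_common(1)[0][0] if counts else None
```

Documents could then be grouped by the value dominant_dictionary()
returns. Matching here is by single whitespace-separated token only, so
multi-word names would need a different lookup.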

This could open up possibilities for applying machine learning, pattern
analysis or automatic hypothesis generation using not only words but
their semantics as well. All without using unreliable "natural language
processing" algorithms.

Any ideas?

Alex

On 10/12/12 3:01 PM, Nick Wellnhofer wrote:
>
> If I understand your problem description correctly, you could simply
> create another full-text field containing the dictionary IDs related
> to a document separated by whitespace. Then you can search only the
> dictionary field.
>
> Nick
>
>
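
Nick's suggestion can be sketched with a toy inverted index in plain
Python. This is not Lucy's API: ToyIndex, dictionary_field, the field
names "content" and "dictionary", and the drug list are all assumptions
made for illustration. Each document gets a second field holding the
whitespace-separated IDs of the dictionaries it matches, and an AND
query intersects the postings of both fields:

```python
from collections import defaultdict

# Hypothetical drug dictionary; a real one would be a curated term list.
DRUGS = {"aspirin", "ibuprofen"}


def dictionary_field(text):
    """Build the whitespace-separated dictionary-ID field for a document."""
    ids = set()
    for token in text.lower().split():
        if token.strip(".,;:()") in DRUGS:
            ids.add("drug_dictionary_ID")
    return " ".join(sorted(ids))


class ToyIndex:
    """Tiny inverted index keyed by (field, term); illustration only."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc IDs

    def add_doc(self, doc_id, fields):
        for field, value in fields.items():
            for token in value.lower().split():
                self.postings[(field, token.strip(".,;:()"))].add(doc_id)

    def search_and(self, *clauses):
        """Return IDs of documents matching every (field, term) clause."""
        sets = [self.postings.get((f, t.lower()), set()) for f, t in clauses]
        return set.intersection(*sets) if sets else set()


index = ToyIndex()
docs = {
    1: "Aspirin relieved the headache.",
    2: "Headache after poor sleep.",
    3: "Ibuprofen dosage guidelines.",
}
for doc_id, text in docs.items():
    index.add_doc(doc_id, {"content": text, "dictionary": dictionary_field(text)})

# Equivalent of "[drug_dictionary_ID] AND headache"
hits = index.search_and(("dictionary", "drug_dictionary_ID"),
                        ("content", "headache"))
```

In a real index the same effect would presumably come from declaring the
extra field in the schema and restricting one query clause to that
field.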

