lucene-java-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Fri, 14 Nov 2003 22:53:58 GMT
Well ... Sure, nothing can replace a human mind. But believe it or not, 
there are studies which show that even human experts can significantly 
differ in their opinions on what are key-phrases for a given text. So, 
the results are never clear cut with humans either...

So, in this sense a heuristic tool for sentence splitting and key-phrase 
detection can go a long way. For example, the application I mentioned 
uses quite a few heuristic rules (+ Markov chains as heavier 
ammunition :-), and it comes up with the following phrases for your 
email discussion (the text quoted below):

(lang=EN): NLP, trainable rule-based tagging, natural language 
processing, apache, NLP expert

Now, this set of key-phrases does reflect the main noun-phrases in the 
text... which means I have a practical and tangible benefit from NLP. 
QED ;-)
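
To make the idea of rule-based noun-phrase extraction concrete, here is a minimal sketch in Python. It is not the application mentioned above, nor Brill's tagger: the tiny lexicon, the tag names and the function names are all hypothetical, and a real system would get its part-of-speech tags from a trained tagger. The only rule shown is "a run of adjectives and nouns that ends in a noun is a candidate phrase".

```python
import re

# Hypothetical toy lexicon, for illustration only. A real system would
# obtain these tags from a trained part-of-speech tagger.
TOY_LEXICON = {
    "cardiovascular": "ADJ",
    "magnetic": "ADJ",
    "disease": "NOUN",
    "resonance": "NOUN",
    "imaging": "NOUN",
    "patients": "NOUN",
    "with": "PREP",
    "often": "ADV",
    "need": "VERB",
}


def tag(text):
    """Lowercase and tokenize the text, then look each token up in the toy lexicon."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [(token, TOY_LEXICON.get(token, "OTHER")) for token in tokens]


def noun_phrases(tagged, min_words=2):
    """Collect runs of ADJ/NOUN tokens that end in a NOUN."""
    phrases = []
    current = []

    def flush():
        # Drop trailing adjectives so the phrase ends on a noun.
        while current and current[-1][1] != "NOUN":
            current.pop()
        if len(current) >= min_words:
            phrases.append(" ".join(word for word, _ in current))
        current.clear()

    for word, pos in tagged:
        if pos in ("ADJ", "NOUN"):
            current.append((word, pos))
        else:
            flush()  # any other tag ends the current run
    flush()  # handle a run that reaches the end of the text
    return phrases


if __name__ == "__main__":
    sentence = ("Patients with cardiovascular disease often need "
                "magnetic resonance imaging.")
    print(noun_phrases(tag(sentence)))
```

The hard part in practice is the tagging step, which this sketch sidesteps with a lookup table; the chunking rule itself stays about this simple.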

Best regards,
Andrzej

petite_abeille wrote:
> 
> On Nov 14, 2003, at 20:29, Philippe Laflamme wrote:
> 
>>> Rules of linguistics? Is there such a thing? :)
>>
>>
>> Actually, yes there is. Natural Language Processing (NLP) is a very broad
>> research subject but a lot has come out of it.
> 
> 
> A lot of what? "If" statements? :)
> 
>> More specifically, rule-based taggers have become very popular since Eric
>> Brill published his work on trainable rule-based tagging.
>>
>> Essentially, it comes down to analysing sentences to determine the role
>> (noun, verb, etc.) of each word. It's very helpful to extract noun-phrases
>> such as "cardiovascular disease" or "magnetic resonance imaging" from
>> documents.
> 
> 
> I would agree with that. But it's easier said than done. And the results 
> are never, er, clear cut.
> 
>> So, yep... you can definitely derive rules to analyse natural language...
> 
> 
> Well... beyond the jargon and the impressive math... this all boils down 
> to fuzzy heuristics and judgment calls... but perhaps this is just me :)
> 
>> I'm sure you already know about all of this...
> 
> 
> Not really. I'm more of a dilettante than an "NLP expert".
> 
>> just thought it might be
>> interesting for some...
> 
> 
> Sure. But my take on this is that pigs will fly before NLP turns into a 
> predictable "science" :)
> 
> PA.



