lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cool The Breezer <techcool.ku...@yahoo.com>
Subject Similarity
Date Tue, 23 Jun 2009 09:27:39 GMT

Of the late I started using Lucene as main search library for all documents in our intranet.
It works extremely well. I am trying to use similarity kinda functionality to find similarity
between two sentences/documents and trying to use Wordnet in our searching solution. I have
used wordnet contrib package and it really works well to expand queries with synonyms and
get results. But I can get handicap when searching for documents with query like "Steve Jobs"
and documents containing "apple" should be returned. In the same way "pirated" and "willfull
downloading copyrighted material". This comes finding meaning of a word wrt its context. Has
anybody done any kind of such context based indexing that means while tokenization based on
context of each word/token and searching the same after expanding the query using synonyms.
I have come across some sf projects like http://wn-similarity.sourceforge.net/  to semantically
relating words using wordnet but I am
 still kinda confused on how to move ahead with such kind of context based search. Appreciate
your help. I understand that this might not be directly related to Lucene but somehow this
falls in the same domain search solution. 

- RB


      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message