cocoon-dev mailing list archives

From: Stefano Mazzocchi
Subject: Re: [OT] Determining the similarity between a pair of texts
Date: Wed, 15 Jun 2005 16:27:29 GMT
Peter Hunsberger wrote:
> On 6/15/05, Ugo Cei wrote:
>>On 15 Jun 2005, at 16:32, Tony Collen wrote:
> <snip/>
>>Actually, what I am trying to come up with is an algorithm for determining
>>whether two texts are (more or less) about similar subjects.
> Eee, then you may have to jump into the NLP stuff and Ontology
> processing in general; non-trivial stuff...

I've been working on this for the past few months. There is no clear-cut
solution, but latent semantic indexing (LSI) is probably the best approach
for the above (even though there is plenty of literature on variations of it).
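
To make the LSI idea concrete, here is a minimal sketch in Python using only
the standard library. It builds a term-document count matrix, keeps the top k
latent dimensions, and compares documents by cosine similarity in that reduced
space. Everything in it is an assumption made for illustration: the function
names, the toy corpus, raw counts instead of tf-idf weighting, and power
iteration on the document Gram matrix as a stand-in for a proper SVD routine.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(texts):
    """Return (vocab, matrix) where matrix[t][d] counts term t in text d."""
    counts = [Counter(tokenize(t)) for t in texts]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[term] for c in counts] for term in vocab]


def top_eigenpairs(gram, k, iterations=500, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(gram)
    work = [row[:] for row in gram]
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        vec = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iterations):
            nxt = [sum(w * v for w, v in zip(row, vec)) for row in work]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # what is left of the matrix is numerically zero
                return pairs
            vec = [x / norm for x in nxt]
        # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
        value = sum(
            vec[i] * sum(w * v for w, v in zip(work[i], vec)) for i in range(n)
        )
        pairs.append((value, vec))
        # Deflate: remove the component just found before the next round.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def lsi_vectors(texts, k=2):
    """Map each text to its coordinates in a k-dimensional latent space."""
    _, matrix = term_document_matrix(texts)
    n = len(texts)
    # Gram matrix A^T A: its eigenvalues are the squared singular values of A,
    # and its eigenvectors are the right singular vectors.
    gram = [
        [sum(row[i] * row[j] for row in matrix) for j in range(n)]
        for i in range(n)
    ]
    pairs = top_eigenpairs(gram, k)
    return [
        [math.sqrt(max(value, 0.0)) * vec[d] for value, vec in pairs]
        for d in range(n)
    ]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either one is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical toy corpus: two texts about cats, two about stock markets.
texts = [
    "the cat sat on the mat with another cat",
    "a kitten and a cat played on the mat",
    "stock markets fell as investors sold shares",
    "investors bought shares when stock markets rose",
]
vectors = lsi_vectors(texts, k=2)
print("cat vs cat:    ", round(cosine(vectors[0], vectors[1]), 3))
print("cat vs markets:", round(cosine(vectors[0], vectors[2]), 3))
```

On this toy corpus the two cat texts should score close to 1 and a cat text
against a markets text close to 0. With a real collection you would want many
more documents, some term weighting, and a library SVD rather than this
hand-rolled iteration.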

As for string distance, you might want to check out

