lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thiago Moreira <tmore...@liferay.com>
Subject Re: Similarity percentage between two Strings
Date Tue, 09 Sep 2008 17:33:48 GMT

    For those interested in my solution I took this article as based to
implement the requirements.

    http://www.catalysoft.com/articles/StrikeAMatch.html

    Thanks.


----- Original Message -----
From: ian.lea@gmail.com
Sent: Thu, September 4, 2008 1:20
Subject:Re: Similarity percentage between two Strings


Googling for "java string similarity" throws up some stuff you might
find useful.


--
Ian.


On Wed, Sep 3, 2008 at 11:58 PM, Thiago Moreira <tmoreira@liferay.com> 
wrote:
>
>     Well, the similar definition that I'm looking for is the number 2,
maybe
> the number 3, but to start the number 2 is enough. If you guys think
that is
> not a Lucene problem what else tool can I use to implement this
> requirement??
>
>     Thanks
> ________________________________
> Thiago Moreira
> Software Engineer
> tmoreira@liferay.com
> Liferay, Inc.
> Enterprise. Open Source. For Life.
>
>
> N. Hira wrote:
>
> I don't know how much of this is a Lucene problem, but -- as I'm sure you
> will inevitably hear from others on the list -- it depends on what your
> definition of "similar" is.
>
> By similar, do you mean:
> 1.  Identical, except for variations in case (upper/lower)
> 2.  Allow 1., but also allow prefixes/suffixes (e.g., "FW:  " or "...
> (summary")
> 3.  Allow 1., 2. and permit some new terms ... how many?
> 4.  Allow all of the above and allow some changes to terms using stemming
> (E.g., "Google releases Chrome" is similar to "Google announces the release
> of its new Chrome web browser")
> ....
>
> I'm sure you see where this is going.  So ... how do you define similar?
>
> Good luck!
>
> -h
> ----------------------------------------------------------------------
> Hira, N.R.
> Cognocys, Inc.
>
> On 03-Sep-2008, at 2:52 PM, Thiago Moreira wrote:
>
>
>     Hey all,
>
>     I want to know how much two Strings are similar! The thing is: I'm
> processing an email box and I want to group all messages that have the
> subject similar, makes sense?? I looked on the documentation but I didn't
> find how to accomplish this. It's not necessary add the messages or the
> subjects on some kind of index. I'm using 2.3.2 version of Lucene.
>
>     Anyone has some idea?
>
>     Thanks in advance.
> --
> Thiago Moreira
> Software Engineer
> tmoreira@liferay.com
> Liferay, Inc.
> Enterprise. Open Source. For Life.



----- End of original message -----


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message