commons-issues mailing list archives

From "Jason Lee (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (TEXT-129) incorrect result from JaroWinklerDistance (incorrect calculation)
Date Tue, 31 Jul 2018 05:25:00 GMT

     [ https://issues.apache.org/jira/browse/TEXT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lee updated TEXT-129:
---------------------------
    Description: 
JaroWinklerDistance returns a similarity of 0 between "_trump_" and "_donald trump_".

Scala code here:

scala> val jw = new JaroWinklerDistance

scala> jw("trump","donald trump")  // INCORRECT
 res1: Double = 0.0

scala> jw("ivanka trump","donald trump")  // correct
 res2: Double = 0.736111111111111

scala> jw(" trump","trump") // correct result; there's a leading space in first string
 res13: Double = 0.9444444444444445

scala> jw("a trump","trump")  // correct
 res14: Double = 0.9047619047619048

scala> jw("aa trump","trump")  // correct
 res15: Double = 0.875

scala> jw("aaa trump","trump")  // INCORRECT
 res16: Double = 0.0

scala> jw("hillary cliton","clinton")  // correct
 res8: Double = 0.30952380952380953

scala> jw("donald trump","trump")  // INCORRECT
 res9: Double = 0.0

  was:
JaroWinklerDistance returns a similarity of 0 between "_trump_" and "_donald trump_".

Scala example here:

scala> val jw = new JaroWinklerDistance

scala> jw("trump","donald trump")  // INCORRECT
 res1: Double = 0.0

scala> jw("ivanka trump","donald trump")  // correct
 res2: Double = 0.736111111111111

scala> jw(" trump","trump") // correct result; there's a leading space in first string
 res13: Double = 0.9444444444444445

scala> jw("a trump","trump")  // correct
 res14: Double = 0.9047619047619048

scala> jw("aa trump","trump")  // correct
 res15: Double = 0.875

scala> jw("aaa trump","trump")  // INCORRECT
 res16: Double = 0.0

scala> jw("hillary cliton","clinton")  // correct
 res8: Double = 0.30952380952380953

scala> jw("donald trump","trump")  // INCORRECT
 res9: Double = 0.0


> incorrect result from JaroWinklerDistance (incorrect calculation)
> -----------------------------------------------------------------
>
>                 Key: TEXT-129
>                 URL: https://issues.apache.org/jira/browse/TEXT-129
>             Project: Commons Text
>          Issue Type: Bug
>    Affects Versions: 1.4
>         Environment: commons-lang3:3.7; HotSpot for Linux, 1.8;
>            Reporter: Jason Lee
>            Priority: Major
>         Attachments: Screenshot from 2018-07-31 13-04-24.png
>
>
> JaroWinklerDistance returns a similarity of 0 between "_trump_" and "_donald trump_".
> Scala code here:
> scala> val jw = new JaroWinklerDistance
> scala> jw("trump","donald trump")  // INCORRECT
>  res1: Double = 0.0
> scala> jw("ivanka trump","donald trump")  // correct
>  res2: Double = 0.736111111111111
> scala> jw(" trump","trump") // correct result; there's a leading space in first string
>  res13: Double = 0.9444444444444445
> scala> jw("a trump","trump")  // correct
>  res14: Double = 0.9047619047619048
> scala> jw("aa trump","trump")  // correct
>  res15: Double = 0.875
> scala> jw("aaa trump","trump")  // INCORRECT
>  res16: Double = 0.0
> scala> jw("hillary cliton","clinton")  // correct
>  res8: Double = 0.30952380952380953
> scala> jw("donald trump","trump")  // INCORRECT
>  res9: Double = 0.0
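
For context on the values reported above, here is a minimal sketch of the textbook Jaro similarity, assuming the commonly used matching window of max(len1, len2) / 2 - 1 positions. It is not the Commons Text source; the object JaroSketch and its methods are hypothetical names used only for illustration. Under that assumption the window for "trump" vs "donald trump" is 12 / 2 - 1 = 5, while each letter of "trump" sits 7 positions away from its counterpart, so no characters pair up and the score is 0.0. The Winkler prefix bonus is left out because none of the reported pairs share a first character.

```scala
// Sketch of the textbook Jaro similarity. NOT the Commons Text implementation;
// JaroSketch, window and jaro are hypothetical names.
object JaroSketch {

  // Two equal characters count as a match only if their positions
  // differ by at most this many places (assumed window definition).
  def window(a: String, b: String): Int =
    math.max(math.max(a.length, b.length) / 2 - 1, 0)

  def jaro(a: String, b: String): Double = {
    val (shorter, longer) = if (a.length <= b.length) (a, b) else (b, a)
    if (longer.isEmpty) 1.0          // both strings empty
    else if (shorter.isEmpty) 0.0    // exactly one string empty
    else {
      val range = window(a, b)
      val usedShorter = Array.fill(shorter.length)(false)
      val usedLonger = Array.fill(longer.length)(false)

      // Greedily pair each character of the shorter string with the first
      // unused equal character inside the window of the longer string.
      for (i <- shorter.indices) {
        val lo = math.max(i - range, 0)
        val hi = math.min(i + range + 1, longer.length)
        (lo until hi)
          .find(k => !usedLonger(k) && longer(k) == shorter(i))
          .foreach { k => usedLonger(k) = true; usedShorter(i) = true }
      }

      // Matched characters, each in the order of its own string.
      val ms = shorter.indices.filter(i => usedShorter(i)).map(i => shorter(i))
      val ml = longer.indices.filter(k => usedLonger(k)).map(k => longer(k))
      val m = ms.length
      if (m == 0) 0.0
      else {
        // Transpositions: half the number of positions where the orders disagree.
        val t = ms.zip(ml).count { case (x, y) => x != y } / 2
        (m.toDouble / shorter.length + m.toDouble / longer.length +
          (m - t).toDouble / m) / 3.0
      }
    }
  }
}
```

With this sketch, JaroSketch.jaro("aa trump", "trump") has a window of 3 and finds all five letters, while JaroSketch.jaro("aaa trump", "trump") also has a window of 3 but the letters are 4 positions apart, so nothing matches. Whether the library should follow this definition for such inputs is a separate question that the sketch does not settle.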



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
