Hi there

Lucene calculates the string similarity between two strings s1 and s2 according to the formula:

Similarity = 1 - Levenshtein-Distance(s1,s2)/min(Length(s1),Length(s2))

I would have thought Lucene would divide by the length of the longer string. In particular, the above formula could, in my understanding, lead to a negative similarity, since the Levenshtein distance can be as large as the length of the longer string.
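
To make the concern concrete, here is a minimal Python sketch. It assumes the similarity is one minus the ratio of edit distance to string length, which is what allows a negative value. The function names are hypothetical and this is not Lucene's own code; both strings are assumed to be non-empty.

```python
def levenshtein(s1: str, s2: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        curr = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity_min(s1: str, s2: str) -> float:
    # Assumed form: normalise by the SHORTER string (can go below zero).
    return 1.0 - levenshtein(s1, s2) / min(len(s1), len(s2))


def similarity_max(s1: str, s2: str) -> float:
    # Alternative: normalise by the LONGER string (stays within [0, 1]).
    return 1.0 - levenshtein(s1, s2) / max(len(s1), len(s2))


if __name__ == "__main__":
    a, b = "ab", "wxyz"
    print(levenshtein(a, b))     # 4
    print(similarity_min(a, b))  # -1.0
    print(similarity_max(a, b))  # 0.0
```

For "ab" and "wxyz" the distance is 4 (two substitutions plus two insertions), so dividing by the shorter length gives 1 - 4/2 = -1, while dividing by the longer length gives 1 - 4/4 = 0.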

Why does Lucene calculate the similarity in this way?

Cheers,

Damian
