commons-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [commons-text] kinow commented on a change in pull request #111: TEXT-157: Remove rounding from JaccardSimilarity and Distance
Date Fri, 08 Mar 2019 12:19:49 GMT
kinow commented on a change in pull request #111: TEXT-157: Remove rounding from JaccardSimilarity and Distance
URL: https://github.com/apache/commons-text/pull/111#discussion_r263759006
 
 

 ##########
 File path: src/test/java/org/apache/commons/text/similarity/JaccardDistanceTest.java
 ##########
 @@ -36,21 +36,23 @@ public static void setUp() {
 
     @Test
     public void testGettingJaccardDistance() {
-        assertEquals(1.00d, classBeingTested.apply("", ""), 0.00000000000000000001d);
-        assertEquals(1.00d, classBeingTested.apply("left", ""), 0.00000000000000000001d);
-        assertEquals(1.00d, classBeingTested.apply("", "right"), 0.00000000000000000001d);
-        assertEquals(0.25d, classBeingTested.apply("frog", "fog"), 0.00000000000000000001d);
-        assertEquals(1.00d, classBeingTested.apply("fly", "ant"), 0.00000000000000000001d);
-        assertEquals(0.78d, classBeingTested.apply("elephant", "hippo"), 0.00000000000000000001d);
-        assertEquals(0.36d, classBeingTested.apply("ABC Corporation", "ABC Corp"), 0.00000000000000000001d);
-        assertEquals(0.24d, classBeingTested.apply("D N H Enterprises Inc", "D & H Enterprises, Inc."),
-                0.00000000000000000001d);
-        assertEquals(0.11d, classBeingTested.apply("My Gym Children's Fitness Center", "My Gym. Childrens Fitness"),
-                0.00000000000000000001d);
-        assertEquals(0.10d, classBeingTested.apply("PENNSYLVANIA", "PENNCISYLVNIA"), 0.00000000000000000001d);
-        assertEquals(0.87d, classBeingTested.apply("left", "right"), 0.00000000000000000001d);
-        assertEquals(0.87d, classBeingTested.apply("leettteft", "ritttght"), 0.00000000000000000001d);
-        assertEquals(0.0d, classBeingTested.apply("the same string", "the same string"), 0.00000000000000000001d);
+        // Results generated using the python distance library using:
+        // distance.jaccard(seq1, seq2)
+        assertEquals(1.0, classBeingTested.apply("", ""));
+        assertEquals(1.0, classBeingTested.apply("left", ""));
+        assertEquals(1.0, classBeingTested.apply("", "right"));
+        assertEquals(0.25, classBeingTested.apply("frog", "fog"));
+        assertEquals(1.0, classBeingTested.apply("fly", "ant"));
+        assertEquals(0.7777777777777778, classBeingTested.apply("elephant", "hippo"));
 
 Review comment:
   >How about a comment in the test explaining where each value comes from, or even the actual computation
   
   I will take your word here :-) Let's leave it like this for now then. If it eventually fails in a JVM, we can either add that epsilon where appropriate, or the comment, or the explicit calculation (I liked this last one, it never occurred to me to test that way!).
   
   :+1: from me. And Travis is happy too. Up to you to merge it now or wait for others to review :)
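
   For reference, the "explicit calculation" option mentioned above could look roughly like the sketch below. It is only an illustration: `JaccardDistanceSketch`, `charSet` and `distance` are hypothetical names, not the commons-text implementation, and it assumes the distance is computed over the sets of unique characters (which agrees with the values in the diff, e.g. "elephant" and "hippo" share 2 of 9 distinct characters). The small epsilon is likewise an assumption, included only to show the fallback if an exact comparison ever fails on some JVM.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch, not the commons-text JaccardDistance class. */
final class JaccardDistanceSketch {

    /** Collects the distinct characters of a sequence. */
    static Set<Character> charSet(CharSequence s) {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < s.length(); i++) {
            set.add(s.charAt(i));
        }
        return set;
    }

    /** Jaccard distance over character sets: 1 - |A ∩ B| / |A ∪ B|. */
    static double distance(CharSequence left, CharSequence right) {
        Set<Character> a = charSet(left);
        Set<Character> b = charSet(right);
        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            // Assumption: mirrors the expected value for ("", "") in the diff.
            return 1.0;
        }
        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return 1.0 - (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // elephant -> {e,l,p,h,a,n,t}, hippo -> {h,i,p,o}:
        // 2 shared characters out of 9 distinct ones in total.
        double expected = 1.0 - 2.0 / 9.0;
        double actual = distance("elephant", "hippo");
        double epsilon = 1e-12; // illustrative tolerance, not taken from the PR
        if (Math.abs(expected - actual) > epsilon) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
        System.out.println("elephant/hippo distance: " + actual);
    }
}
```

   Writing the expected value as `1.0 - 2.0 / 9.0` keeps the origin of the number visible in the test itself, so a reader does not need the Python library to check it.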

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
