commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adrian Nistor (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (COLLECTIONS-408) performance problem in SetUniqueList.removeAll()
Date Sat, 02 Jun 2012 06:57:23 GMT

     [ https://issues.apache.org/jira/browse/COLLECTIONS-408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrian Nistor updated COLLECTIONS-408:
--------------------------------------

    Attachment: Test.java
                patch.diff
    
> performance problem in SetUniqueList.removeAll()
> ------------------------------------------------
>
>                 Key: COLLECTIONS-408
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-408
>             Project: Commons Collections
>          Issue Type: Bug
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in SetUniqueList.removeAll().
> It appears in version 3.2.1 and also in revision 1344775 (31 May
> 2012).  I have attached a test that exposes this problem and a
> one-line patch that fixes it.  The patch makes the code two times
> faster for this test.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is: 5027
> The output for the patched version is:
> Time is: 2554
> The one-line patch I attached changes the 
> SetUniqueList.removeAll(Collection<?> coll) code from:
> boolean result = super.removeAll(coll);
> set.removeAll(coll);
> return result;
> to:
> boolean result = super.removeAll(coll);
> if (result) set.removeAll(coll);
> return result;
> If "super.removeAll(coll)" did not change the collection, there is no
> need to call "set.removeAll(coll)", because we already know there is
> nothing to remove.
> As one may expect "set.removeAll(coll)" (on a set) to be faster than
> "super.removeAll(coll)" (on a list), one may have expected the speedup
> gained by avoiding "set.removeAll(coll)" to be smaller than 2X
> achieved for the attached test.  However, the speedup is 2X because
> "java.util.HashSet.removeAll(Collection<?> collection)" has quadratic
> (not linear) complexity if "this.size() <= collection.size()" and the
> "collection" is a list.  Thus, "set.removeAll(coll)" is about as slow
> as "super.removeAll(coll)" in this case, and not executing
> "set.removeAll(coll)" reduces the work done by half.  The quadratic
> behavior of "java.util.HashSet.removeAll(Collection<?> collection)"
> comes from "java.util.AbstractSet.removeAll(Collection<?> c)" and is
> discussed for example here:
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2011-July/007148.html
> (The link is for OpenJDK, but Oracle JDK has the same problem.)
> In many other cases "set.removeAll(coll)" is actually faster than
> "super.removeAll(coll)", so one can get even more speedup by
> reordering those two checks:
> boolean result = set.removeAll(coll);
> if (result) super.removeAll(coll);
> return result;
> Is this a bug, or am I misunderstanding the intended behavior?  If so,
> can you please confirm that the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message