commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adrian Nistor (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (COLLECTIONS-426) performance problem in ListOrderedSet.retainAll()
Date Tue, 24 Jul 2012 22:05:34 GMT

     [ https://issues.apache.org/jira/browse/COLLECTIONS-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrian Nistor updated COLLECTIONS-426:
--------------------------------------

    Attachment: Test.java
                patch.diff
    
> performance problem in ListOrderedSet.retainAll()
> -------------------------------------------------
>
>                 Key: COLLECTIONS-426
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-426
>             Project: Commons Collections
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>         Attachments: Test.java, patch.diff
>
>
> Hi,
> I am encountering a performance problem in ListOrderedSet.retainAll().
> It appears in version 3.2.1 and also in revision 1365132.  I attached
> a test that exposes this problem and a patch that fixes it.  On my
> machine, for this test, the patch provides a 263X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 3682
> The output for the patched version is:
> Time is 14
> The problem is that the code for method
> "ListOrderedSet.retainAll(Collection<?> coll)" is:
> {code:java|borderStyle=solid}
> public boolean retainAll(Collection<?> coll) {
>     boolean result = collection.retainAll(coll);
> ....
> }
> {code}
> because "collection.retainAll(coll)" has quadratic complexity if
> parameter "coll" is a List.  Conceptually, the solution is to call
> "coll.retainAll(collection)" which has linear complexity (not
> quadratic), because "collection" is always a HashSet (we know this,
> because "collection" is an inherited field of ListOrderedSet, and thus
> cannot have another type).  See the
> "java.util.AbstractCollection.retainAll()" code inlined below for why
> one call has quadratic complexity, and the other has linear
> complexity.
> Directly calling "coll.retainAll(collection)" would change the
> behavior of ListOrderedSet.retainAll(), which is why the patch seems a
> bit involved.  In reality, the patch simulates the behavior of
> "coll.retainAll(collection)" (which is just the intersection of two
> collections).  You can easily see this by looking at the patch and at
> the "java.util.AbstractCollection.retainAll()" code inlined below.
> "collection.retainAll(coll)" is "java.util.HashSet.retainAll()", which
> is "java.util.AbstractCollection.retainAll()", which has the code:
> {code:java|borderStyle=solid}
> public boolean retainAll(Collection<?> c) {
>     boolean modified = false;
>     Iterator<E> e = iterator();
>     while (e.hasNext()) {
>         if (!c.contains(e.next())) {
>             e.remove();
>             modified = true;
>         }
>     }
>     return modified;
> }
> {code} 
> Is this a bug, or am I misunderstanding the intended behavior?  If so,
> can you please confirm if the patch is correct?
> Thanks,
> Adrian

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message