commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Oswaldo Olivo (JIRA)" <>
Subject [jira] [Created] (COLLECTIONS-534) Performance bug in CollectionBag::retainAll
Date Wed, 15 Oct 2014 21:01:34 GMT
Oswaldo Olivo created COLLECTIONS-534:

             Summary: Performance bug in CollectionBag::retainAll
                 Key: COLLECTIONS-534
             Project: Commons Collections
          Issue Type: Bug
          Components: Bag, Collection
    Affects Versions: 4.0
         Environment: Ubuntu 12.04
            Reporter: Oswaldo Olivo

There seems to be a performance bug in the method retainAll in the CollectionBag class.
The problem is that the code is checking containment over the parameter collection, which
could be expensive for some types of collections like ArrayLists.
One solution could be to convert the Collection into a HashSet and check containment in the
 If this is done, then running retainAll on a 1,000,000 collection would take less than 2
seconds instead of 27 mins, according to my experiments.

Here's a function to expose the bug:

 public static void collectionBagRetainAllTest() {

	ArrayList<Integer> coll=new ArrayList<Integer>();
	for(int i=0;i<=1000000;++i)
	    coll.add(new Integer(i));

	TreeBag<Integer> treeBag=new TreeBag<Integer>(coll);

	CollectionBag<Integer> bag = new CollectionBag<Integer>(treeBag);


Here's a proposed patch:

  public boolean retainAll(final Collection<?> coll) {
        if (coll != null) {
            boolean modified = false;
            final Iterator<E> e = iterator();
	    HashSet<Object> set=new HashSet<Object>(coll);
            while (e.hasNext()) {
                if (!set.contains( {
                    modified = true;
            return modified;
        } else {
            // let the decorated bag handle the case of null argument
            return decorated().retainAll(null);

This message was sent by Atlassian JIRA

View raw message