commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Neidhart (JIRA)" <>
Subject [jira] [Resolved] (COLLECTIONS-429) Performance problem in MultiValueMap
Date Wed, 17 Apr 2013 19:37:18 GMT


Thomas Neidhart resolved COLLECTIONS-429.

       Resolution: Fixed
    Fix Version/s: 4.0

Changed the method in r1469039 to a version similar in the patch.

The reason I created a new method instead of applying the patch is as follows:

 * the outlined performance gain is based on an extreme and unlikely example, in a typical
use-case the object-creation may even outweigh the iteration.
 * it would be unusual for Collections classes to duplicate it elements by using any of the
methods of the Collections interface. This could have negative side-effects for unaware users,
thus adding the method in CollectionUtils with the described runtime/space trade-off.

Thanks for the patch anyway!
> Performance problem in MultiValueMap
> ------------------------------------
>                 Key: COLLECTIONS-429
>                 URL:
>             Project: Commons Collections
>          Issue Type: Improvement
>    Affects Versions: 3.2.1
>         Environment: java 1.6.0_24
> Ubuntu 11.10
>            Reporter: Adrian Nistor
>             Fix For: 4.0
>         Attachments: patchFull_AbstractHashedMap.diff, patchFull.diff, patchFull_StaticBucketMap.diff,
patchSmall_AbstractHashedMap.diff, patchSmall.diff, patchSmall_StaticBucketMap.diff,,,,
> Hi,
> I am encountering a performance problem in MultiValueMap.  It appears
> in version 3.2.1 and also in revision 1366088.  I attached a test that
> exposes this problem and a patch that fixes it.  On my machine, for
> this test, the patch provides a 1158X speedup.
> To run the test, just do:
> $ java Test
> The output for the un-patched version is:
> Time is 44040
> The output for the patched version is:
> Time is 38
> The attached test shows that, for a "MultiValueMap multi" object, the
> following operation is very slow:
> multi.values().containsAll(toContain);
> "MultiValueMap.values()" returns a "MultiValueMap.Values" object,
> which inherits containsAll() from "java.util.AbstractCollection",
> which has quadratic complexity.
> I attached two patches.  Both patches override containsAll() and
> implement a linear time algorithm.  patchSmall.diff populates a
> HashSet eagerly, and patchFull.diff populates a HashSet lazily.
> patchFull.diff is faster than patchSmall.diff when containsAll()
> returns false after inspecting only a few elements, though in most
> cases they are equally fast.  I am including patchSmall.diff just in
> case you prefer smaller code.
> Note that this problem is different from COLLECTIONS-416.  As
> established in the COLLECTIONS-416 discussion, there the user was
> responsible for using the proper data structures as argument for the
> method.
> For "MultiValueMap.values()", the problem is not related to the
> collection passed as parameter.  The problem will always manifest for
> this method, irrespective of the parameter type.  I attached a test
> ( that shows that, even with a HashSet
> parameter, the current problem still manifests (which did not happen
> for COLLECTIONS-416).
> This problem also exists for the two other "Values" classes
> (AbstractHashedMap.Values, StaticBucketMap.Values).  I attached tests
> and patches for these classes as well.  If you want me to file
> separate reports, just let me know.
> Is this truly a performance problem, or am I misunderstanding the
> intended behavior?  If so, can you please confirm that the patches are
> correct?
> Thanks,
> Adrian

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message