commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-839) ArrayUtils removeElements methods use unnecessary HashSet
Date Tue, 09 Oct 2012 01:20:03 GMT

    [ https://issues.apache.org/jira/browse/LANG-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472034#comment-13472034 ]

Sebb commented on LANG-839:
---------------------------

BitSet is considerably faster on Win XP:

Ratio=38% count=0 hash=610134 bits=234388
Ratio=46% count=5 hash=1809448 bits=837536
Ratio=54% count=10 hash=2840584 bits=1536229
Ratio=38% count=200 hash=72772936 bits=28216994
Ratio=37% count=50 hash=17909539 bits=6729347
Ratio=39% count=100 hash=35617096 bits=13972166
Ratio=40% count=1000 hash=339097567 bits=138882176
Ratio=42% count=2000 hash=650113632 bits=278152949

and Continuum:

Ratio=10% count=0 hash=1164164 bits=126956
Ratio=15% count=5 hash=1433866 bits=228518
Ratio=18% count=10 hash=1911315 bits=355922
Ratio=17% count=200 hash=31370106 bits=5439748
Ratio=18% count=50 hash=6947508 bits=1271146
Ratio=18% count=100 hash=13671526 bits=2555063
Ratio=15% count=1000 hash=154243712 bits=24577725
Ratio=10% count=2000 hash=411835139 bits=43056221
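
For readers without the attached patch, here is a minimal sketch of what a BitSet-based variant could look like. It is not the code from LANG-839.patch or from Commons Lang; the class name BitSetRemoveSketch and its removeElements helper are hypothetical. The search is a simple nested loop to keep the example short. (In the figures above, Ratio appears to be bits divided by hash, truncated to a whole percentage.)

```java
import java.util.BitSet;

// Hypothetical illustration only; not the Commons Lang implementation.
final class BitSetRemoveSketch {

    // Removes the first occurrence of each entry in values from array.
    // Repeated entries in values remove further occurrences.
    static int[] removeElements(int[] array, int... values) {
        BitSet toRemove = new BitSet(array.length);
        for (int v : values) {
            // Mark the first matching index that is not already marked.
            for (int i = 0; i < array.length; i++) {
                if (array[i] == v && !toRemove.get(i)) {
                    toRemove.set(i);
                    break;
                }
            }
        }
        // Copy every index whose bit is clear; no Integer boxing is involved.
        int[] result = new int[array.length - toRemove.cardinality()];
        int out = 0;
        for (int i = toRemove.nextClearBit(0); i < array.length; i = toRemove.nextClearBit(i + 1)) {
            result[out++] = array[i];
        }
        return result;
    }
}
```

The BitSet needs roughly one bit per element of the input array, whatever the number of matches.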
                
> ArrayUtils removeElements methods use unnecessary HashSet
> ---------------------------------------------------------
>
>                 Key: LANG-839
>                 URL: https://issues.apache.org/jira/browse/LANG-839
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.*
>    Affects Versions: 3.1
>            Reporter: Sebb
>            Priority: Minor
>         Attachments: LANG-839.patch
>
>
> The removeElements() methods use a HashSet to collect the indexes that need removing.
> This requires creating Integer objects for each index, and the HashSet then has to be converted into an int[] array.
> It would be more efficient to store the entries in an actual int[] array.
> The maximum size of this is the length of the values array (or the length of the input array if that is shorter).
> The array must be truncated before calling the private removeAll() method; this can be done with Arrays.copyOf(x[], length).
> However, if the arrays are very large, and most of the values do not appear in the input, this might result in using more memory than the HashSet implementation.
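
The int[] approach described in the issue can be sketched roughly as follows. This is an illustration under assumptions, not the attached patch: the class IndexArraySketch and the helpers collectIndexes and contains are hypothetical names, and the resulting array stands in for what would be passed to the private removeAll() method.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the code from LANG-839.patch.
final class IndexArraySketch {

    // Collects the index of the first unused occurrence of each value.
    static int[] collectIndexes(int[] array, int... values) {
        // Upper bound from the issue text: the shorter of the two lengths.
        int[] indexes = new int[Math.min(array.length, values.length)];
        int count = 0;
        for (int v : values) {
            for (int i = 0; i < array.length; i++) {
                if (array[i] == v && !contains(indexes, count, i)) {
                    indexes[count++] = i;
                    break;
                }
            }
        }
        // Truncate to the number of indexes actually found.
        return Arrays.copyOf(indexes, count);
    }

    // Linear scan over the first count entries of indexes.
    private static boolean contains(int[] indexes, int count, int index) {
        for (int i = 0; i < count; i++) {
            if (indexes[i] == index) {
                return true;
            }
        }
        return false;
    }
}
```

The temporary array is sized for the worst case up front, which is where the memory concern in the last sentence of the description comes from.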

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
