lucene-dev mailing list archives

From "Dawid Weiss (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-3972) Improve AllGroupsCollector implementations
Date Thu, 12 Apr 2012 13:53:19 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252427#comment-13252427 ]

Dawid Weiss edited comment on LUCENE-3972 at 4/12/12 1:52 PM:
--------------------------------------------------------------

This is curious indeed. One thing to check would be this: SentinelIntSet does no key rehashing
(rehash simply returns the key). In my experience from implementing HPPC, an identity hash like
this resulted in very poor performance for certain regular integer sets. So while rehashing may
seem like additional overhead, it can actually boost performance.

Martijn -- could you patch the trunk's SentinelIntSet#rehash with, for example, this (the
MurmurHash3 finalization mix):
{noformat}
    public static int rehash(int k)
    {
        k ^= k >>> 16;
        k *= 0x85ebca6b;
        k ^= k >>> 13;
        k *= 0xc2b2ae35;
        k ^= k >>> 16;
        return k;
    }
{noformat}
and retry your test? Btw, I'm not saying it'll be faster :)
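
To illustrate the concern about regular integer sets, here is a minimal sketch. It is not taken from SentinelIntSet or HPPC: the names RehashDemo and distinctSlots are made up for illustration, and it assumes a power-of-two table whose slot is computed as hash & (size - 1). It counts how many distinct slots a set of keys that are all multiples of 1024 would occupy, once with the identity hash and once with the mix function above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of Lucene or HPPC.
class RehashDemo {
    // Identity "rehash": returns the key unchanged.
    static int identity(int k) {
        return k;
    }

    // MurmurHash3 finalization mix, same steps as in the snippet above.
    static int murmurMix(int k) {
        k ^= k >>> 16;
        k *= 0x85ebca6b;
        k ^= k >>> 13;
        k *= 0xc2b2ae35;
        k ^= k >>> 16;
        return k;
    }

    // Counts the distinct slots the keys map to in a table of the given
    // power-of-two size (assumption: slot = hash & (tableSize - 1)).
    static int distinctSlots(int[] keys, int tableSize, boolean mix) {
        int mask = tableSize - 1;
        Set<Integer> slots = new HashSet<>();
        for (int k : keys) {
            int h = mix ? murmurMix(k) : identity(k);
            slots.add(h & mask);
        }
        return slots.size();
    }

    public static void main(String[] args) {
        // A "regular" key set: every key is a multiple of 1024.
        int[] keys = new int[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i * 1024;
        }
        int tableSize = 2048;
        System.out.println("identity: " + distinctSlots(keys, tableSize, false));
        System.out.println("mixed:    " + distinctSlots(keys, tableSize, true));
    }
}
```

With the identity hash, all 1000 keys land in just 2 of the 2048 slots, because only bit 10 of each key survives the mask, so nearly every insert or lookup collides. With the mix the keys spread over many more slots. Whether that outweighs the cost of the extra arithmetic for the grouping collectors is what the test would have to show.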
                
> Improve AllGroupsCollector implementations
> ------------------------------------------
>
>                 Key: LUCENE-3972
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3972
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>         Attachments: LUCENE-3972.patch, LUCENE-3972.patch
>
>
> I think that the performance of TermAllGroupsCollector, DVAllGroupsCollector.BR and
> DVAllGroupsCollector.SortedBR can be improved by using BytesRefHash to store the groups instead
> of an ArrayList.
