accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4468) accumulo.core.data.Key.equals(Key, PartialKey) improvement
Date Wed, 21 Sep 2016 22:36:21 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511418#comment-15511418 ]

Keith Turner commented on ACCUMULO-4468:
----------------------------------------

[~wmurnane] I like the change.  I wrote the comment and made the changes in Key that you referenced.
I remember doing the performance testing for that.  When I first experimented with the change,
I tried comparing the byte arrays in reverse order.  That was much slower than comparing forward.
So I avoided that and found another strategy that worked well.  I think the concept is sound,
but performance testing is definitely needed to make sure it works as intended.

I also like the switch statement fall-through; I think it's slick.  If one doesn't exist, it
would be nice to add a unit test that checks correctness for keys that differ by only
one field.  Basically, ensure that each of the key field comparisons is tested.
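
For illustration only, here is a rough sketch of the kind of fall-through comparison and single-field check I have in mind. It is not the attached patch and not the real Key class: KeySketch, Part, equalsUpTo and singleFieldCheck are hypothetical names made up for this example, and the fields are plain byte arrays. Each field is still compared forward with Arrays.equals; only the order of the fields is reversed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for Key; NOT Accumulo's actual class.
class KeySketch {
    // Loosely mirrors PartialKey: each constant covers all fields up to and including its own.
    enum Part { ROW, ROW_COLFAM, ROW_COLFAM_COLQUAL, ROW_COLFAM_COLQUAL_COLVIS }

    final byte[] row, colFamily, colQualifier, colVisibility;

    KeySketch(String row, String cf, String cq, String cv) {
        this.row = row.getBytes(StandardCharsets.UTF_8);
        this.colFamily = cf.getBytes(StandardCharsets.UTF_8);
        this.colQualifier = cq.getBytes(StandardCharsets.UTF_8);
        this.colVisibility = cv.getBytes(StandardCharsets.UTF_8);
    }

    // Compares the last requested field first, then falls through to the earlier fields.
    boolean equalsUpTo(KeySketch other, Part part) {
        switch (part) {
            case ROW_COLFAM_COLQUAL_COLVIS:
                if (!Arrays.equals(colVisibility, other.colVisibility)) return false;
                // fall through
            case ROW_COLFAM_COLQUAL:
                if (!Arrays.equals(colQualifier, other.colQualifier)) return false;
                // fall through
            case ROW_COLFAM:
                if (!Arrays.equals(colFamily, other.colFamily)) return false;
                // fall through
            case ROW:
                return Arrays.equals(row, other.row);
            default:
                throw new IllegalArgumentException("Unsupported part: " + part);
        }
    }

    // Checks every Part against keys that differ from a base key in exactly one field.
    static boolean singleFieldCheck() {
        KeySketch base = new KeySketch("r", "f", "q", "v");
        KeySketch[] variants = {
            new KeySketch("X", "f", "q", "v"), // differs in row only
            new KeySketch("r", "X", "q", "v"), // differs in column family only
            new KeySketch("r", "f", "X", "v"), // differs in column qualifier only
            new KeySketch("r", "f", "q", "X"), // differs in column visibility only
        };
        for (Part part : Part.values()) {
            for (int i = 0; i < variants.length; i++) {
                // Keys should compare equal only if the differing field lies beyond the compared prefix.
                boolean expected = i > part.ordinal();
                if (base.equalsUpTo(variants[i], part) != expected) return false;
            }
        }
        return true;
    }
}
```

A real unit test would do the same thing against Key and PartialKey, so that a missing break or a wrong field in any one case shows up as a failure.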

> accumulo.core.data.Key.equals(Key, PartialKey) improvement
> ----------------------------------------------------------
>
>                 Key: ACCUMULO-4468
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4468
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.8.0
>            Reporter: Will Murnane
>            Priority: Trivial
>              Labels: newbie, performance
>         Attachments: benchmark.tar.gz, key_comparison.patch
>
>
> In the Key.equals(Key, PartialKey) overload, the current method compares starting at
> the beginning of the key, and works its way toward the end. This functions correctly, of course,
> but one of the typical uses of this method is to compare adjacent rows to break them into
> larger chunks. For example, accumulo.core.iterators.Combiner repeatedly calls this method
> with subsequent pairs of keys.
> I have a patch which reverses the comparison order. That is, if the method is called
> with ROW_COLFAM_COLQUAL_COLVIS, it will compare visibility, cq, cf, and finally row. This
> (marginally) improves the speed of comparisons in the relatively common case where only the
> last part is changing, with less complex code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
