hbase-issues mailing list archives

From "Vladimir Rodionov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13761) Optimize FuzzyRowFilter
Date Mon, 25 May 2015 04:36:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557962#comment-14557962
] 

Vladimir Rodionov commented on HBASE-13761:
-------------------------------------------

Optimizations in this patch:

* A new implementation of the satisfies method that uses Unsafe access to compare 8 bytes at a time.
* With more than one fuzzy key, a significant improvement in how getNextCellHint handles the keys.
* With more than one fuzzy key, the filter keeps track of the last matched fuzzy key and tries
it first the next time.
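
To make the first and third points concrete, here is a minimal sketch of the two ideas. It is not the code from the patch: it reads 8-byte words with java.nio.ByteBuffer instead of Unsafe, the class and method names (FuzzyMatchSketch, FuzzyKey, matchesAny, lastMatchedIndex) are hypothetical, and the mask convention (0xFF = byte must match, 0x00 = wildcard) is an assumption chosen so that one XOR and one AND per word are enough. Rows shorter than the fuzzy key are simply rejected here, which is a simplification.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

class FuzzyMatchSketch {
    /** A fuzzy key and its mask. Sketch convention: mask 0xFF = fixed byte, 0x00 = wildcard. */
    static final class FuzzyKey {
        final byte[] key;
        final byte[] mask;

        FuzzyKey(byte[] key, byte[] mask) {
            if (key.length != mask.length) {
                throw new IllegalArgumentException("key and mask must have the same length");
            }
            this.key = key;
            this.mask = mask;
        }
    }

    private final List<FuzzyKey> keys;
    private int lastMatched = 0; // index of the fuzzy key that matched most recently

    FuzzyMatchSketch(List<FuzzyKey> keys) {
        this.keys = keys;
    }

    /** Compares row against the fuzzy key 8 bytes at a time, then byte by byte for the tail. */
    static boolean satisfies(byte[] row, FuzzyKey fk) {
        int len = fk.key.length;
        if (row.length < len) {
            return false; // simplification: short rows never match in this sketch
        }
        ByteBuffer r = ByteBuffer.wrap(row);
        ByteBuffer k = ByteBuffer.wrap(fk.key);
        ByteBuffer m = ByteBuffer.wrap(fk.mask);
        int i = 0;
        for (; i + 8 <= len; i += 8) {
            // XOR exposes differing bits; AND with the mask keeps only the fixed positions.
            if (((r.getLong(i) ^ k.getLong(i)) & m.getLong(i)) != 0) {
                return false;
            }
        }
        for (; i < len; i++) {
            if (((row[i] ^ fk.key[i]) & fk.mask[i]) != 0) {
                return false;
            }
        }
        return true;
    }

    /** Tries the last matched key first, then the remaining keys in round-robin order. */
    boolean matchesAny(byte[] row) {
        int n = keys.size();
        for (int j = 0; j < n; j++) {
            int idx = (lastMatched + j) % n;
            if (satisfies(row, keys.get(idx))) {
                lastMatched = idx;
                return true;
            }
        }
        return false;
    }

    int lastMatchedIndex() {
        return lastMatched;
    }

    public static void main(String[] args) {
        FuzzyKey a = new FuzzyKey(new byte[] {5, 5}, new byte[] {-1, -1});
        FuzzyKey b = new FuzzyKey(
            new byte[] {1, 2, 3, 4, 0, 0, 0, 0, 9, 9},
            new byte[] {-1, -1, -1, -1, 0, 0, 0, 0, -1, -1});
        FuzzyMatchSketch filter = new FuzzyMatchSketch(Arrays.asList(a, b));
        byte[] row = {1, 2, 3, 4, 7, 7, 7, 7, 9, 9};
        System.out.println(filter.matchesAny(row));
        System.out.println(filter.lastMatchedIndex());
    }
}
```

When consecutive rows tend to match the same fuzzy key, starting from the last matched index means the common case costs a single satisfies call, whatever the number of keys.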

Performance:

YMMV, but in my tests (RegionScanner, not ResultScanner) I observed improvements ranging from
7-10% (for a single fuzzy key) up to 100% (for 20 fuzzy keys). The more fuzzy keys in a filter,
the larger the performance gain.

With this patch, the filter runs at the same speed regardless of the number of fuzzy search
keys: 1, 20, 100 ...

> Optimize FuzzyRowFilter
> -----------------------
>
>                 Key: HBASE-13761
>                 URL: https://issues.apache.org/jira/browse/HBASE-13761
>             Project: HBase
>          Issue Type: Improvement
>          Components: Filters
>    Affects Versions: 2.0.0, 1.1.0, 0.98.13
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Minor
>             Fix For: 2.0.0, 0.98.14, 1.1.1
>
>         Attachments: HBASE-13761.patch
>
>
> FuzzyRowFilter has some room for improvement: a lot of byte-by-byte arithmetic, an inefficient
> algorithm for selecting the next candidate row, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
