hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-2346) Usage of FilterList slows down scans
Date Fri, 09 Apr 2010 00:12:37 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2346:
--------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed to trunk, branch and pre-durab.

> Usage of FilterList slows down scans
> ------------------------------------
>
>                 Key: HBASE-2346
>                 URL: https://issues.apache.org/jira/browse/HBASE-2346
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: filters
>    Affects Versions: 0.20.3
>         Environment: Ubuntu 9.10
> Sun JDK 1.6.0_16
> HBase 0.20.3 (standalone and pseudo-distributed)
>            Reporter: Stefan Seelmann
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2346.patch, ScanFilterTest.java
>
>
> When using a FilterList, the scan is much slower than a scan with only a single filter (tested with SingleColumnValueFilter and PrefixFilter).
> The difference is extreme for very small ranges: if the range is only 10 rows, the scan is 10 times slower when using the FilterList.
> Is the cause just GC or object serialization/deserialization?
> For a simple test I used the PerformanceEvaluation tool and created the TestTable with only 10(!) rows:
> $ bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=10 sequentialWrite 1
> The attached test performs 100 scans using different filters. Each filter is then wrapped into a FilterList (containing only that single filter). This FilterList is then nested two more times into another FilterList. With each level of nesting the scan gets slower.
> The test created the following output:
> Scan Null Filter (10): 391ms
> Scan FilterList with Null Filter (0): 4788ms
> Scan Nested FilterList with Null Filter (0): 8303ms
> Scan Nested Nested FilterList with Null Filter (0): 11915ms
> Scan SingleColumValueFilter Equal (0): 257ms
> Scan FilterList with SingleColumValueFilter Equal (0): 4121ms
> Scan Nested FilterList with SingleColumValueFilter Equal (0): 7965ms
> Scan Nested Nested FilterList with SingleColumValueFilter Equal (0): 11600ms
> Scan SingleColumValueFilter Not Equal (10): 912ms
> Scan FilterList with SingleColumValueFilter Not Equal (10): 4542ms
> Scan Nested FilterList with SingleColumValueFilter Not Equal (10): 8459ms
> Scan Nested Nested FilterList with SingleColumValueFilter Not Equal (10): 11513ms
> Scan PrefixFilter (10): 306ms
> Scan FilterList with PrefixFilter (10): 3695ms
> Scan Nested FilterList with PrefixFilter (10): 7762ms
> Scan Nested Nested FilterList with PrefixFilter (10): 11721ms
> Get: 245ms
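
The reported totals suggest that each FilterList level adds a roughly constant cost per scan. The sketch below checks that arithmetic. It is not part of the attached ScanFilterTest.java: the class and method names are hypothetical, and only the millisecond totals and the count of 100 scans come from the report above.

```java
import java.util.Arrays;

public class NestingOverhead {
    // The report states that each measurement covers 100 scans.
    static final int SCANS = 100;

    /**
     * Given total run times at nesting depth 0, 1, 2, ... returns the extra
     * milliseconds per scan that each additional FilterList level adds.
     */
    static double[] perLevelOverheadMs(long[] totalsMs, int scans) {
        double[] deltas = new double[totalsMs.length - 1];
        for (int i = 1; i < totalsMs.length; i++) {
            deltas[i - 1] = (totalsMs[i] - totalsMs[i - 1]) / (double) scans;
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Totals copied from the output above: plain filter, FilterList,
        // nested FilterList, nested nested FilterList.
        long[] nullFilter = {391, 4788, 8303, 11915};
        long[] prefixFilter = {306, 3695, 7762, 11721};
        System.out.println(Arrays.toString(perLevelOverheadMs(nullFilter, SCANS)));
        System.out.println(Arrays.toString(perLevelOverheadMs(prefixFilter, SCANS)));
    }
}
```

For both series the per-level cost comes out at roughly 34 to 44 ms per scan, regardless of which filter is wrapped. That fits the reporter's observation that the slowdown grows with nesting depth, not with the work the inner filter does.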

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

