hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10801) Ensure DBE interfaces can work with Cell
Date Fri, 18 Apr 2014 06:44:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973831#comment-13973831 ]

ramkrishna.s.vasudevan commented on HBASE-10801:
------------------------------------------------

I tested this patch with one minor modification: the SeekerState is not passed to the KeyOnlyClonedSeekerState,
so that it holds only the primitive member variables (passing the SeekerState was a bit more costly).
I combined this with HBASE-10929 and added a filter, FilterAllFilter, that filters out every
row that would otherwise be returned to the client.  This ensures that the scan path never
needs to create a KV object (which also involves copying the value part), so the comparison
happens purely on Cells.  Note that in this patch the key part is still copied in shallowCopy().
A full table scan with 1 thread over 2000000 rows gave the following results.
With patch
========
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 9.6820 seconds

hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.8490 seconds

hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.7680 seconds

hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.5470 seconds
{code}

Without patch
=============
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 19.4020 seconds

hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 6.1450 seconds

hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.8520 seconds

hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW                                              COLUMN+CELL
0 row(s) in 2.6900 seconds
{code}
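
The two sets of timings above can be compared run by run. The following is only a small arithmetic sketch: the class and method names are made up for illustration, and the numbers are copied from the shell output above.

```java
import java.util.Locale;

public class ScanTimings {
    // Seconds reported by the HBase shell for four consecutive scans (copied from above).
    static final double[] WITH_PATCH = {9.6820, 2.8490, 2.7680, 2.5470};
    static final double[] WITHOUT_PATCH = {19.4020, 6.1450, 2.8520, 2.6900};

    /** How many times longer the unpatched scan took than the patched one. */
    static double ratio(double withoutPatch, double withPatch) {
        return withoutPatch / withPatch;
    }

    /** Percentage of the unpatched scan time that the patch saved. */
    static double percentSaved(double withoutPatch, double withPatch) {
        return 100.0 * (withoutPatch - withPatch) / withoutPatch;
    }

    public static void main(String[] args) {
        for (int i = 0; i < WITH_PATCH.length; i++) {
            System.out.printf(Locale.ROOT,
                "scan %d: %.2fx longer without patch, %.1f%% saved with patch%n",
                i + 1,
                ratio(WITHOUT_PATCH[i], WITH_PATCH[i]),
                percentSaved(WITHOUT_PATCH[i], WITH_PATCH[i]));
        }
    }
}
```

The first two scans show the largest gap; by the third and fourth scans the ratio is close to 1.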
The table was loaded with the Performance Evaluation tool, so the value is 1000 bytes per row.
When the experiment starts, the scan without the patch takes about twice as long as the scan
with it (19.4 s vs. 9.7 s), i.e. the patch cuts the first-scan time by roughly 50%.  Once the
cache is fully loaded the scans are not too costly and the timings even out with only a small deviation.
Changing the value size may have a much larger impact than this.
I can also test with a different value size, making the value much bigger.
The performance difference during the first scan remains consistent.
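
To illustrate why the value size should matter, here is a rough, self-contained model of the two paths. It is not HBase code: BufferCell, scanAndFilterAll and buildCells are hypothetical names, and the 26-byte key length is an assumption chosen only for the example. The key-only path copies just the key (as shallowCopy() does in this patch), while the other path copies key and value, standing in for creating a KV object.

```java
import java.util.Arrays;

public class KeyOnlyScanSketch {

    /** Hypothetical stand-in for a cell whose key and value sit in one shared block buffer. */
    static final class BufferCell {
        final byte[] block;
        final int keyOffset;
        final int keyLength;
        final int valueLength;

        BufferCell(byte[] block, int keyOffset, int keyLength, int valueLength) {
            this.block = block;
            this.keyOffset = keyOffset;
            this.keyLength = keyLength;
            this.valueLength = valueLength;
        }

        /** Key-only copy: the value bytes stay in the block and are never touched. */
        byte[] copyKey() {
            return Arrays.copyOfRange(block, keyOffset, keyOffset + keyLength);
        }

        /** Copies key and value together, standing in for building a full KV object. */
        byte[] copyKeyAndValue() {
            return Arrays.copyOfRange(block, keyOffset, keyOffset + keyLength + valueLength);
        }
    }

    /** Visits every cell, rejects all of them, and returns the number of bytes copied on the way. */
    static long scanAndFilterAll(BufferCell[] cells, boolean keyOnly) {
        long copied = 0;
        for (BufferCell cell : cells) {
            byte[] copy = keyOnly ? cell.copyKey() : cell.copyKeyAndValue();
            copied += copy.length;
            // Every cell is filtered out, so nothing is handed back to the caller.
        }
        return copied;
    }

    /** Builds cells with the value stored directly after the key in each block. */
    static BufferCell[] buildCells(int count, int keyLength, int valueLength) {
        BufferCell[] cells = new BufferCell[count];
        for (int i = 0; i < count; i++) {
            byte[] block = new byte[keyLength + valueLength];
            cells[i] = new BufferCell(block, 0, keyLength, valueLength);
        }
        return cells;
    }

    public static void main(String[] args) {
        // 1000-byte values as in the test above; the key length is an assumed figure.
        BufferCell[] cells = buildCells(1000, 26, 1000);
        System.out.println("key-only bytes copied:  " + scanAndFilterAll(cells, true));
        System.out.println("full-copy bytes copied: " + scanAndFilterAll(cells, false));
    }
}
```

In this model the bytes copied per cell grow with the value length only on the full-copy path, which is why a bigger value would be expected to widen the gap.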

> Ensure DBE interfaces can work with Cell
> ----------------------------------------
>
>                 Key: HBASE-10801
>                 URL: https://issues.apache.org/jira/browse/HBASE-10801
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.99.0
>
>         Attachments: HBASE-10801.patch, HBASE-10801_1.patch, HBASE-10801_2.patch, HBASE-10801_3.patch
>
>
> Some changes to the interfaces may be needed for DBEs, or the way they currently work may
> need to be modified in order to make DBEs work with Cells. Suggestions and ideas welcome.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
