hbase-dev mailing list archives

From "Erik Holstad (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HBASE-80) [hbase] Add a cache of 'hot' cells
Date Wed, 21 Jan 2009 22:24:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665964#action_12665964
] 

erikholstad@gmail.com edited comment on HBASE-80 at 1/21/09 2:23 PM:
------------------------------------------------------------

Sorry for not posting on this issue sooner, even though I have been assigned and everything :)
So the basic idea that I have been working on is to make a key/value cache to speed up
random reads.
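
To illustrate the general idea only (this is not the code in cache.patch), a key/value read cache could look roughly like the sketch below. It is written in Python for brevity. The class name CellCache, the (row, column) key, the LRU eviction policy, and the capacity and load parameters are all assumptions made for the example; the comment above does not say how the actual cache stores or evicts entries.

```python
from collections import OrderedDict


class CellCache:
    """Hypothetical LRU cache keyed by (row, column); not the HBASE-80 patch."""

    def __init__(self, capacity, load):
        self.capacity = capacity    # assumed limit on the number of cached cells
        self.load = load            # fallback read used on a miss (the slow path)
        self.cells = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, row, column):
        key = (row, column)
        if key in self.cells:
            # Hit: mark the cell as most recently used and skip the slow read.
            self.cells.move_to_end(key)
            self.hits += 1
            return self.cells[key]
        # Miss: take the slow path, then remember the value.
        self.misses += 1
        value = self.load(row, column)
        self.cells[key] = value
        if len(self.cells) > self.capacity:
            # Evict the least recently used cell to stay within capacity.
            self.cells.popitem(last=False)
        return value
```

A repeated read of the same cell is then served from memory instead of going back to the store, which is where the gain on random reads would come from.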

Test setup:
I used the same test parameters as the BT paper so the results would be easy to compare.
Tests have so far only been run on a single-machine cluster with one HRegionServer. That
setup includes 1 column/family, and every value is 1000 B.

Some numbers from testing this extremely simple cache:
Tests done over 10000 reads
Random reads without cache: 481 r/s, 481 KB/s
Random reads with cache: 4019 r/s, 4019 KB/s


Some other tests, run to compare the difference when using multiple columns/family, gave
the following numbers:
5 columns/family, everything else the same as above.
Random reads without cache: 445 r/s, 2223 KB/s
Random reads with cache: 3588 r/s, 17940 KB/s

10 columns/family, everything else the same as above.
Random reads without cache: 24 r/s, 240 KB/s
Random reads with cache: 25 r/s, 250 KB/s

For the rest of the tests only 100 rows were used, to avoid out-of-memory errors.
Like the first test but with fewer rows:
Random reads without cache: 284 r/s, 284 KB/s
Random reads with cache: 2083 r/s, 2083 KB/s

Same as above but with 1000 columns/family:
Random reads without cache: 23 r/s, 23000 KB/s
Random reads with cache: 76 r/s, 76000 KB/s
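
The KB/s figures above appear to follow from r/s x columns/family x 1000 B per value, taking 1 KB as 1000 B. That relation is inferred from the numbers rather than stated anywhere, and one figure (2223 KB/s at 445 r/s with 5 columns) differs slightly from it, presumably because the r/s value was rounded. A small Python sketch that recomputes throughput and the cache speedup from the reported r/s values, with hypothetical function names:

```python
VALUE_BYTES = 1000  # value size from the test setup


def throughput_kb_per_s(reads_per_s, columns):
    """Assumed relation: every read returns `columns` values of VALUE_BYTES each."""
    return reads_per_s * columns * VALUE_BYTES / 1000


def speedup(with_cache, without_cache):
    """Ratio of cached to uncached read rate."""
    return with_cache / without_cache


# (label, columns/family, r/s without cache, r/s with cache), copied from the results above
runs = [
    ("1 column/family, 10000 reads", 1, 481, 4019),
    ("5 columns/family", 5, 445, 3588),
    ("10 columns/family", 10, 24, 25),
    ("1 column/family, 100 rows", 1, 284, 2083),
    ("1000 columns/family, 100 rows", 1000, 23, 76),
]

for label, columns, base, cached in runs:
    kb = throughput_kb_per_s(cached, columns)
    print(f"{label}: {kb:.0f} KB/s with cache, {speedup(cached, base):.1f}x speedup")
```

Putting the runs side by side this way makes it easy to see how the speedup changes as the number of columns/family grows.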

  
> [hbase] Add a cache of 'hot' cells
> ----------------------------------
>
>                 Key: HBASE-80
>                 URL: https://issues.apache.org/jira/browse/HBASE-80
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Erik Holstad
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: cache.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

