hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-700) hbase.io.index.interval need be configuratable in column family
Date Tue, 24 Jun 2008 06:33:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607490#action_12607490 ]

Andrew Purtell commented on HBASE-700:

Hi Stack.

HBASE-62 is a generalization of the changes to HTableDescriptor under consideration, so it
makes sense to implement it and then have the regionservers watch for certain known key-value
pairs that specify table or column store parameters. (Commenting on that issue specifically:
I don't think JSON is warranted as an encoding for table and column metadata. Simple
single-value strings for keys and values should be enough. User metadata, however, could be
formatted however the user desires.)
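
To make the idea concrete, here is a minimal sketch of per-column-family string metadata with
a fallback to the site-wide setting. It is written in Python for brevity and is not HBase's
API: the ColumnFamilyDescriptor class, get_value, index_interval, the family names, and the
site-wide value of 32 are all assumptions for illustration.

```python
# Hypothetical sketch, not HBase's actual API.
# Site-wide configuration; the value 32 is assumed here for illustration only.
SITE_CONFIG = {"hbase.io.index.interval": "32"}


class ColumnFamilyDescriptor:
    """A column family name plus simple string key-value metadata."""

    def __init__(self, name, metadata=None):
        self.name = name
        self.metadata = dict(metadata or {})

    def get_value(self, key, site_config=SITE_CONFIG):
        """Return the per-family value if present, else the site-wide value."""
        if key in self.metadata:
            return self.metadata[key]
        return site_config.get(key)


def index_interval(family):
    """A 'known key' a regionserver could watch for, parsed from its string form."""
    return int(family.get_value("hbase.io.index.interval"))


# One family overrides the interval, the other inherits the site-wide value.
big_cells = ColumnFamilyDescriptor("content", {"hbase.io.index.interval": "1"})
small_cells = ColumnFamilyDescriptor("anchor")
print(index_interval(big_cells), index_interval(small_cells))
```

Keeping both keys and values as plain strings means the descriptor needs no schema of its
own; only the code that understands a given key has to parse its value.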

HBASE-34 could be specified at the table level. Regarding this issue, I read Bryan's concerns
about exposing tuning parameters, but I suggest that people who tune parameters exposed
as proposed should know what they are doing and, if not, will soon learn better.

HBASE-43 seems pretty trivial, in a sense. To apply HTableDescriptor updates, the client
would need to disable the table, tell the master to update the descriptor, and then
re-enable the table. All of the pending edits would be flushed for the disable. Then, when
the table is re-enabled, the regionservers could note the read-only attribute and simply
reject edits to the columns, and both they and whatever mapreduce job is running over the
mapfiles could coexist happily.
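
The disable / update / re-enable sequence can be sketched as a toy in-memory model. This is
only an illustration of the ordering described above, in Python rather than HBase's Java: the
Table class, its methods, the ReadOnlyError exception, and the "READONLY" key are hypothetical
names, not the real client or master interface.

```python
# Hypothetical toy model of the workflow; not HBase's actual API.
class ReadOnlyError(Exception):
    """Raised when an edit targets a table marked read-only."""


class Table:
    def __init__(self, name):
        self.name = name
        self.enabled = True
        self.metadata = {}        # string key-value descriptor attributes
        self.pending_edits = []   # stands in for edits not yet written out
        self.flushed_edits = []   # stands in for edits persisted to mapfiles

    def put(self, row, value):
        if not self.enabled:
            raise RuntimeError("table is disabled")
        if self.metadata.get("READONLY") == "true":
            raise ReadOnlyError(self.name)
        self.pending_edits.append((row, value))

    def disable(self):
        # Disabling flushes everything that is still pending.
        self.flushed_edits.extend(self.pending_edits)
        self.pending_edits.clear()
        self.enabled = False

    def update_descriptor(self, key, value):
        if self.enabled:
            raise RuntimeError("disable the table before updating its descriptor")
        self.metadata[key] = value

    def enable(self):
        self.enabled = True


def mark_read_only(table):
    """Disable, update the descriptor, then re-enable, in that order."""
    table.disable()
    table.update_descriptor("READONLY", "true")
    table.enable()
```

After mark_read_only returns, nothing is left pending, so a job reading the flushed files
sees a stable view while further put calls are rejected.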

I think all of this could be rolled into one change set. Want to tie these all to HBASE-42,
or open a new JIRA? 

> hbase.io.index.interval need be configuratable in column family 
> ----------------------------------------------------------------
>                 Key: HBASE-700
>                 URL: https://issues.apache.org/jira/browse/HBASE-700
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.1.2
>            Reporter: LN
>            Priority: Minor
> Setting hbase.io.index.interval to a smaller value can improve HBase read performance
> significantly, especially in column families with large values. However, a small
> hbase.io.index.interval causes more memory usage, because the whole index is read into
> memory when a mapfile is loaded.
> In my test environment, I set hbase.io.index.interval to 1. After inserting about 3M small
> records into a table (about 1.5G in Hadoop files), the regionserver threw an OOME. I then
> found that the total size of the mapfile indexes was 350M. However, I can't raise
> hbase.io.index.interval to a larger value, like 32, because other tables with big cells
> need it to be 1.
> So I think making hbase.io.index.interval a column family property would be very important
> for performance tuning.
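
As a rough back-of-the-envelope check on the reporter's numbers, the sketch below assumes the
index holds one entry per `interval` records, so its size scales inversely with the interval.
The function names are hypothetical, "350M" and "3M" are read as 350e6 bytes and 3e6 records,
and the results are estimates, not measurements.

```python
# Rough estimate only; assumes index size is inversely proportional to the interval.
def bytes_per_index_entry(index_bytes, records, interval):
    """Average size of one index entry, given one entry per `interval` records."""
    entries = records / interval
    return index_bytes / entries


def estimated_index_bytes(measured_bytes, measured_interval, new_interval):
    """Scale a measured index size to a different index interval."""
    return measured_bytes * measured_interval / new_interval


measured = 350e6   # reported total index size at interval 1
records = 3e6      # reported record count (approximate)

print(bytes_per_index_entry(measured, records, 1))   # roughly 117 bytes per entry
print(estimated_index_bytes(measured, 1, 32) / 1e6)  # roughly 10.9 (MB) at interval 32
```

Under these assumptions, raising the interval for the small-cell table alone would shrink
its index from about 350M to about 11M, which is the motivation for a per-family setting.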

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
