hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-451) Remove HTableDescriptor from HRegionInfo
Date Wed, 06 Apr 2011 02:51:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016223#comment-13016223 ]

stack commented on HBASE-451:
-----------------------------

Warning.  This is a big one, Subbu.

I used to think that storing the schema in zk was the way to go, but Ryan argues, correctly
I believe, that zk should carry only transient data, if only so that we can copy the hdfs
content and bring the data up elsewhere under another cluster.  (Otherwise, we'd have to copy
both hdfs and zk state to replicate cluster data elsewhere, which is a pain.)  So, we could
write the schema into a table or into hdfs.

If we went with a table, we could write the table schema into a new catalog table named
schemas or, as I believe Andrew Purtell suggested, put it into a new column family in the
.META. table, in the first region only.  If we wrote it into hdfs, we could write it into a
.tabledescriptor file, the way we now write the .regioninfo file under each region.

On startup, I'd think that we'd read hdfs or the schema table and then, per table, add a znode
up in zk.  On each table edit, we'd update that table's znode.  All regionservers would be
watching these zk table proxies and would know to reread the schema when a watcher triggers.
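
To make the startup/edit/watch flow above concrete, here is a minimal sketch, not the actual HBase or ZooKeeper code. It uses plain local files in place of hdfs and an in-memory listener list in place of znodes and watchers; every class and method name in it (SchemaStoreSketch, TableProxySketch, RegionServerSketch, editTable) is hypothetical. It also assumes one .tabledescriptor file per table directory, and it keeps listeners registered permanently, whereas real ZooKeeper watches fire once and must be re-registered.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Durable copy of the schema (stand-in for hdfs): one .tabledescriptor per table dir.
class SchemaStoreSketch {
    private final Path root;

    SchemaStoreSketch(Path root) { this.root = root; }

    static SchemaStoreSketch inTempDir() {
        try {
            return new SchemaStoreSketch(Files.createTempDirectory("schema-sketch"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void write(String table, String schema) {
        try {
            Path dir = Files.createDirectories(root.resolve(table));
            Files.write(dir.resolve(".tabledescriptor"), schema.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String read(String table) {
        try {
            byte[] bytes = Files.readAllBytes(root.resolve(table).resolve(".tabledescriptor"));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Transient signal only (stand-in for per-table znodes): a version counter plus listeners.
class TableProxySketch {
    private final Map<String, Long> versions = new ConcurrentHashMap<>();
    private final List<Consumer<String>> watchers = new CopyOnWriteArrayList<>();

    void watch(Consumer<String> watcher) { watchers.add(watcher); }

    void touch(String table) {
        versions.merge(table, 1L, Long::sum);          // bump the "znode" version
        for (Consumer<String> w : watchers) w.accept(table);
    }

    long version(String table) { return versions.getOrDefault(table, 0L); }
}

// Hypothetical regionserver: caches schemas and rereads the durable copy on trigger.
class RegionServerSketch {
    final Map<String, String> cache = new ConcurrentHashMap<>();

    RegionServerSketch(SchemaStoreSketch store, TableProxySketch proxy) {
        proxy.watch(table -> cache.put(table, store.read(table)));
    }
}

class SchemaWatchSketch {
    // Table edit: persist the schema first, then signal through the proxy.
    static void editTable(SchemaStoreSketch store, TableProxySketch proxy,
                          String table, String schema) {
        store.write(table, schema);
        proxy.touch(table);
    }

    public static void main(String[] args) {
        SchemaStoreSketch store = SchemaStoreSketch.inTempDir();
        TableProxySketch proxy = new TableProxySketch();
        RegionServerSketch rs = new RegionServerSketch(store, proxy);
        editTable(store, proxy, "t1", "families=[cf1]");
        editTable(store, proxy, "t1", "families=[cf1, cf2]");
        System.out.println(rs.cache.get("t1") + " @ version " + proxy.version("t1"));
    }
}
```

The point of the split is that the proxy carries no schema content, only a change signal, so copying the durable files alone would be enough to bring the tables up under another cluster.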

On JSON serializing, yeah, that'd be sweet but might be a bit much to bite off as part of
this issue.  Maybe just go w/ Writables until it's all running?  We could open a new issue to
add serialization of types to JSON.  (Todd mentions that if we avro'd this stuff, we could make
use of an avro-to-json gateway that avro apparently has.)

> Remove HTableDescriptor from HRegionInfo
> ----------------------------------------
>
>                 Key: HBASE-451
>                 URL: https://issues.apache.org/jira/browse/HBASE-451
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Jim Kellerman
>            Priority: Critical
>             Fix For: 0.92.0
>
>
> There is an HRegionInfo for every region in HBase. Currently HRegionInfo also contains
> the HTableDescriptor (the schema). That means we store the schema n times where n is the number
> of regions in the table.
> Additionally, for every region of the same table that the region server has open, there
> is a copy of the schema. Thus it is stored in memory once for each open region.
> If HRegionInfo merely contained the table name the HTableDescriptor could be stored in
> a separate file and easily found.
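
The change described in the issue summary can be pictured with a small sketch. This is not the HBase classes themselves: TableSchemaSketch, RegionRefSketch, and SchemaRegistrySketch are hypothetical stand-ins for HTableDescriptor, a slimmed-down HRegionInfo, and whatever would hold the single per-table copy. The only idea shown is that a region carries the table name and looks the schema up, so n regions share one schema object.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the table schema: stored once per table.
final class TableSchemaSketch {
    final String tableName;
    final List<String> families;

    TableSchemaSketch(String tableName, List<String> families) {
        this.tableName = tableName;
        this.families = families;
    }
}

// Stand-in for a region's info: carries only the table name, not the schema.
final class RegionRefSketch {
    final String tableName;
    final String startKey;
    final String endKey;

    RegionRefSketch(String tableName, String startKey, String endKey) {
        this.tableName = tableName;
        this.startKey = startKey;
        this.endKey = endKey;
    }
}

// One schema per table, resolved by name when a region needs it.
final class SchemaRegistrySketch {
    private final Map<String, TableSchemaSketch> byTable = new HashMap<>();

    void put(TableSchemaSketch schema) { byTable.put(schema.tableName, schema); }

    TableSchemaSketch schemaFor(RegionRefSketch region) { return byTable.get(region.tableName); }

    int size() { return byTable.size(); }
}

class SharedSchemaSketch {
    public static void main(String[] args) {
        SchemaRegistrySketch registry = new SchemaRegistrySketch();
        registry.put(new TableSchemaSketch("t1", Arrays.asList("cf1", "cf2")));

        RegionRefSketch first = new RegionRefSketch("t1", "", "m");
        RegionRefSketch second = new RegionRefSketch("t1", "m", "");

        // Both regions resolve to the same schema instance.
        System.out.println(registry.schemaFor(first) == registry.schemaFor(second));
    }
}
```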

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
