hbase-dev mailing list archives

From stack <st...@duboce.net>
Subject Re: questions about overloading row key comparator method and some storage issues
Date Thu, 10 Apr 2008 16:58:04 GMT
Rosemond Wu wrote:
> Hello,
> I have an application in which the data is sparse, is seldom deleted, and
> requires extensible table schema. It looks to me hbase is a good fit. I have
> several questions:
>
> * The row keys in my application require their own comparator. Where is a
> good place to overload the key's comparator method? It looks to me that
> hbase uses TreeMap in HStore.java to sort and save the tuples. Is this the
> place I should start?
>   

Keys are currently hardcoded as type Text.  HBASE-82 is about making 
keys be bytes, with the user supplying a Comparator.  It's on our 
near-term list of things to do.
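
To give a feel for what a user-supplied comparator over byte keys could
look like, here is a minimal sketch.  It is not an HBase API: the class
name, the key() helper and the key layout (8-byte big-endian longs that
should sort numerically) are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical example, not part of HBase: row keys are assumed to be
// 8-byte big-endian longs that should sort numerically, not as text.
public class NumericKeyComparator implements Comparator<byte[]> {

    @Override
    public int compare(byte[] left, byte[] right) {
        // Decode both keys and compare them as signed longs.
        long l = ByteBuffer.wrap(left).getLong();
        long r = ByteBuffer.wrap(right).getLong();
        return Long.compare(l, r);
    }

    // Hypothetical helper that encodes a long as an 8-byte key.
    static byte[] key(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    public static void main(String[] args) {
        // A TreeMap ordered by the custom comparator keeps rows sorted
        // by the decoded number rather than by raw byte order.
        TreeMap<byte[], String> rows = new TreeMap<>(new NumericKeyComparator());
        rows.put(key(10), "ten");
        rows.put(key(2), "two");
        rows.put(key(-1), "minus one");
        System.out.println(rows.values()); // [minus one, two, ten]
    }
}
```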

> * How does hbase keep the order of rows? Does it use something similar to
> SSTable described in Google bigtable paper? If my application inserts many
> rows with key values that need to be sorted with rows already stored in
> disk, will it result in lots of index reconstruction and tablet split?
>   
Rows are lexicographically sorted in HBase (See Text.compareTo).
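
A small illustration of what lexicographic ordering means for row keys.
This sketch uses plain java.lang.String in a TreeSet as a stand-in for
Text (an assumption; for the ASCII keys used here both orderings
agree).  The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustration only: String stands in for Text; names are hypothetical.
public class LexicographicOrderDemo {

    // Returns the given keys in lexicographic order.
    static List<String> sorted(String... keys) {
        TreeSet<String> set = new TreeSet<>();
        for (String k : keys) {
            set.add(k);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // "row10" sorts before "row2" because '1' < '2' at the first
        // differing character.
        System.out.println(sorted("row2", "row10", "row1"));
        // prints [row1, row10, row2]

        // Zero-padding the numeric part makes lexicographic order match
        // numeric order.
        System.out.println(sorted("row02", "row10", "row01"));
        // prints [row01, row02, row10]
    }
}
```

So keys that must sort numerically need either padding like the above
or a custom comparator.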

HBase works like bigtable; inserts go into memory first.  Memory is 
flushed when limits are reached.  The flushed files are compacted when 
they hit a limit.  In memory and on disk, edits are sorted.
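
The flow above can be sketched as a toy model.  This is not HBase
code: the class name, the thresholds (entry count for a flush, file
count for a compaction) and the use of in-memory TreeMaps to stand in
for on-disk files are all assumptions made to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only. Names and thresholds are hypothetical and do not
// mirror HBase's actual classes or configuration.
public class ToyStore {
    private final int flushThreshold;   // entries held in memory before a flush
    private final int compactThreshold; // flushed files before a compaction
    private TreeMap<String, String> memory = new TreeMap<>();
    private final List<TreeMap<String, String>> files = new ArrayList<>();

    public ToyStore(int flushThreshold, int compactThreshold) {
        this.flushThreshold = flushThreshold;
        this.compactThreshold = compactThreshold;
    }

    // Inserts go into the sorted in-memory map first.
    public void put(String row, String value) {
        memory.put(row, value);
        if (memory.size() >= flushThreshold) {
            flush();
        }
    }

    // Stand-in for writing the sorted memory contents out as a file.
    private void flush() {
        files.add(memory);
        memory = new TreeMap<>();
        if (files.size() >= compactThreshold) {
            compact();
        }
    }

    // Merges all flushed files into one; newer files overwrite older ones.
    private void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : files) {
            merged.putAll(file);
        }
        files.clear();
        files.add(merged);
    }

    public int fileCount() {
        return files.size();
    }

    // Returns all row keys in sorted order, across files and memory.
    public List<String> scan() {
        TreeMap<String, String> view = new TreeMap<>();
        for (TreeMap<String, String> file : files) {
            view.putAll(file);
        }
        view.putAll(memory);
        return new ArrayList<>(view.keySet());
    }
}
```

With new ToyStore(2, 3), putting rows d, a, c, b, f, e in that order
triggers three flushes and one compaction, and scan() still returns
the rows in sorted order regardless of insert order.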

If I understand the question: whether inserts are sorted or not, 
roughly the same amount of work is done.  Only the character of the 
upload as seen by the server differs, with unsorted inserts requiring 
the server to juggle more resources concurrently.

> BTW, why couldn't I access the archives of the old mailing list? When I hit
> the link http://hadoop.apache.org/mail/hbase-dev/, I got the following error
> message. The same problem happens with other mailing lists as well.
>
>   
>  Forbidden
>   
Thanks for pointing out the broken link (looks like it's broken for 
hadoop core too).  Let me try and fix it.

St.Ack
