hbase-dev mailing list archives

From Jim Kellerman <...@powerset.com>
Subject RE: Proposal: Make Rows and Columns byte arrays rather than Text (HBASE-82)
Date Sat, 26 Apr 2008 17:11:03 GMT
> -----Original Message-----
> From: stack [mailto:stack@duboce.net]
> Sent: Friday, April 25, 2008 11:54 AM
> To: hbase-dev@hadoop.apache.org
> Subject: Proposal: Make Rows and Columns byte arrays rather than Text (HBASE-82)


> Downsides:
> + If comparator needs to do more than byte compare, then needs to
> instantiate two classes for every compare (Not the case for
> UTF-8 IIRC).

Text instantiates an inner class for its comparator.
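
For comparison, a plain byte compare needs no per-compare allocation at all. The following is a minimal sketch, not code from HBase or from the proposal; the class name UnsignedBytesComparator is hypothetical. It orders keys lexicographically as unsigned bytes, which for UTF-8 encoded keys is the same order as comparing by Unicode code point.

```java
import java.util.Comparator;

/** Hypothetical helper (not an HBase class): orders byte[] keys as unsigned bytes. */
final class UnsignedBytesComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] left, byte[] right) {
        int common = Math.min(left.length, right.length);
        for (int i = 0; i < common; i++) {
            // Mask to 0..255 so that 0x80 sorts after 0x7f, not before it.
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal: the shorter key sorts first.
        return left.length - right.length;
    }
}
```

The comparator holds no state, so a single instance could be shared across all compares.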

> + Massive migration headache -- rewriting of every file (Need
> versioning of files stored in hbase).
> Issues:
> + How do we add new comparators to CLASSPATH on a running cluster?  We
> have this problem regards filters also so should come up with
> a general solution (As Kevin Beyer has noted).  Excepting
> restart -- not a soln.

Not sure this is necessary if we store the comparator with the schema. Since the schema
is slated to be removed from HRegionInfo and stored in a 'well known' location, the comparator
can be read in with the schema, which is a once-only operation per table.
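
A rough sketch of what "read in once per table" could look like, assuming the schema records the comparator as a class name. Everything here is hypothetical (TableComparators, forTable and LengthComparator are illustration names, not existing HBase code), and it only shows the once-per-table lookup; the class still has to be loadable by the server.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache: one comparator instance per table, created on first use. */
final class TableComparators {
    private final Map<String, Comparator<byte[]>> byTable = new ConcurrentHashMap<>();

    /** comparatorClassName stands in for whatever the stored schema would record. */
    Comparator<byte[]> forTable(String tableName, String comparatorClassName) {
        return byTable.computeIfAbsent(tableName, t -> instantiate(comparatorClassName));
    }

    @SuppressWarnings("unchecked")
    private static Comparator<byte[]> instantiate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (Comparator<byte[]>) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load comparator " + className, e);
        }
    }
}

/** Example comparator used only to exercise the cache: shorter keys sort first. */
final class LengthComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] left, byte[] right) {
        return Integer.compare(left.length, right.length);
    }
}
```

Repeated calls to forTable for the same table return the cached instance, so the reflective load happens once per table rather than once per compare.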

> + Regards column names as byte arrays, in the Bigtable paper, it says
> column family names need to be 'printable'.  We should have
> same requirement.  Presume the column family preface is UTF-8
> encoded.  It will make it so we can find the family/qualifier
> ':' delimiter in the column name byte array.

Do we need byte[] qualifiers? Perhaps Kevin can chime in here.
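
To illustrate the delimiter point, here is a small sketch of splitting a column name held as a byte array. It assumes the family prefix is UTF-8 and contains no ':' itself; ColumnNameSplitter and its methods are hypothetical names, not an existing API. In UTF-8 the byte 0x3A never occurs inside a multi-byte sequence, so the first 0x3A is the delimiter even if the qualifier is arbitrary binary.

```java
import java.util.Arrays;

/** Hypothetical helper: splits "family:qualifier" held as raw bytes. */
final class ColumnNameSplitter {
    private static final byte DELIMITER = (byte) ':';

    /** Returns the index of the first ':' byte, or -1 if there is none. */
    static int delimiterIndex(byte[] column) {
        for (int i = 0; i < column.length; i++) {
            if (column[i] == DELIMITER) {
                return i;
            }
        }
        return -1;
    }

    /** Bytes before the first ':' (the printable, UTF-8 family name). */
    static byte[] family(byte[] column) {
        return Arrays.copyOfRange(column, 0, requireDelimiter(column));
    }

    /** Bytes after the first ':'; may be empty or arbitrary binary. */
    static byte[] qualifier(byte[] column) {
        return Arrays.copyOfRange(column, requireDelimiter(column) + 1, column.length);
    }

    private static int requireDelimiter(byte[] column) {
        int idx = delimiterIndex(column);
        if (idx < 0) {
            throw new IllegalArgumentException("no ':' delimiter in column name");
        }
        return idx;
    }
}
```

Only the first ':' is treated as the delimiter, so a binary qualifier may itself contain 0x3A bytes.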

> + Would further customize RPC so two types of invocation: data or
> message.  There would be 'RegionServer RPC Server' that was
> hardwired into regionserver going direct to regionserver
> methods rather than via reflection.
> + While get and batch update can be made to use byte arrays, scanners
> are a little awkward.  Scanner setup would be message-type
> RPC call but the next'ing calls against the scanner would be
> data-type RPC calls.

I think changing the RPC should be a separate issue. This will be a big enough change as it
