hbase-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: partitioning and map/reduce &hbase hashcodes
Date Sun, 19 Dec 2010 19:53:33 GMT
One of the key motivators for this strategy (storing rows in sorted key
ranges instead of hash buckets) is to make range queries fast.
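
As a rough illustration of the point (a sketch, not HBase code): the
partition count, the account numbers, and the helper names below are all
made up for the example. It contrasts the `hashcode % (# of partitions)`
formula quoted further down with a range scan over sorted keys.

```python
from bisect import bisect_left

NUM_PARTITIONS = 4  # hypothetical cluster size, chosen for the example


def hash_partition(account: int) -> int:
    """Formula quoted from the GigaSpaces docs: hashcode % (# of partitions).

    In CPython, hash() of a small non-negative int is the int itself,
    which mirrors Java's Integer.hashCode().
    """
    return hash(account) % NUM_PARTITIONS


def range_scan(sorted_keys, start, stop):
    """Return every key in the half-open range [start, stop).

    Because the keys are sorted, the result is one contiguous slice.
    """
    lo = bisect_left(sorted_keys, start)
    hi = bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


# Neighbouring account numbers scatter across every partition.
accounts = [3, 4, 5, 6]
partitions_touched = {hash_partition(a) for a in accounts}

# With ordered keys, a shared prefix is a single contiguous range.
# "/" is the character right after "." in ASCII, so the range
# ["com.google.", "com.google/") covers exactly the "com.google." prefix.
keys = sorted([
    "com.google.code",
    "com.google.maps",
    "org.apache.hbase",
    "com.example.www",
])
google_keys = range_scan(keys, "com.google.", "com.google/")

print(sorted(partitions_touched))
print(google_keys)
```

With hashing, reading accounts 3 through 6 means asking every partition;
with sorted keys, the same kind of query is one contiguous read.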

On Sun, Dec 19, 2010 at 11:33 AM, Jonathan Gray <jgray@fb.com> wrote:

> HBase doesn't hashcode anything.  It does strict lexicographical ordering
> of the row keys themselves.  So yes, keys with similar prefixes may be in
> the same partition / next to each other.
>
> Rather than using a hashcode modulo some number, we use the META table to
> determine which partition (region) your key is in and also which node
> (regionserver) is hosting it right now.  Each of our shards is a range of
> rows: [start,stop) rather than a true hash table.
>
> > -----Original Message-----
> > From: Hiller, Dean (Contractor) [mailto:dean.hiller@broadridge.com]
> > Sent: Sunday, December 19, 2010 10:33 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: partitioning and map/reduce &hbase hashcodes
> >
> > We happen to be looking at gigaspaces and hbase/hadoop.  I read this in the
> > gigaspaces documentation...
> >
> >
> >
> > Target partition space ID = hashcode % (# of partitions)
> >
> >
> >
> > Is it just me, or is that bad unless you write a special String hashcode
> > that not only hashes the key but also keeps the hashcodes in roughly
> > alphabetical order, so that com.google.maps and com.google.code stay
> > relatively local?
> >
> >
> >
> > I mean, if I have ints for account numbers, and account numbers that are
> > close together are more closely related, that formula would split my
> > account numbers across the cluster, correct?  The above formula would
> > put accounts 3, 4, 5, 6 far from each other rather than on the same node.
> >
> >
> >
> > How does hbase work here with keys and such?  I assume it is much like
> > bigtable in that com.google.maps is stored near com.google.code since it is
> > an ordered map, but how is that implemented (is the hashcode rewritten, or
> > does it just use the string somehow)?
> >
> >
> >
> > Thanks,
> >
> > Dean
>
>
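
The lookup described in the quoted reply (find the region whose
[start,stop) range contains the key, then the regionserver hosting it) can
be sketched roughly as below. This is a simplified model, not the HBase
client: the region table, the server names, and the `locate` helper are
hypothetical, and plain strings stand in for row keys.

```python
from bisect import bisect_right

# Hypothetical stand-in for the META table: (start_key, regionserver),
# sorted by start key. The first region starts at the empty key, and each
# region covers [its start key, the next region's start key).
REGIONS = [
    ("", "rs1"),
    ("com.google.", "rs2"),
    ("org.", "rs3"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate(row_key: str):
    """Return the (start_key, regionserver) entry whose range holds row_key.

    bisect_right finds the first start key strictly greater than row_key,
    so the entry just before it is the region containing the key.
    """
    idx = bisect_right(START_KEYS, row_key) - 1
    return REGIONS[idx]


for key in ["com.example.www", "com.google.code",
            "com.google.maps", "org.apache.hbase"]:
    start, server = locate(key)
    print(f"{key} -> region starting at {start!r} on {server}")
```

No hash is involved: com.google.code and com.google.maps land in the same
region simply because they sort next to each other.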
