hbase-user mailing list archives

From Matt Corgan <mcor...@hotpads.com>
Subject Re: how to randomize the primary key which is a timestamp
Date Mon, 10 Jan 2011 16:41:50 GMT
You can put them all in the same table.  If you prefix the keys when
writing, use a prefix filter when querying.  I would choose a prefix window
(the number of distinct prefix values) that's about 4 times the number of
nodes.
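
A minimal sketch of the prefixing idea, in Python for brevity. It makes no
HBase client calls; it only builds row keys and the per-prefix scan ranges.
The names (NUM_BUCKETS, salted_key, scan_ranges, merge_shards), the 9-byte
key layout, and the choice of an MD5-derived prefix are all assumptions made
for illustration, not anything prescribed by HBase.

```python
import hashlib
import heapq
import struct

# Hypothetical bucket count, e.g. roughly 4x a 16-node cluster.
NUM_BUCKETS = 64


def bucket_for(timestamp_ms: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Derive a stable one-byte prefix from a hash of the timestamp."""
    if not 1 <= num_buckets <= 256:
        raise ValueError("a one-byte prefix holds at most 256 buckets")
    digest = hashlib.md5(struct.pack(">q", timestamp_ms)).digest()
    return digest[0] % num_buckets


def salted_key(timestamp_ms: int, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Row key = 1-byte bucket prefix + 8-byte big-endian timestamp."""
    prefix = bytes([bucket_for(timestamp_ms, num_buckets)])
    return prefix + struct.pack(">q", timestamp_ms)


def scan_ranges(start_ms: int, stop_ms: int, num_buckets: int = NUM_BUCKETS):
    """One (start_row, stop_row) pair per bucket covering [start_ms, stop_ms)."""
    return [
        (bytes([b]) + struct.pack(">q", start_ms),
         bytes([b]) + struct.pack(">q", stop_ms))
        for b in range(num_buckets)
    ]


def merge_shards(shard_results):
    """Merge per-bucket results (each sorted by key) back into time order."""
    # Ignore the prefix byte when comparing, so ordering follows the timestamp.
    return heapq.merge(*shard_results, key=lambda row_key: row_key[1:])
```

Because the prefix here is derived from a hash rather than chosen at random,
a single-timestamp lookup can recompute the prefix and go straight to one
key. A range read still has to issue one scan per bucket and merge the
results.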


On Mon, Jan 10, 2011 at 11:30 AM, Ted Dunning <tdunning@maprtech.com> wrote:

> If multiple tables have the same key distribution and count, then they will
> have similar split points for their regions, but the locations of the
> regions will be randomized.
>
> I wouldn't worry about this until you see evidence it is a problem.
>
> On Mon, Jan 10, 2011 at 8:20 AM, Weishung Chung <weishung@gmail.com>
> wrote:
>
> > Thank you for the replies.
> > Most of the queries (70%) will scan a range of consecutive times,
> > with some single-timestamp queries (30%).
> > But there are multiple tables with the same range of timestamps. Will all
> > of these timestamp ranges from the multiple tables be stored on the same
> > region server, and if so, could that affect the performance of MapReduce
> > jobs (operating on those tables over the same time periods)? Would
> > hotspotting defeat the purpose of MapReduce?
> >
> > On Mon, Jan 10, 2011 at 10:08 AM, Matt Corgan <mcorgan@hotpads.com> wrote:
> >
> > > You can also add a random (or hashed) prefix to the beginning of the key.
> > > If your prefix were one byte with values 0-63, you've divided the hot
> > > spot into 64 smaller ones, which is better for writing.  The downside is
> > > that if you want to read a range of values, you will have to query all 64
> > > "shards" and merge the sorted values.  You can choose whatever prefix
> > > size is best for your scenario.
> > >
> > >
> > > On Mon, Jan 10, 2011 at 11:05 AM, Chirstopher Tarnas <cft@email.com>
> > > wrote:
> > >
> > > > Some options that I am aware of:
> > > >
> > > > reverse the byte order of the timestamp
> > > > use UUIDs rather than a timestamp
> > > > use hashing; whether this works really depends on your requirements
> > > >
> > > > On Mon, Jan 10, 2011 at 9:33 AM, Weishung Chung <weishung@gmail.com>
> > > > wrote:
> > > >
> > > > > What is a good way to randomize a primary key that is a timestamp
> > > > > in HBase, to avoid hotspotting?
> > > > > Thank you so much :)
> > > > >
> > > >
> > >
> >
>
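
Of the options listed at the start of the thread, reversing the byte order
of the timestamp can be sketched as follows. This is an illustrative Python
snippet, not HBase code; the function names and the 8-byte big-endian
encoding of a millisecond timestamp are assumptions.

```python
import struct


def reversed_timestamp_key(timestamp_ms: int) -> bytes:
    """Encode the timestamp as 8 big-endian bytes, then reverse the bytes."""
    # The fastest-changing byte now leads the key, so consecutive
    # timestamps start with different bytes instead of sharing a long prefix.
    return struct.pack(">q", timestamp_ms)[::-1]


def timestamp_from_key(row_key: bytes) -> int:
    """Undo the reversal to recover the original timestamp."""
    return struct.unpack(">q", row_key[::-1])[0]
```

The trade-off is that consecutive timestamps no longer occupy a contiguous
key range, so a scan over a range of consecutive times cannot be expressed
as a single start/stop row pair.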
