hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Presplit regions when creating a table
Date Thu, 05 Jul 2012 14:28:59 GMT
No, you need to know your key ranges for each split. If you don't and you guess wrong, you
may not see any benefit, because your data may still end up going to a single region.
(It's data dependent.)
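
To illustrate what "guessing wrong" looks like, here is a small Python sketch. It is not HBase code; it only mimics the rule that a row lands in the region whose sorted key range contains its row key. The type names, sample keys and function names are made up for illustration, and plain ASCII keys are assumed so that string order matches byte order. The numeric split points are the ones from the shell example quoted further down.

```python
from bisect import bisect_right
from collections import Counter

def region_index(row_key, split_keys):
    """Index of the region that holds row_key.

    split_keys must be sorted; each split key is the first key of the
    region to its right, so n split keys give n + 1 regions.
    """
    return bisect_right(split_keys, row_key)

def distribution(row_keys, split_keys):
    """Number of rows per region, in region order."""
    counts = Counter(region_index(k, split_keys) for k in row_keys)
    return [counts.get(i, 0) for i in range(len(split_keys) + 1)]

# Hypothetical keys in the "typeString_date_Id" layout, 100 rows per type.
types = ["alert", "click", "error", "login", "trade"]
keys = [f"{t}_20120705_{i:04d}" for t in types for i in range(100)]

guessed = ["111111", "222222", "333333", "444444"]  # unrelated to the real keys
by_type = ["click", "error", "login", "trade"]      # follows the real key ranges

print(distribution(keys, guessed))  # [0, 0, 0, 0, 500]
print(distribution(keys, by_type))  # [100, 100, 100, 100, 100]
```

With the guessed split points every key sorts after '444444', so all 500 rows go to the last region and the other four stay empty.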

I am personally not a fan of pre-splitting a table. 

The way I look at it, you only really have to deal with this when you first create a table.
However, once your application is in a steady state, the tables will split naturally and you
should have enough regions to get decent performance. 

Of course YMMV...


On Jul 5, 2012, at 4:16 AM, Christian Schäfer wrote:

> Hi,
> 
> I haven't heard of a way to split by regex. Maybe somebody else will post here if it's possible.
> 
> But you could maybe work around that by mapping from regex to region in your client code.
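
One possible reading of the client-side mapping idea, sketched in Python: since rows are placed purely by the sort order of the row key, the client can test each key against a list of patterns and prepend a short bucket prefix, and the table is then pre-split on those prefixes. The patterns, prefixes and function name below are invented for illustration and are not part of any HBase API.

```python
import re

# Hypothetical rules: the first matching pattern decides the bucket prefix.
RULES = [
    (re.compile(r"^typeA_"), "01"),
    (re.compile(r"^type[BC]_"), "02"),
]
DEFAULT_PREFIX = "03"  # everything that matches no rule

def prefixed_key(row_key):
    """Return the row key to store, with its bucket prefix prepended."""
    for pattern, prefix in RULES:
        if pattern.search(row_key):
            return f"{prefix}|{row_key}"
    return f"{DEFAULT_PREFIX}|{row_key}"

# Split points to use when creating the table: one per bucket boundary,
# giving three regions (01..., 02..., 03...).
SPLITS = ["02", "03"]

print(prefixed_key("typeA_20120705_0001"))
print(prefixed_key("typeC_20120705_0002"))
```

Readers have to apply the same mapping before every get or scan, otherwise they will look for the unprefixed key and find nothing.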
> 
> If that's not an option and it's too difficult to decide how to pre-split, you could rely on the auto-splitting that occurs when hbase.hregion.max.filesize is reached.
> 
> 
> An often helpful online reference: http://hbase.apache.org/book.html -> see 2.8.2.7. Managed Splitting
> 
> regards
> Chris
> 
> 
> 
> 
> ________________________________
> From: Prakrati Agrawal <Prakrati.Agrawal@mu-sigma.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>; Christian Schäfer <syrious3000@yahoo.de>
> Sent: 6:30 Thursday, 5 July 2012
> Subject: RE: Presplit regions when creating a table
> 
> Hi
> 
> Can I do splits on regular expressions instead of specific keys? For example, keys having a particular pattern go to node#1 and others go to node#2, etc.
> 
> Thanks and Regards
> Prakrati
> -----Original Message-----
> From: Christian Schäfer [mailto:syrious3000@yahoo.de]
> Sent: Wednesday, July 04, 2012 5:14 PM
> To: user@hbase.apache.org
> Subject: Re: Presplit regions when creating a table
> 
> The simplest way to pre-split a table is at table creation, using the hbase shell and specifying the split keys.
> 
> This could look like this:
>
>     create 'mytable', 'myfamily', {SPLITS => ['111111', '222222', '333333', '444444']}
> 
> resulting in 5 regions: [below-111111[, [111111-222222[, [222222-333333[, [333333-444444[, [444444-above[
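
The "n split keys give n + 1 regions" rule can be sketched in a few lines of Python. This is only an illustration of the boundaries, not HBase code; the function name is hypothetical, and the empty string stands in for the open-ended start of the first region and end of the last one.

```python
def region_ranges(split_keys):
    """Return (start_key, end_key) pairs, start inclusive and end exclusive.

    '' marks an open boundary: no lower limit for the first region and
    no upper limit for the last one.
    """
    bounds = [""] + sorted(split_keys) + [""]
    return list(zip(bounds[:-1], bounds[1:]))

ranges = region_ranges(["111111", "222222", "333333", "444444"])
print(len(ranges))  # 5
for start, end in ranges:
    print(f"[{start or 'below'}, {end or 'above'})")
```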
> 
> If you store only a limited number of attributes per row, you should consider OpenTSDB, which is built on top of HBase and is aimed at time series data.
> 
> regards
> Chris
> 
> 
> 
> ----- Original Message -----
> From: Prakrati Agrawal <Prakrati.Agrawal@mu-sigma.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> CC:
> Sent: 13:23 Wednesday, 4 July 2012
> Subject: Presplit regions when creating a table
> 
> Dear all,
>
> I am using HBase 0.90.6.
> I have streaming data which I want to store in an HBase table. I designed the row key as "typeString_date_Id", where typeString takes one of 5 values. The problem is that the types are not evenly distributed, i.e. one type occurs far more often than another, so when I start inserting data I will see hotspotting on some region servers compared to others. To avoid this, I thought I would pre-split the regions. I do not understand how to use the region splitter to my benefit. Could I get a code snippet showing how to do it? I am using the RegionSplitter interface for this.
> 
> Thanks
> Prakrati
> 
> ________________________________
> This email message may contain proprietary, private and confidential information. The information transmitted is intended only for the person(s) or entities to which it is addressed. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited and may be illegal. If you received this in error, please contact the sender and delete the message from your system.
>
> Mu Sigma takes all reasonable steps to ensure that its electronic communications are free from viruses. However, given Internet accessibility, the Company cannot accept liability for any virus introduced by this e-mail or any attachment and you are advised to use up-to-date virus checking software.
> 

