hbase-user mailing list archives

From "Ramkrishna.S.Vasudevan" <ramkrishna.vasude...@huawei.com>
Subject RE: reduce influence of auto-splitting region
Date Thu, 06 Sep 2012 06:15:12 GMT
Yes.  The row keys you generate should fall within the start and end key
range of one of the regions, so that HBase can internally route each put to
the region server hosting that region.
As mentioned in http://hbase.apache.org/book/perf.writing.html, we also need
to take care not to turn any one region into a hot region.
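
To illustrate the range check, here is a minimal sketch in plain Python (not
HBase client code).  It assumes regions are described by a sorted list of
split keys, with each region covering a start key (inclusive) up to the next
split key (exclusive), and that keys compare byte by byte.  The function name
region_for is hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def region_for(row_key: bytes, split_keys: Sequence[bytes]) -> int:
    """Return the index of the region whose key range contains row_key.

    Region 0 starts at the empty key; region i (i >= 1) starts at
    split_keys[i - 1].  split_keys must be sorted in ascending byte order.
    """
    # The number of split keys <= row_key is the index of the owning region.
    return bisect_right(split_keys, row_key)


# Hypothetical hourly split keys (yyyymmddHH), giving three regions.
splits = [b"2012090601", b"2012090602"]
print(region_for(b"2012090600-x", splits))    # before the first split key
print(region_for(b"2012090601-abc", splits))  # between the two split keys
```

A key that sorts before the first split key lands in the first region, and a
key equal to a split key lands in the region that starts at that key.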

Suppose the data for a span of 30 minutes is collected and then passed on to
HBase.  The client can then be written so that the puts are distributed
equally across the regions that hold that 30 minutes of data.
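
One way this could look is sketched below in plain Python (not HBase client
code).  Everything in it is an assumption made for illustration: the row key
layout (hour prefix, two-digit bucket, record id), the bucket count of 5, the
use of CRC32 to pick a bucket, and the names make_row_key and make_split_keys.

```python
import zlib
from datetime import datetime, timedelta
from typing import List

BUCKETS_PER_HOUR = 5  # assumed number of regions per hour


def hour_prefix(ts: datetime) -> str:
    """Format the hour a timestamp falls in as yyyymmddHH."""
    return ts.strftime("%Y%m%d%H")


def make_row_key(ts: datetime, record_id: str,
                 buckets: int = BUCKETS_PER_HOUR) -> bytes:
    """Build a row key of the form <hour><bucket>-<record_id>.

    The bucket comes from a stable hash of the record id, so puts for the
    same hour are spread over all of that hour's regions.
    """
    bucket = zlib.crc32(record_id.encode("utf-8")) % buckets
    return f"{hour_prefix(ts)}{bucket:02d}-{record_id}".encode("utf-8")


def make_split_keys(start: datetime, hours: int,
                    buckets: int = BUCKETS_PER_HOUR) -> List[bytes]:
    """Return sorted split keys, one per (hour, bucket) pair.

    The very first key is dropped because the first region implicitly
    starts at the empty key.
    """
    keys = []
    for h in range(hours):
        prefix = hour_prefix(start + timedelta(hours=h))
        for b in range(buckets):
            keys.append(f"{prefix}{b:02d}".encode("utf-8"))
    return keys[1:]
```

The split keys would be supplied when the table is created, and the client
would build every row key with make_row_key before issuing the put.  The
trade-off with this layout is that reading back one hour means scanning each
of that hour's buckets.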

Hope this helps.

Regards
Ram

> -----Original Message-----
> From: jing wang [mailto:happygodwithwang@gmail.com]
> Sent: Wednesday, September 05, 2012 8:00 PM
> To: user@hbase.apache.org
> Subject: Re: reduce influence of auto-splitting region
> 
> Hi Ram,
> 
>   How do we drive the data to the specific hourly region? With code like
> that at http://hbase.apache.org/book/perf.writing.html?
> 
> 
> Thanks,
> Jing Wang
> 
> 2012/9/5 Ramkrishna.S.Vasudevan <ramkrishna.vasudevan@huawei.com>
> 
> > Hi JingWang
> >
> > A region split does not necessarily cause GC problems.  Depending on
> > your use case, we may need to configure the heap space for the RS.
> > Coming back to region splits, presplitting the tables at creation is a
> > good option.
> > Assume I know that the data coming into HBase arrives on an hourly
> > basis.  Then one option is to presplit your table based on the hours
> > and assign the regions in round-robin fashion to every RS.
> > This ensures that any particular hour's data goes only into the region
> > specified for that hour.  Once that hour is over, the incoming data
> > moves on to another region server.
> > Here again, every hour can be split equally across the different RSs,
> > for example into 5 or 10 regions within an hour.
> > These are some of the options; choose among them according to the data
> > your cluster will be operating on.
> >
> > Regards
> > Ram
> >
> > > -----Original Message-----
> > > From: jing wang [mailto:happygodwithwang@gmail.com]
> > > Sent: Wednesday, September 05, 2012 6:42 PM
> > > To: user@hbase.apache.org
> > > Subject: Re: reduce influence of auto-splitting region
> > >
> > > Hi Ram,
> > >
> > > Thanks for your advice. We did consider what you said.
> > > HBase is used as realtime storage, just like MySQL/Oracle. When a
> > > region splits, HBase may trigger a 'stop the world' GC or a long
> > > full GC.
> > > Our application can't accept this.
> > >
> > > Thanks,
> > > Jing Wang
> > >
> > > 2012/9/5 Ramkrishna.S.Vasudevan <ramkrishna.vasudevan@huawei.com>
> > >
> > > > You can use the property hbase.hregion.max.filesize.  You can set
> > > > this to a higher value and control the splits through your
> > > > application.
> > > >
> > > > Regards
> > > > Ram
> > > >
> > > > > -----Original Message-----
> > > > > From: jing wang [mailto:happygodwithwang@gmail.com]
> > > > > Sent: Wednesday, September 05, 2012 3:48 PM
> > > > > To: user@hbase.apache.org
> > > > > Subject: reduce influence of auto-splitting region
> > > > >
> > > > > Hi there,
> > > > >
> > > > >   Using HBase as realtime storage (7*24h), how can we reduce the
> > > > > influence of region auto-splitting?
> > > > >   Any advice will be appreciated!
> > > > >
> > > > >
> > > > > Thanks,
> > > > > Jing
> > > >
> > > >
> >
> >

