hbase-dev mailing list archives

From Mikael Sitruk <mikael.sit...@gmail.com>
Subject Re: Scan performance on a big table as combination of multiple logic tables
Date Tue, 21 Feb 2012 21:57:52 GMT
See inline

On Feb 21, 2012 11:40 PM, "Jean-Daniel Cryans" <jdcryans@apache.org> wrote:
>
> On Tue, Feb 21, 2012 at 1:17 PM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
> > This is interesting J.D. so, is there a limitation on the region size or
> > not?
>
> Your imagination? Like I said nothing blocks you in the code.
>
> > Can it be really any number?
>
> That's what it implies.
>
> > If so beside the collection time is there
> > any impact (perhaps the documentation should be updated too)?
>
> Collection time? You mean GC? Sorry I don't get what you mean.
>

*Sorry, typo (from mobile): I meant compaction, not collection.

> > Regarding the number of regions you have (14,398) is it for a single RS?
> > What is your number of RS?
>
> Currently 91 in that cluster. It varies :)
>
> We have >200 tables coming all in different sizes.

*Not clear: is that 91 RS with 14,398 regions in total, or 14,398 regions per RS?
Mikael.S

> J-D
>
> >
> > Mikael.S
> > On Feb 21, 2012 10:09 PM, "Jean-Daniel Cryans" <jdcryans@apache.org> wrote:
> >
> >> On Sun, Feb 19, 2012 at 1:45 PM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
> >> > During compaction the region is not out of service.
> >> > According to the documentation, the max region size for the V2 format is 20G.
> >> > And now the question: assuming that 20G is the limit and the number of
> >> > regions in a single RS should stay low (< 500), it means that there is no
> >> > point in having an RS with more than 10TB of storage for HBase to use
> >> > (otherwise locality will not be achieved for some servers; I also assume
> >> > that compression is used and therefore compensates for the additional
> >> > space needed for replication)?
> >> > If the max number of regions per RS is smaller, then the storage size is
> >> > even smaller. Is that correct?
> >>
> >> In the documentation 20GB is given as an example of a larger size that
> >> can be supported, but nothing blocks you from going way higher than
> >> that. I've done some import tests and had 100GB regions. It just takes
> >> a while to compact the bigger files.
> >>
> >> Also you can go over 500 regions, in fact one of our clusters has
> >> 14,398 regions right now. It's just a pain to reassign everything when
> >> HBase boots but this is an offline cluster.
> >>
> >> J-D
> >>
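
The sizing arithmetic discussed in this thread can be sketched as follows. This is only a back-of-the-envelope illustration, not anything taken from HBase itself: the function names are hypothetical, the 20GB and 500-region figures are the ones quoted from the question, the 100GB figure is the region size J-D mentions from his import tests, and the per-RS average assumes the 14,398 regions are a cluster-wide total across 91 region servers, which the thread leaves unconfirmed.

```python
# Back-of-the-envelope sizing sketch; function names are hypothetical.

def max_storage_per_rs_gb(max_region_size_gb, max_regions_per_rs):
    """Upper bound on HBase data one region server holds if every region is full."""
    return max_region_size_gb * max_regions_per_rs


def avg_regions_per_rs(total_regions, region_servers):
    """Average region count per server, ASSUMING total_regions is cluster-wide."""
    return total_regions / region_servers


# 20GB regions, 500 regions per RS: the 10TB ceiling from the question.
print(max_storage_per_rs_gb(20, 500))

# 100GB regions (the size reported from import tests), same region count.
print(max_storage_per_rs_gb(100, 500))

# 14,398 regions over 91 region servers, if the figure is a cluster total.
print(round(avg_regions_per_rs(14398, 91)))
```

Since nothing in the code enforces a region size limit, the first figure is a consequence of the chosen region size rather than a hard ceiling; raising the region size raises it proportionally, at the cost of longer compactions.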
