accumulo-dev mailing list archives

From Ted Malaska <ted.mala...@cloudera.com>
Subject Re: HBase and Accumulo
Date Wed, 19 Aug 2015 19:40:25 GMT
HBase region splits can be done through a variety of strategies. Data size
can be a component in those strategies. There's no hard and fast rule for
how large a region can be. There are tradeoffs with larger or smaller
region sizes. A region split strategy will depend upon a number of factors:
memstore use, scan parallelism, compaction strategies, data size, and
hardware.
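
To illustrate only the data-size component, here is a rough sketch of a
purely size-based split check. It is a hypothetical illustration, not HBase's
or Accumulo's actual split code: the Region class, needs_split, and the
10 GiB threshold are made-up names and values.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Region:
    """Hypothetical stand-in for a region (HBase) or tablet (Accumulo)."""
    name: str
    record_count: int
    avg_record_bytes: int

    @property
    def size_bytes(self) -> int:
        # Approximate stored size; a real system would measure store files.
        return self.record_count * self.avg_record_bytes


def needs_split(region: Region, max_region_bytes: int) -> bool:
    # Size-based rule: the record count alone never triggers a split.
    return region.size_bytes > max_region_bytes


def regions_to_split(regions, max_region_bytes):
    return [r.name for r in regions if needs_split(r, max_region_bytes)]


regions = [
    # One billion tiny records: about 5 GB in total.
    Region("many-small", record_count=1_000_000_000, avg_record_bytes=5),
    # Ten million larger records: about 20 GB in total.
    Region("few-large", record_count=10_000_000, avg_record_bytes=2_000),
]

print(regions_to_split(regions, max_region_bytes=10 * GIB))
```

With a 10 GiB threshold only "few-large" is flagged, even though it holds
100x fewer records. A real policy would weigh the other factors as well.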
On Aug 19, 2015 3:06 PM, "Christopher" <ctubbsii@apache.org> wrote:

> Forgive my ignorance about HBase, but wouldn't size of records count,
> also? Your response seems to imply that number of records is what
> matters for how many regions are needed. For what it's worth,
> Accumulo's tablets are split based on storage size, not number of
> records. I assumed the same was true for HBase. Am I wrong?
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, Aug 19, 2015 at 2:28 PM, Ted Malaska <ted.malaska@cloudera.com> wrote:
> > I've been doing HBase for a long time and never had an issue with region
> > count limits and I have clusters with 10s of billions of records.  Maybe
> > there would be issues around a couple trillion records, but I've never
> > gotten that high yet.
> >
> > Ted Malaska
> >
> > On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
> >
> >> Oh, one other thing that I should mention (was prompted off-list).
> >>
> >> (definition time since cross-list now: HBase regions == Accumulo tablets)
> >>
> >> Accumulo will handle many more regions than HBase does now due to a
> >> splittable metadata table. While I was told this was a very long and
> >> arduous journey to implement correctly (WRT splitting, merges and bulk
> >> loading), users with "too many regions" problems are extremely few and
> >> far between for Accumulo.
> >>
> >> I was very happy to see effort/design being put into this in HBase. And,
> >> just to be fair in criticism/praises, HBase does appear to me to do
> >> assignments of regions much faster than Accumulo does on a small cluster
> >> (~5-10 nodes). Accumulo may take a few seconds to notice and reassign
> >> tablets. I have yet to notice this with HBase (which also could be due
> >> to lack of personal testing).
> >>
> >>
> >> Jerry He wrote:
> >>
> >>> Hi, folks
> >>>
> >>> We have people that are evaluating HBase vs Accumulo.
> >>> Security is an important factor.
> >>>
> >>> But I think after the Cell security was added in HBase, there is no
> >>> more real gap compared to Accumulo.
> >>>
> >>> I know we have both HBase and Accumulo experts on this list.
> >>> Could someone shed more light?
> >>> I am looking for real gaps comparing HBase to Accumulo, if there are any,
> >>> so that I can be prepared to address them. This is not limited to the
> >>> security area.
> >>>
> >>> There are differences in some features and implementations. But they
> >>> don't seem like real 'gaps'.
> >>>
> >>> Any comments and feedback are welcome.
> >>>
> >>> Thanks,
> >>>
> >>> Jerry
> >>>
> >>>
>
