accumulo-dev mailing list archives

From Jeremy Kepner <kep...@ll.mit.edu>
Subject Re: HBase and Accumulo
Date Wed, 19 Aug 2015 21:34:06 GMT
I agree that performance on real apps is what matters most to
any particular organization, but as technologists how do we measure ourselves?
Hence imperfect benchmarking remains our only recourse.

On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> I can't speak for anyone other than myself in the HBase community, but I'm
> much more interested and focused on performance analysis and
> developing/deploying for the use cases of my employer than participating in
> generic bench-marketing to make weapons for happy OSS warriors. Perhaps
> this does a disservice to the HBase project overall and if so then I
> apologize to others on the project for that.
> 
> That said, from long and bitter experience let me state the only benchmarks
> that ever really matter are the comparative benchmarks you make for your
> own use cases in your own environments, preferably exercising those
> candidates with real data and operating conditions. See:
> https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
> 
> 
> 
> On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser <josh.elser@gmail.com> wrote:
> 
> > Alright, I have to ask... are you referring to the paper that cites
> > Accumulo performance without write-ahead logs enabled? I have some serious
> > reservations about the relevance of that paper to this conversation and
> > just want to make sure people aren't led astray by what the actual takeaway
> > should be.
> >
> > Jeremy Kepner wrote:
> >
> >> A big difference between Accumulo and HBase is the published performance
> >> numbers. The Accumulo community has done a good job of continuing to
> >> publish up-to-date performance numbers in peer-reviewed venues which allow
> >> Accumulo to claim best in the world performance.
> >>
> >> The HBase community hasn't been doing that so much.  It would be great if
> >> they did because the HBase points on the graphs are old and it would be
> >> good to get new ones.
> >>
> >>
> >>
> >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
> >>
> >>> Like I've said many times now, it's relative to your actual problem.
> >>> If you don't have that much data (or intend to grow into that much
> >>> data), it's not an issue. Obviously, this is the case for you.
> >>>
> >>> However, it is an architectural difference between the two projects
> >>> with known limitations for a single metadata region. It's a
> >>> difference, which is what Jerry asked for.
> >>>
> >>> Ted Malaska wrote:
> >>>
> >>>> I've been doing HBase for a long time and never had an issue with region
> >>>> count limits and I have clusters with 10s of billions of records.  Maybe
> >>>> there would be issues around a couple trillion records, but I never got
> >>>> that high yet.
> >>>>
> >>>> Ted Malaska
> >>>>
> >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser<josh.elser@gmail.com>
> >>>>  wrote:
> >>>>
> >>>>> Oh, one other thing that I should mention (was prompted off-list).
> >>>>>
> >>>>> (definition time since cross-list now: HBase regions == Accumulo
> >>>>> tablets)
> >>>>>
> >>>>> Accumulo will handle many more regions than HBase does now due to a
> >>>>> splittable metadata table. While I was told this was a very long and
> >>>>> arduous journey to implement correctly (WRT splitting, merges and bulk
> >>>>> loading), users with "too many regions" problems are extremely few and
> >>>>> far between for Accumulo.
> >>>>>
> >>>>> I was very happy to see effort/design being put into this in HBase.
> >>>>> And, just to be fair in criticism/praises, HBase does appear to me to do
> >>>>> assignments of regions much faster than Accumulo does on a small
> >>>>> cluster (~5-10 nodes). Accumulo may take a few seconds to notice and
> >>>>> reassign tablets. I have yet to notice this with HBase (which also could
> >>>>> be due to lack of personal testing).
> >>>>>
> >>>>>
> >>>>> Jerry He wrote:
> >>>>>
> >>>>>> Hi, folks
> >>>>>>
> >>>>>> We have people that are evaluating HBase vs Accumulo.
> >>>>>> Security is an important factor.
> >>>>>>
> >>>>>> But I think after the Cell security was added in HBase, there is no
> >>>>>> more real gap compared to Accumulo.
> >>>>>>
> >>>>>> I know we have both HBase and Accumulo experts on this list.
> >>>>>> Could someone shed more light?
> >>>>>> I am looking for real gaps comparing HBase to Accumulo, if there are any,
> >>>>>> so that I can be prepared to address them. This is not limited to the
> >>>>>> security area.
> >>>>>>
> >>>>>> There are differences in some features and implementations. But they
> >>>>>> don't seem like real 'gaps'.
> >>>>>>
> >>>>>> Any comments and feedback are welcome.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Jerry
> >>>>>>
> >>>>>>
> >>>>>>
> 
> 
> -- 
> Best regards,
> 
>    - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
