accumulo-dev mailing list archives

From Ted Malaska <ted.mala...@cloudera.com>
Subject Re: HBase and Accumulo
Date Wed, 19 Aug 2015 23:01:20 GMT
That being said, what is the use case that you feel you need a NoSQL solution
for?
On Aug 19, 2015 6:54 PM, "Ted Malaska" <ted.malaska@cloudera.com> wrote:

> I'm on the side of benchmarking for the use case and with an expert.
> There are so many ways to cheat a benchmark.  And the benchmark may not be
> anything like your use case.
> On Aug 19, 2015 5:43 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
>
>> I think someone who uses third party benchmarks to assess a system like
>> HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
>> perhaps we must agree to disagree.
>>
>>
>> On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner <kepner@ll.mit.edu> wrote:
>>
>> > I agree that performance on real apps is the most important for
>> > any particular organization, but as technologists how do we measure
>> > ourselves?
>> > Hence imperfect benchmarking remains our only recourse.
>> >
>> > On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
>> > > I can't speak for anyone other than myself in the HBase community, but I'm
>> > > much more interested and focused on performance analysis and
>> > > developing/deploying for the use cases of my employer than participating in
>> > > generic bench-marketing to make weapons for happy OSS warriors. Perhaps
>> > > this does a disservice to the HBase project overall and if so then I
>> > > apologize to others on the project for that.
>> > >
>> > > That said, from long and bitter experience let me state the only benchmarks
>> > > that ever really matter are the comparative benchmarks you make for your
>> > > own use cases in your own environments, preferably exercising those
>> > > candidates with real data and operating conditions. See:
>> > > https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
>> > >
>> > >
>> > >
>> > > On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser <josh.elser@gmail.com> wrote:
>> > >
>> > > > Alright, I have to ask... are you referring to the paper that cites
>> > > > Accumulo performance without write-ahead logs enabled? I have some serious
>> > > > reservations about the relevance of that paper to this conversation and
>> > > > just want to make sure people aren't led astray by what the actual takeaway
>> > > > should be.
>> > > >
>> > > > Jeremy Kepner wrote:
>> > > >
>> > > >> A big difference between Accumulo and HBase is the published performance
>> > > >> numbers.
>> > > >> The Accumulo community has done a good job of continuing to publish
>> > > >> up-to-date performance numbers in peer-reviewed venues which allow
>> > > >> Accumulo to claim best in the world performance.
>> > > >>
>> > > >> The HBase community hasn't been doing that so much.  It would be great if
>> > > >> they did because the HBase points on the graphs are old and it would be
>> > > >> good to get new ones.
>> > > >>
>> > > >>
>> > > >>
>> > > >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
>> > > >>
>> > > >>> Like I've said many times now, it's relative to your actual problem.
>> > > >>> If you don't have that much data (or intend to grow into that much
>> > > >>> data), it's not an issue. Obviously, this is the case for you.
>> > > >>>
>> > > >>> However, it is an architectural difference between the two projects
>> > > >>> with known limitations for a single metadata region. It's a
>> > > >>> difference, which is what Jerry asked for.
>> > > >>>
>> > > >>> Ted Malaska wrote:
>> > > >>>
>> > > >>>> I've been doing HBase for a long time and never had an issue with region
>> > > >>>> count limits and I have clusters with 10s of billions of records. Maybe
>> > > >>>> there would be issues around a couple trillion records, but never got
>> > > >>>> that high yet.
>> > > >>>>
>> > > >>>> Ted Malaska
>> > > >>>>
>> > > >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
>> > > >>>>
>> > > >>>>> Oh, one other thing that I should mention (was prompted off-list).
>> > > >>>>>
>> > > >>>>> (definition time since cross-list now: HBase regions == Accumulo
>> > > >>>>> tablets)
>> > > >>>>>
>> > > >>>>> Accumulo will handle many more regions than HBase does now due to a
>> > > >>>>> splittable metadata table. While I was told this was a very long and
>> > > >>>>> arduous journey to implement correctly (WRT splitting, merges and bulk
>> > > >>>>> loading), users with "too many regions" problems are extremely few and
>> > > >>>>> far between for Accumulo.
>> > > >>>>>
>> > > >>>>> I was very happy to see effort/design being put into this in HBase. And,
>> > > >>>>> just to be fair in criticism/praises, HBase does appear to me to do
>> > > >>>>> assignments of regions much faster than Accumulo does on a small cluster
>> > > >>>>> (~5-10 nodes). Accumulo may take a few seconds to notice and reassign
>> > > >>>>> tablets. I have yet to notice this with HBase (which also could be due to
>> > > >>>>> lack of personal testing).
>> > > >>>>>
>> > > >>>>>
>> > > >>>>> Jerry He wrote:
>> > > >>>>>
>> > > >>>>>> Hi, folks
>> > > >>>>>>
>> > > >>>>>> We have people that are evaluating HBase vs Accumulo.
>> > > >>>>>> Security is an important factor.
>> > > >>>>>>
>> > > >>>>>> But I think after the Cell security was added in HBase, there is no more
>> > > >>>>>> real gap compared to Accumulo.
>> > > >>>>>>
>> > > >>>>>> I know we have both HBase and Accumulo experts on this list.
>> > > >>>>>> Could someone shed more light?
>> > > >>>>>> I am looking for real gaps comparing HBase to Accumulo, if there are any,
>> > > >>>>>> so that I can be prepared to address them. This is not limited to the
>> > > >>>>>> security area.
>> > > >>>>>>
>> > > >>>>>> There are differences in some features and implementations. But they
>> > > >>>>>> don't seem like real 'gaps'.
>> > > >>>>>>
>> > > >>>>>> Any comments and feedback are welcome.
>> > > >>>>>>
>> > > >>>>>> Thanks,
>> > > >>>>>>
>> > > >>>>>> Jerry
>> > > >>>>>>
>> > > >>>>>>
>> > > >>>>>>
>> > >
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > >    - Andy
>> > >
>> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > > (via Tom White)
>> >
>>
>>
>>
>> --
>> Best regards,
>>
>>    - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>>
>
