hbase-dev mailing list archives

From Jerry He <jerry...@gmail.com>
Subject Re: HBase and Accumulo
Date Wed, 19 Aug 2015 23:15:39 GMT
Hi, folks

Thanks so much for all the responses and comments.

We don't have or support Accumulo yet. We support HBase. There have been
requests for Accumulo. Like Ted said, almost all come from the federal sector
and banks (even foreign banks).
They seem to have references or reference implementations for their use
cases. My work of persuasion for HBase has not been very successful.

I have looked into HBase cell security. There may be some differences
and gaps, as Sean mentioned. Overall, I think the visibility coverage
plus the ACLs are great.

Technology aside, Accumulo probably already has an established reputation in
the specific areas it is good at.

It will probably be a slowly evolving process ...

Jerry



On Wed, Aug 19, 2015 at 3:54 PM, Ted Malaska <ted.malaska@cloudera.com>
wrote:

> I'm on the side of benchmarking for the use case and with an expert.  There
> are so many ways to cheat a benchmark.  And the benchmark may not be
> anything like your use case.
> On Aug 19, 2015 5:43 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
>
> > I think someone who uses third party benchmarks to assess a system like
> > HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
> > perhaps we must agree to disagree.
> >
> >
> > On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner <kepner@ll.mit.edu> wrote:
> >
> > > I agree that performance on real apps is the most important thing for
> > > any particular organization, but as technologists how do we measure
> > > ourselves?
> > > Hence imperfect benchmarking remains our only recourse.
> > >
> > > On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> > > > I can't speak for anyone other than myself in the HBase community, but
> > > > I'm much more interested and focused on performance analysis and
> > > > developing/deploying for the use cases of my employer than
> > > > participating in generic bench-marketing to make weapons for happy OSS
> > > > warriors. Perhaps this does a disservice to the HBase project overall
> > > > and if so then I apologize to others on the project for that.
> > > >
> > > > That said, from long and bitter experience let me state the only
> > > > benchmarks that ever really matter are the comparative benchmarks you
> > > > make for your own use cases in your own environments, preferably
> > > > exercising those candidates with real data and operating conditions.
> > > > See: https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
> > > >
> > > >
> > > >
> > > > On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser <josh.elser@gmail.com> wrote:
> > > >
> > > > > Alright, I have to ask... are you referring to the paper that cites
> > > > > Accumulo performance without write-ahead logs enabled? I have some
> > > > > serious reservations about the relevance of that paper to this
> > > > > conversation and just want to make sure people aren't led astray by
> > > > > what the actual takeaway should be.
> > > > >
> > > > > Jeremy Kepner wrote:
> > > > >
> > > > > >> A big difference between Accumulo and HBase is the published
> > > > > >> performance numbers.
> > > > > >> The Accumulo community has done a good job of continuing to publish
> > > > > >> up-to-date performance numbers in peer-reviewed venues which allow
> > > > > >> Accumulo to claim best in the world performance.
> > > > > >>
> > > > > >> The HBase community hasn't been doing that so much.  It would be
> > > > > >> great if they did because the HBase points on the graphs are old
> > > > > >> and it would be good to get new ones.
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
> > > > >>
> > > > > >>> Like I've said many times now, it's relative to your actual problem.
> > > > > >>> If you don't have that much data (or intend to grow into that much
> > > > > >>> data), it's not an issue. Obviously, this is the case for you.
> > > > > >>>
> > > > > >>> However, it is an architectural difference between the two projects
> > > > > >>> with known limitations for a single metadata region. It's a
> > > > > >>> difference, which is what Jerry asked for.
> > > > >>>
> > > > >>> Ted Malaska wrote:
> > > > >>>
> > > > > >>>> I've been doing HBase for a long time and never had an issue with
> > > > > >>>> region count limits, and I have clusters with 10s of billions of
> > > > > >>>> records. Maybe there would be issues around a couple trillion
> > > > > >>>> records, but I have never gotten that high yet.
> > > > >>>>
> > > > >>>> Ted Malaska
> > > > >>>>
> > > > > >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
> > > > >>>>
> > > > > >>>>> Oh, one other thing that I should mention (was prompted off-list).
> > > > > >>>>>
> > > > > >>>>> (definition time since cross-list now: HBase regions == Accumulo
> > > > > >>>>> tablets)
> > > > > >>>>>
> > > > > >>>>> Accumulo will handle many more regions than HBase does now due to a
> > > > > >>>>> splittable metadata table. While I was told this was a very long and
> > > > > >>>>> arduous journey to implement correctly (WRT splitting, merges and
> > > > > >>>>> bulk loading), users with "too many regions" problems are extremely
> > > > > >>>>> few and far between for Accumulo.
> > > > > >>>>>
> > > > > >>>>> I was very happy to see effort/design being put into this in HBase.
> > > > > >>>>> And, just to be fair in criticism/praises, HBase does appear to me to
> > > > > >>>>> do assignments of regions much faster than Accumulo does on a small
> > > > > >>>>> cluster (~5-10 nodes). Accumulo may take a few seconds to notice and
> > > > > >>>>> reassign tablets. I have yet to notice this with HBase (which also
> > > > > >>>>> could be due to lack of personal testing).
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> Jerry He wrote:
> > > > >>>>>
> > > > > >>>>>> Hi, folks
> > > > > >>>>>>
> > > > > >>>>>> We have people that are evaluating HBase vs Accumulo.
> > > > > >>>>>> Security is an important factor.
> > > > > >>>>>>
> > > > > >>>>>> But I think after the Cell security was added in HBase, there is no
> > > > > >>>>>> more real gap compared to Accumulo.
> > > > > >>>>>>
> > > > > >>>>>> I know we have both HBase and Accumulo experts on this list.
> > > > > >>>>>> Could someone shed more light?
> > > > > >>>>>> I am looking for any real gaps between HBase and Accumulo, so
> > > > > >>>>>> that I can be prepared to address them. This is not limited to the
> > > > > >>>>>> security area.
> > > > > >>>>>>
> > > > > >>>>>> There are differences in some features and implementations. But they
> > > > > >>>>>> don't seem like real 'gaps'.
> > > > > >>>>>>
> > > > > >>>>>> Any comments and feedback are welcome.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>>
> > > > >>>>>> Jerry
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > (via Tom White)
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>
