hbase-dev mailing list archives

From Jerry He <jerry...@gmail.com>
Subject Re: HBase and Accumulo
Date Thu, 20 Aug 2015 01:05:04 GMT
I definitely agree HBase has a broader base.  Thanks, Ted.

Jerry

On Wed, Aug 19, 2015 at 4:42 PM, Ted Malaska <ted.malaska@cloudera.com>
wrote:

> I would say most banks are on HBase, but there are a few with Accumulo.  I have
> most banks, broker-dealers, and regulators in my region. Also, I think we are
> talking about the same foreign bank ;)
>
> Ted Malaska
> On Aug 19, 2015 7:15 PM, "Jerry He" <jerryjch@gmail.com> wrote:
>
> > Hi, folks
> >
> > Thanks so much for all the responses and comments.
> >
> > We don't have or support Accumulo yet. We support HBase. There have been
> > requests for Accumulo. Like Ted said, almost all are from the federal sector
> > and banks (even foreign banks).
> > They seem to have references or reference implementations for their use
> > cases.  My work of persuasion for HBase has not been very successful.
> >
> > I had looked into the HBase cell security. There may be some differences
> > and misses, as Sean mentioned. I think overall the visibility coverage
> > plus the ACL are great.
> >
> > Technology aside, Accumulo's reputation in the specific areas it is good
> > at is probably there.
> >
> > It will probably be a slow, evolving process ...
> >
> > Jerry
> >
> >
> >
> > On Wed, Aug 19, 2015 at 3:54 PM, Ted Malaska <ted.malaska@cloudera.com>
> > wrote:
> >
> > > I'm on the side of benchmarking for the use case and with an expert.
> > > There are so many ways to cheat a benchmark.  And the benchmark may not be
> > > anything like your use case.
> > > On Aug 19, 2015 5:43 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
> > >
> > > > I think someone who uses third party benchmarks to assess a system like
> > > > HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
> > > > perhaps we must agree to disagree.
> > > >
> > > >
> > > > On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner <kepner@ll.mit.edu> wrote:
> > > >
> > > > > I agree that performance on real apps is the most important thing for
> > > > > any particular organization, but as technologists how do we measure
> > > > > ourselves?
> > > > > Hence imperfect benchmarking remains our only recourse.
> > > > >
> > > > > On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> > > > > > > I can't speak for anyone other than myself in the HBase community,
> > > > > > > but I'm much more interested and focused on performance analysis and
> > > > > > > developing/deploying for the use cases of my employer than
> > > > > > > participating in generic bench-marketing to make weapons for happy
> > > > > > > OSS warriors. Perhaps this does a disservice to the HBase project
> > > > > > > overall and if so then I apologize to others on the project for that.
> > > > > >
> > > > > > > That said, from long and bitter experience let me state that the only
> > > > > > > benchmarks that ever really matter are the comparative benchmarks you
> > > > > > > make for your own use cases in your own environments, preferably
> > > > > > > exercising those candidates with real data and operating conditions.
> > > > > > > See: https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser <josh.elser@gmail.com> wrote:
> > > > > >
> > > > > > > > Alright, I have to ask... are you referring to the paper that cites
> > > > > > > > Accumulo performance without write-ahead logs enabled? I have some
> > > > > > > > serious reservations about the relevance of that paper to this
> > > > > > > > conversation and just want to make sure people aren't led astray
> > > > > > > > about what the actual takeaway should be.
> > > > > > >
> > > > > > > Jeremy Kepner wrote:
> > > > > > >
> > > > > > > >> A big difference between Accumulo and HBase is the published
> > > > > > > >> performance numbers.
> > > > > > > >> The Accumulo community has done a good job of continuing to publish
> > > > > > > >> up-to-date performance numbers in peer-reviewed venues which allow
> > > > > > > >> Accumulo to claim best in the world performance.
> > > > > > > >>
> > > > > > > >> The HBase community hasn't been doing that so much. It would be
> > > > > > > >> great if they did because the HBase points on the graphs are old
> > > > > > > >> and it would be good to get new ones.
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > > >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
> > > > > > >>
> > > > > > > >>> Like I've said many times now, it's relative to your actual problem.
> > > > > > > >>> If you don't have that much data (or intend to grow into that much
> > > > > > > >>> data), it's not an issue. Obviously, this is the case for you.
> > > > > > > >>>
> > > > > > > >>> However, it is an architectural difference between the two projects
> > > > > > > >>> with known limitations for a single metadata region. It's a
> > > > > > > >>> difference, which is what Jerry asked for.
> > > > > > >>>
> > > > > > >>> Ted Malaska wrote:
> > > > > > >>>
> > > > > > > >>>> I've been doing HBase for a long time and never had an issue with
> > > > > > > >>>> region count limits, and I have clusters with 10s of billions of
> > > > > > > >>>> records. Maybe there would be issues around a couple trillion
> > > > > > > >>>> records, but I have never gotten that high yet.
> > > > > > >>>>
> > > > > > >>>> Ted Malaska
> > > > > > >>>>
> > > > > > > >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
> > > > > > >>>>
> > > > > > > >>>>> Oh, one other thing that I should mention (was prompted off-list).
> > > > > > > >>>>>
> > > > > > > >>>>> (definition time since cross-list now: HBase regions == Accumulo
> > > > > > > >>>>> tablets)
> > > > > > > >>>>>
> > > > > > > >>>>> Accumulo will handle many more regions than HBase does now due to a
> > > > > > > >>>>> splittable metadata table. While I was told this was a very long and
> > > > > > > >>>>> arduous journey to implement correctly (WRT splitting, merges and
> > > > > > > >>>>> bulk loading), users with "too many regions" problems are extremely
> > > > > > > >>>>> few and far between for Accumulo.
> > > > > > > >>>>>
> > > > > > > >>>>> I was very happy to see effort/design being put into this in HBase.
> > > > > > > >>>>> And, just to be fair in criticism/praises, HBase does appear to me
> > > > > > > >>>>> to do assignments of regions much faster than Accumulo does on a
> > > > > > > >>>>> small cluster (~5-10 nodes). Accumulo may take a few seconds to
> > > > > > > >>>>> notice and reassign tablets. I have yet to notice this with HBase
> > > > > > > >>>>> (which also could be due to lack of personal testing).
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> Jerry He wrote:
> > > > > > >>>>>
> > > > > > > >>>>>> Hi, folks
> > > > > > > >>>>>>
> > > > > > > >>>>>> We have people that are evaluating HBase vs Accumulo.
> > > > > > > >>>>>> Security is an important factor.
> > > > > > > >>>>>>
> > > > > > > >>>>>> But I think after the Cell security was added in HBase, there is no
> > > > > > > >>>>>> more real gap compared to Accumulo.
> > > > > > > >>>>>>
> > > > > > > >>>>>> I know we have both HBase and Accumulo experts on this list.
> > > > > > > >>>>>> Could someone shed more light?
> > > > > > > >>>>>> I am looking for any real gaps in HBase compared to Accumulo so
> > > > > > > >>>>>> that I can be prepared to address them. This is not limited to the
> > > > > > > >>>>>> security area.
> > > > > > > >>>>>>
> > > > > > > >>>>>> There are differences in some features and implementations. But
> > > > > > > >>>>>> they don't seem like real 'gaps'.
> > > > > > > >>>>>>
> > > > > > > >>>>>> Any comments and feedback are welcome.
> > > > > > >>>>>>
> > > > > > >>>>>> Thanks,
> > > > > > >>>>>>
> > > > > > >>>>>> Jerry
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>>
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > >
> > > > > >    - Andy
> > > > > >
> > > > > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > > > > (via Tom White)
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > > (via Tom White)
> > > >
> > >
> >
>
