accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: HBase and Accumulo
Date Wed, 19 Aug 2015 22:47:21 GMT
Ah right, I did forget about that paper. Thanks for clarifying.

Big +1 to Andy's comments, too.

Jeremy Kepner wrote:
> Turning off the walog was mostly to shorten the benchmarking cycle
> (it allowed us to go from zero to peak ingest in a few seconds).  BAH got
> pretty much the same performance results in their paper,
> it just took longer for their experiments to run.
> So, in this case, we had two different teams doing things different
> ways and getting the same result, which is what we like to see.
>
> On Wed, Aug 19, 2015 at 03:27:07PM -0400, Josh Elser wrote:
>> Alright, I have to ask... are you referring to the paper that cites
>> Accumulo performance without write-ahead logs enabled? I have some
>> serious reservations about the relevance of that paper to this
>> conversation and just want to make sure people aren't led astray by
>> what the actual takeaway should be.
>>
>> Jeremy Kepner wrote:
>>> A big difference between Accumulo and HBase is the published performance numbers.
>>> The Accumulo community has done a good job of continuing to publish up-to-date performance
>>> numbers in peer-reviewed venues which allow Accumulo to claim best in the world performance.
>>>
>>> The HBase community hasn't been doing that so much.  It would be great if they did because
>>> the HBase points on the graphs are old and it would be good to get new ones.
>>>
>>>
>>> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
>>>> Like I've said many times now, it's relative to your actual problem.
>>>> If you don't have that much data (or intend to grow into that much
>>>> data), it's not an issue. Obviously, this is the case for you.
>>>>
>>>> However, it is an architectural difference between the two projects
>>>> with known limitations for a single metadata region. It's a
>>>> difference, which is what Jerry asked for.
>>>>
>>>> Ted Malaska wrote:
>>>>> I've been doing HBase for a long time and never had an issue with region
>>>>> count limits and I have clusters with 10s of billions of records.  Maybe
>>>>> there would be issues around a couple trillion records, but I never got that
>>>>> high yet.
>>>>>
>>>>> Ted Malaska
>>>>>
>>>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser <josh.elser@gmail.com> wrote:
>>>>>
>>>>>> Oh, one other thing that I should mention (was prompted off-list).
>>>>>>
>>>>>> (definition time since cross-list now: HBase regions == Accumulo tablets)
>>>>>>
>>>>>> Accumulo will handle many more regions than HBase does now due to a
>>>>>> splittable metadata table. While I was told this was a very long and
>>>>>> arduous journey to implement correctly (WRT splitting, merges and bulk
>>>>>> loading), users with "too many regions" problems are extremely few and far
>>>>>> between for Accumulo.
>>>>>>
>>>>>> I was very happy to see effort/design being put into this in HBase. And,
>>>>>> just to be fair in criticism/praises, HBase does appear to me to do
>>>>>> assignments of regions much faster than Accumulo does on a small cluster
>>>>>> (~5-10 nodes). Accumulo may take a few seconds to notice and reassign
>>>>>> tablets. I have yet to notice this with HBase (which also could be due to
>>>>>> lack of personal testing).
>>>>>>
>>>>>>
>>>>>> Jerry He wrote:
>>>>>>
>>>>>>> Hi, folks
>>>>>>>
>>>>>>> We have people that are evaluating HBase vs Accumulo.
>>>>>>> Security is an important factor.
>>>>>>>
>>>>>>> But I think after the Cell security was added in HBase, there is no more
>>>>>>> real gap compared to Accumulo.
>>>>>>>
>>>>>>> I know we have both HBase and Accumulo experts on this list.
>>>>>>> Could someone shed more light?
>>>>>>> I am looking for any real gaps comparing HBase to Accumulo, so
>>>>>>> that I can be prepared to address them. This is not limited to the
>>>>>>> security area.
>>>>>>>
>>>>>>> There are differences in some features and implementations. But they don't
>>>>>>> seem like real 'gaps'.
>>>>>>>
>>>>>>> Any comments and feedback are welcome.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Jerry
>>>>>>>
>>>>>>>
