accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Unexpected aliasing from RFile getTopValue()
Date Wed, 15 Apr 2015 15:41:32 GMT
On Wed, Apr 15, 2015 at 11:06 AM, Adam Fuchs <afuchs@apache.org> wrote:

> On Wed, Apr 15, 2015 at 10:20 AM, Keith Turner <keith@deenlo.com> wrote:
>>
>>
>> Random thought on revamp.  Immutable key values with enough primitives to
>> make most operations efficient (avoid constant alloc/copy) might be
>> something to consider for the iterator API
>>
>>
> So, is this a tradeoff in the performance vs. inter-iterator isolation
> space? From a performance perspective we would do best if we just passed
> around pointers to an underlying byte array (e.g. ByteBuffer-style), but
> maximum
>

There are performance implications to consider when key/vals are not
immutable.  Currently, if an iterator wants to keep a key/val to compare
against later key/vals, it has to copy it.  I think some iterators do this
frequently.  I am not asserting that immutable would perform better; I
don't know.
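
To make the copy requirement concrete, here is a minimal sketch in plain Java. MutableKey and ReusingSource are hypothetical stand-ins invented for illustration, not Accumulo classes; the assumption is only that a source may hand back the same mutable object from every getTopKey() call and overwrite it on next().

```java
public class AliasingSketch {
    // Hypothetical stand-in for a mutable key; NOT Accumulo's Key class.
    static final class MutableKey {
        private String row;
        MutableKey(String row) { this.row = row; }
        // Copy constructor: takes a private snapshot of the other key.
        MutableKey(MutableKey other) { this.row = other.row; }
        void set(String row) { this.row = row; }
        String row() { return row; }
    }

    // Hypothetical source that reuses ONE key object for every entry.
    // Assumes at least one row is supplied.
    static final class ReusingSource {
        private final String[] rows;
        private int pos = 0;
        private final MutableKey top = new MutableKey("");
        ReusingSource(String... rows) {
            this.rows = rows;
            top.set(rows[0]);
        }
        boolean hasTop() { return pos < rows.length; }
        MutableKey getTopKey() { return top; }
        void next() {
            pos++;
            if (hasTop()) {
                top.set(rows[pos]); // overwrites the object already handed out
            }
        }
    }

    // Counts how often the row differs from the previous entry's row.
    // With copy == false, the kept reference aliases the source's object.
    static int countRowChanges(ReusingSource src, boolean copy) {
        int changes = 0;
        MutableKey prev = null;
        while (src.hasTop()) {
            MutableKey cur = src.getTopKey();
            if (prev != null && !prev.row().equals(cur.row())) {
                changes++;
            }
            prev = copy ? new MutableKey(cur) : cur;
            src.next();
        }
        return changes;
    }
}
```

Without the copy, prev and cur are the same object, so the comparison never sees a difference. With the copy, the comparison works, at the cost of one allocation per entry, which is the overhead in question.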


> isolation would require never reusing anything returned from an iterator's
> getTopX methods. From a security perspective we need to be careful with how
> we reuse data objects (hence the need for the SynchronizedIterator at the
> top of the "system" iterators), but I would say we can probably relax other
> isolation concerns in the iterators in favor of performance.
>
> I think there's probably a bigger project here around minimizing the
> object creation, data copying, serialization, and deserialization of keys.
> We did some work that Chris McCubbin will be presenting at the upcoming
> accumulo summit around pushing key comparisons down to a serialized form of
> the key, and that made a huge impact on load performance. I think we could
> probably achieve an order of magnitude more throughput in the iterator tree
> with a major refactoring. Any thoughts on when we might have the appetite
> for such a change? If we're thinking about making key/values immutable then
> we might piggyback a bigger redesign on that already breaking change.
>
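
As a rough illustration of what comparing keys in serialized form can look like, the sketch below compares two byte ranges inside backing arrays lexicographically, treating bytes as unsigned, without allocating or deserializing anything. This is a generic technique and not the work to be presented at the summit; compareRanges and its parameters are hypothetical names, and a real key has several fields (row, column family, column qualifier, visibility, timestamp), so this covers a single field at most.

```java
public class SerializedCompareSketch {
    // Hypothetical helper: lexicographic comparison of a[aOff, aOff+aLen)
    // against b[bOff, bOff+bLen), with bytes treated as unsigned values.
    // No objects are created and no bytes are copied.
    static int compareRanges(byte[] a, int aOff, int aLen,
                             byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff; // mask to get the unsigned value
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return aLen - bLen;
    }
}
```

Both ranges may point into the same buffer, so two adjacent serialized fields can be compared in place.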

If we were to introduce an improved iterator API, I would hope we could
deprecate the old API while still supporting it.


>
> Adam
>
