accumulo-user mailing list archives

From Josh Elser
Subject Re: Unexpected aliasing from RFile getTopValue()
Date Wed, 15 Apr 2015 15:52:33 GMT

Keith Turner wrote:
> On Wed, Apr 15, 2015 at 11:06 AM, Adam Fuchs wrote:
>     On Wed, Apr 15, 2015 at 10:20 AM, Keith Turner wrote:
>         Random thought on revamp.  Immutable key values with enough
>         primitives to make most operations efficient (avoid constant
>         alloc/copy) might be something to consider for the iterator API.
>     So, is this a tradeoff in the performance vs. inter-iterator
>     isolation space? From a performance perspective we would do best if
>     we just passed around pointers to an underlying byte array (e.g.
>     ByteBuffer-style), but maximum
> There are performance implications to consider when key/vals are not
> immutable.  Currently, if any iterator wants to keep a key/val to compare
> it to later key/vals, then it has to copy it. I think some iterators do
> this frequently.  I am not making the assertion that immutable would
> perform better; I don't know.
>     isolation would require never reusing anything returned from an
>     iterator's getTopX methods. From a security perspective we need to
>     be careful with how we reuse data objects (hence the need for the
>     SynchronizedIterator at the top of the "system" iterators), but I
>     would say we can probably relax other isolation concerns in the
>     iterators in favor of performance.
>     I think there's probably a bigger project here around minimizing the
>     object creation, data copying, serialization, and deserialization of
>     keys. We did some work that Chris McCubbin will be presenting at the
>     upcoming Accumulo Summit around pushing key comparisons down to a
>     serialized form of the key, and that made a huge impact on load
>     performance. I think we could probably achieve an order of magnitude
>     more throughput in the iterator tree with a major refactoring. Any
>     thoughts on when we might have the appetite for such a change? If
>     we're thinking about making key/values immutable then we might
>     piggyback a bigger redesign on that already breaking change.
> If we were to introduce an improved iterator API, I would hope we could
> deprecate and still support the old API.

Strong +1

>     Adam
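
The aliasing and copy cost discussed in this thread can be sketched in plain Java. This is only an illustration under stated assumptions and does not use the Accumulo API: ReusingSource, its next() and getTopValue() methods, and the collect() helper are hypothetical stand-ins for an iterator that hands out a reference to one internal buffer and overwrites that buffer on every advance. All values are assumed to have the same width.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AliasingSketch {

    /** Hypothetical source that reuses a single buffer for every value it returns. */
    static final class ReusingSource {
        private final byte[][] data;
        private final byte[] buffer; // shared; overwritten on each successful next()
        private int pos = -1;

        ReusingSource(byte[][] data, int width) {
            this.data = data;
            this.buffer = new byte[width];
        }

        /** Advances to the next value; returns false when the data is exhausted. */
        boolean next() {
            pos++;
            if (pos >= data.length) {
                return false;
            }
            System.arraycopy(data[pos], 0, buffer, 0, buffer.length);
            return true;
        }

        /** Returns the internal buffer itself, not a copy. */
        byte[] getTopValue() {
            return buffer;
        }
    }

    /** Keeps every value seen, either by reference (aliased) or as a defensive copy. */
    static List<byte[]> collect(byte[][] data, boolean copy) {
        ReusingSource src = new ReusingSource(data, data[0].length);
        List<byte[]> kept = new ArrayList<>();
        while (src.next()) {
            byte[] top = src.getTopValue();
            kept.add(copy ? Arrays.copyOf(top, top.length) : top);
        }
        return kept;
    }

    public static void main(String[] args) {
        byte[][] data = { {1, 1}, {2, 2}, {3, 3} };
        // Without a copy, every kept reference points at the same buffer.
        System.out.println(Arrays.toString(collect(data, false).get(0))); // [3, 3]
        // With a copy, the first value survives later calls to next().
        System.out.println(Arrays.toString(collect(data, true).get(0)));  // [1, 1]
    }
}
```

In this sketch the caller pays one allocation and copy per retained value, which is the cost Keith describes for iterators that hold on to a key/val. An immutable key/value design would presumably let the caller keep the reference instead, at the price of the source no longer being able to reuse its buffer.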
