accumulo-notifications mailing list archives

From "Dylan Hutchison (JIRA)" <>
Subject [jira] [Created] (ACCUMULO-3751) Iterator Redesign
Date Sun, 26 Apr 2015 02:31:39 GMT
Dylan Hutchison created ACCUMULO-3751:

             Summary: Iterator Redesign
                 Key: ACCUMULO-3751
             Project: Accumulo
          Issue Type: Wish
          Components: tserver
            Reporter: Dylan Hutchison
             Fix For: 2.0.0

Many Accumulo users have pointed out issues and areas for improvement in the iterator stack
built atop the [SortedKeyValueIterator|]
interface. This issue aims to gather thoughts and requirements on what a new iterator stack
should look like, ideally one that is backward compatible with the current stack.

h3. {{close}} method for iterators
See ACCUMULO-1280. Iterators do not have full lifecycle control, since a tablet server may
"tear down" an iterator after it returns from a {{seek}} or {{next}} call. Iterators that
start other threads, access external resources or perform some other action requiring cleanup
must either initialize and tear down those actions within a call to {{seek}} or {{next}},
which is usually prohibitively expensive, or they keep state anyway and "hope for the best,"
possibly by putting cleanup code in the {{finalize}} method, which is not guaranteed to be
called by the JVM but is a better option than nothing.

Current advice to iterator writers is to "not do" these kinds of operations that require stateful
cleanup.  Adding a {{close}}-like method that the tablet server guarantees will be called
(via try-finally) before an iterator is torn down for any reason would make these iterators
much more stable and easier to write.

[~billie.rinaldi] has suggested using the [Closeable|]
or [AutoCloseable|]
interface.  The tablet server could call {{close()}} on any SKVI that also implements AutoCloseable.
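
A minimal sketch of that idea, under stated assumptions: {{SimpleIterator}} is a hypothetical, stripped-down stand-in for SKVI (not the real Accumulo interface), and {{ScanHost}} is a hypothetical helper playing the tablet server's role. It only illustrates the try-finally guarantee; it is not how the tablet server is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SortedKeyValueIterator; not the real Accumulo interface.
interface SimpleIterator {
    boolean hasTop();
    String getTop();
    void next();
}

// An iterator that owns something needing cleanup, so it also implements AutoCloseable.
class ResourceIterator implements SimpleIterator, AutoCloseable {
    private final List<String> data;
    private int pos = 0;
    boolean closed = false;

    ResourceIterator(List<String> data) { this.data = data; }

    @Override public boolean hasTop() { return pos < data.size(); }
    @Override public String getTop() { return data.get(pos); }
    @Override public void next() { pos++; }

    // Release threads, connections, and similar resources here.
    @Override public void close() { closed = true; }
}

// Hypothetical host-side helper standing in for the tablet server.
final class ScanHost {
    private ScanHost() {}

    // Reads up to batchSize entries, then tears the iterator down.
    static List<String> readBatch(SimpleIterator iter, int batchSize) {
        List<String> batch = new ArrayList<>();
        try {
            while (iter.hasTop() && batch.size() < batchSize) {
                batch.add(iter.getTop());
                iter.next();
            }
            return batch;
        } finally {
            // Runs on normal return and on exceptions alike.
            closeQuietly(iter);
        }
    }

    // Calls close() only on iterators that opted in via AutoCloseable.
    private static void closeQuietly(Object iter) {
        if (iter instanceof AutoCloseable) {
            try {
                ((AutoCloseable) iter).close();
            } catch (Exception e) {
                System.err.println("iterator close failed: " + e);
            }
        }
    }
}
```

Because the check is an {{instanceof}} test, existing iterators that do not implement AutoCloseable would be unaffected.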

It would also be nice for the iterator to know _why_ it is being closed, e.g., because
* the scan/compact range on the current tablet finished
* the source is switching
* the scan batch finished, and we're waiting for the client to request more batches
* some interrupt occurred (?)
* to give CPU time or memory to other iterators for fairness (?)

Such a reason could be passed to the iterator in the same way that the tablet server has a
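
One possible shape for passing a reason, sketched under assumptions: the {{CloseReason}} enum, the {{ReasonAwareCloseable}} interface and the {{BufferingIterator}} class are all hypothetical names invented here, with the enum constants taken from the list above. The example iterator flushes buffered work unless it is closed because of an interrupt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reasons, mirroring the bullet list above.
enum CloseReason {
    RANGE_FINISHED,      // scan/compact range on the current tablet finished
    SOURCE_SWITCH,       // the source is switching
    BATCH_FINISHED,      // waiting for the client to request more batches
    INTERRUPTED,         // some interrupt occurred
    YIELD_FOR_FAIRNESS   // giving CPU time or memory to other iterators
}

// Hypothetical interface a host could look for on user iterators.
interface ReasonAwareCloseable {
    void close(CloseReason reason);
}

// Example iterator that buffers work destined for an external sink.
class BufferingIterator implements ReasonAwareCloseable {
    private final List<String> pending = new ArrayList<>();
    private final List<String> sink; // stand-in for an external resource

    BufferingIterator(List<String> sink) { this.sink = sink; }

    void buffer(String entry) { pending.add(entry); }

    @Override
    public void close(CloseReason reason) {
        // Flush on an orderly teardown; drop partial work on interrupt.
        if (reason != CloseReason.INTERRUPTED) {
            sink.addAll(pending);
        }
        pending.clear();
    }
}
```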

h3. Iterator Performance
Some have noticed that Accumulo is CPU-bound because it bottlenecks on the large numbers of
serialize/deserialize operations, object creations and data copying present in the iterator
stack. [~afuchs] made a bunch of changes to system iterators that increased performance significantly
in ACCUMULO-3079.

We may want to consider more fundamental changes, like putting the row, column family, qualifier,
visibility, delete marker and timestamp in a single byte[] with a field delimiter character,
rather than keeping them in separate byte[]s. This gives an added bonus of easy key comparisons.
Why not also store the Value with the Key rather than split the two into separate objects?
 Imagine a {{getTopEntry}} operation that returns a byte[] that holds all the components of
the Key and Value.
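
To make the single-buffer idea concrete, here is a rough sketch. Everything in it is an assumption made for illustration: the {{PackedEntry}} class, its method names, and the choice of 0x00 as the delimiter (which presumes fields never contain that byte; a real design would need escaping or length prefixes). It also ignores details such as Accumulo's descending timestamp order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical static helpers over a delimiter-packed entry.
final class PackedEntry {
    static final byte DELIM = 0x00; // assumed delimiter; fields must not contain it

    private PackedEntry() {}

    // Concatenates the fields into one buffer, separated by DELIM.
    static byte[] pack(byte[]... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) out.write(DELIM);
            out.write(fields[i], 0, fields[i].length);
        }
        return out.toByteArray();
    }

    // Offset of field number `index` inside packed, or -1 if there is no such field.
    static int fieldOffset(byte[] packed, int index) {
        int offset = 0;
        for (int seen = 0; seen < index; seen++) {
            while (offset < packed.length && packed[offset] != DELIM) offset++;
            if (offset == packed.length) return -1;
            offset++; // skip the delimiter
        }
        return offset;
    }

    // Length of the field that starts at the given offset.
    static int fieldLength(byte[] packed, int offset) {
        int end = offset;
        while (end < packed.length && packed[end] != DELIM) end++;
        return end - offset;
    }

    // Decodes one field as UTF-8 text; this is the only method here that copies.
    static String fieldAsString(byte[] packed, int index) {
        int offset = fieldOffset(packed, index);
        if (offset < 0) throw new IndexOutOfBoundsException("no field " + index);
        return new String(packed, offset, fieldLength(packed, offset), StandardCharsets.UTF_8);
    }

    // Whole-entry comparison: a single unsigned lexicographic pass over the bytes.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }
}
```

With the smallest byte value as the delimiter, comparing two packed buffers orders them by the first field, then the second, and so on, which is the "easy key comparison" benefit mentioned above.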

We should adopt a philosophy of "reuse/alias byte[] buffers as often as possible," copying
only when we need to save a copy. Imagine one extreme where we pass a single byte[] down the
iterator stack rather than a Key or Value wrapping scattered buffers. If we were to consider
changes along this route, we ought to create features that make it easy to grab and manipulate
data in the byte[] as easily as the Key and Value objects, perhaps through well-documented
static methods. This is critical for usability since users are used to object-oriented style
manipulations of Key and Value.
For backward compatibility, we could create an interface that extends SKVI and adds methods for
passing byte[] references directly, converting the byte[] to old Key/Value objects for iterators
that do not implement the interface extension.
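
One way such an extension might look, as a sketch only: {{LegacySource}} is a hypothetical stand-in for the object-based SKVI accessors, {{PackedSource}} is a hypothetical extension, and the single 0x00 separator between key and value is an assumption. Default methods derive the old-style accessors from the byte[], so iterators written against the old interface keep working.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the existing object-based accessors; not the real SKVI.
interface LegacySource {
    String getTopKey();
    String getTopValue();
}

// Hypothetical extension: implementers supply one packed buffer,
// and the legacy accessors are derived from it by default.
interface PackedSource extends LegacySource {
    byte DELIM = 0x00; // assumed separator between key bytes and value bytes

    byte[] getTopEntry();

    @Override
    default String getTopKey() {
        byte[] entry = getTopEntry();
        return new String(entry, 0, splitIndex(entry), StandardCharsets.UTF_8);
    }

    @Override
    default String getTopValue() {
        byte[] entry = getTopEntry();
        int split = splitIndex(entry);
        return new String(entry, split + 1, entry.length - split - 1, StandardCharsets.UTF_8);
    }

    // Position of the first delimiter in the entry.
    static int splitIndex(byte[] entry) {
        for (int i = 0; i < entry.length; i++) {
            if (entry[i] == DELIM) return i;
        }
        throw new IllegalArgumentException("entry has no delimiter");
    }
}

final class StackDemo {
    private StackDemo() {}

    // Old-style consumer: only knows the legacy accessors.
    static String describeLegacy(LegacySource src) {
        return src.getTopKey() + "=" + src.getTopValue();
    }

    // New-style consumer: uses the raw buffer when the source offers it.
    static int entrySize(LegacySource src) {
        if (src instanceof PackedSource) {
            return ((PackedSource) src).getTopEntry().length;
        }
        return src.getTopKey().getBytes(StandardCharsets.UTF_8).length + 1
                + src.getTopValue().getBytes(StandardCharsets.UTF_8).length;
    }
}
```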

h3. State-save/restore for iterators
The information an iterator currently has for building up the state it needs is (1) the {{Map<String,String>}}
options passed to init, (2) the [IteratorEnvironment|]
passed to init, and (3) the range passed to seek. When an iterator is torn down, the seek
call after it is next reconstructed has a start key equal to the last key returned (non-inclusive).

There are a few ways we could imagine letting iterators save their state. 
# We could allow an iterator to add to or even modify the {{Map<String,String>}} options
passed to init. This should be sufficient for most iterators to save their state and reload.
# ACCUMULO-625 proposes a kind of "state cookie" that is emitted from an iterator (maybe as
the return value of a {{close}} method) and is sent back to the client, so that a client could
re-start a scan with this cookie and re-create the iterator in that state at the tablet server.
This seems more powerful but perhaps harder to engineer than #1.
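
Option #1 could look roughly like the following. This is a sketch under assumptions: the {{CountingIterator}} class, its {{saveState}} method and the {{count.soFar}} option name are hypothetical, and it presumes the host hands the same (modified) options map back to {{init}} when the iterator is reconstructed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical iterator that counts entries and survives teardown via its options.
class CountingIterator {
    static final String COUNT_OPT = "count.soFar"; // hypothetical option name

    private long count;

    // Restores any previously saved count; starts from zero otherwise.
    void init(Map<String, String> options) {
        count = Long.parseLong(options.getOrDefault(COUNT_OPT, "0"));
    }

    void consume(String entry) { count++; }

    long getCount() { return count; }

    // Called before teardown: writes state back into the options map.
    void saveState(Map<String, String> options) {
        options.put(COUNT_OPT, Long.toString(count));
    }
}

final class StateDemo {
    private StateDemo() {}

    // Simulates a teardown between two batches and returns the final count.
    static long runWithTeardown() {
        Map<String, String> options = new HashMap<>();

        CountingIterator first = new CountingIterator();
        first.init(options);
        first.consume("a");
        first.consume("b");
        first.consume("c");
        first.saveState(options); // iterator is torn down after this

        CountingIterator second = new CountingIterator();
        second.init(options);     // reconstructed with the saved state
        second.consume("d");
        second.consume("e");
        return second.getCount();
    }
}
```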

h3. Iterator safety
[~jstoneham] put forward the idea of encapsulating user iterators in a security manager in

[~kturner] had an idea for running iterators in separate processes, and then suggested using
tablet server rolling restarts to handle failing iterators.

[~elserj] thought about giving long-running iterators the ability to stop their processing
when their scan thread is interrupted but before the iterator returns in ACCUMULO-3348. Similar
to the AutoCloseable suggestion above, we may realize this by checking whether iterators implement
and calling their {{setInterruptFlag}} method when they need interruption.
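
A sketch of that interrupt check, with assumptions labeled: {{InterruptAware}} is a hypothetical placeholder interface (the interface name is missing from the text above), the {{AtomicBoolean}} parameter of {{setInterruptFlag}} is assumed, and {{SummingIterator}} and {{InterruptHost}} are invented for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical placeholder for the interface the host would check for.
interface InterruptAware {
    void setInterruptFlag(AtomicBoolean flag);
}

// Example long-running iterator that polls the flag between units of work.
class SummingIterator implements InterruptAware {
    private AtomicBoolean interrupted = new AtomicBoolean(false);

    @Override
    public void setInterruptFlag(AtomicBoolean flag) { this.interrupted = flag; }

    // Sums the values, stopping early if the shared flag is raised.
    long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            if (interrupted.get()) {
                throw new IllegalStateException("scan interrupted");
            }
            total += v;
        }
        return total;
    }
}

// Hypothetical host-side wiring, analogous to the AutoCloseable check.
final class InterruptHost {
    private InterruptHost() {}

    // Hands the shared flag only to iterators that opted in.
    static void wire(Object iter, AtomicBoolean flag) {
        if (iter instanceof InterruptAware) {
            ((InterruptAware) iter).setInterruptFlag(flag);
        }
    }
}
```

The host keeps a reference to the same {{AtomicBoolean}} and sets it when the scan thread is interrupted; the iterator notices on its next poll.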
