accumulo-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From els...@apache.org
Subject [1/3] accumulo git commit: ACCUMULO-3646 Update iterator doc on entries past seek()
Date Tue, 28 Apr 2015 03:11:33 GMT
Repository: accumulo
Updated Branches:
  refs/heads/1.7 66075c3ea -> 481b11922
  refs/heads/master cf0bcb3fb -> e9ad5345f


ACCUMULO-3646 Update iterator doc on entries past seek()

Add warning to SKVI javadoc and to iterator chapter in Accumulo manual.
Fix inconsistency with Javadoc in iterator chapter.
Add warning on aliasing of getTopKey()/getTopValue() to chapter.

Closes #33

Signed-off-by: Josh Elser <elserj@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/481b1192
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/481b1192
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/481b1192

Branch: refs/heads/1.7
Commit: 481b11922ff8aec26168a5afebcca5fe72c5ec4b
Parents: 66075c3
Author: Dylan Hutchison <dhutchis@mit.edu>
Authored: Sat Apr 25 20:02:04 2015 -0400
Committer: Josh Elser <elserj@apache.org>
Committed: Mon Apr 27 23:03:55 2015 -0400

----------------------------------------------------------------------
 .../core/iterators/SortedKeyValueIterator.java  |  8 ++++++
 .../main/asciidoc/chapters/iterator_design.txt  | 27 +++++++++++++++-----
 2 files changed, 29 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
----------------------------------------------------------------------
diff --git a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
index 232dc2a..f6d3170 100644
--- a/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
+++ b/core/src/main/java/org/apache/accumulo/core/iterators/SortedKeyValueIterator.java
@@ -106,6 +106,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V
extend
    * Returns top key. Can be called 0 or more times without affecting behavior of next()
or hasTop(). Note that in minor compaction scope and in non-full major
    * compaction scopes the iterator may see deletion entries. These entries should be preserved
by all iterators except ones that are strictly scan-time
    * iterators that will never be configured for the minc or majc scopes. Deletion entries
are only removed during full major compactions.
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopKey</tt>
when {@link #next()} is called, changing the data
+   * that the object references. Iterators that need to save an object returned by <tt>getTopKey</tt>
ought to copy the object's data into a new object
+   * in order to avoid aliasing bugs.
    *
    * @return <tt>K</tt>
    * @exception IllegalStateException
@@ -117,6 +121,10 @@ public interface SortedKeyValueIterator<K extends WritableComparable<?>,V
extend
 
   /**
    * Returns top value. Can be called 0 or more times without affecting behavior of next()
or hasTop().
+   * <p>
+   * For performance reasons, iterators reserve the right to reuse objects returned by <tt>getTopValue</tt>
when {@link #next()} is called, changing the
+   * underlying data that the object references. Iterators that need to save an object returned
by <tt>getTopValue</tt> ought to copy the object's data
+   * into a new object in order to avoid aliasing bugs.
    *
    * @return <tt>V</tt>
    * @exception IllegalStateException

http://git-wip-us.apache.org/repos/asf/accumulo/blob/481b1192/docs/src/main/asciidoc/chapters/iterator_design.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/iterator_design.txt b/docs/src/main/asciidoc/chapters/iterator_design.txt
index b4c1c69..02c9973 100644
--- a/docs/src/main/asciidoc/chapters/iterator_design.txt
+++ b/docs/src/main/asciidoc/chapters/iterator_design.txt
@@ -119,8 +119,9 @@ Range. For example, a regular expression Iterator would consume all records
whic
 pattern before returning from `seek`.
 
 It is important to retain the original Range passed to this method to know when this Iterator
should stop
-reading more Key-Value pairs. Ignoring this will typically not result in errors; however,
it will result
-in wasted time from unnecessary computation.
+reading more Key-Value pairs. Ignoring this typically does not affect scans from a Scanner,
but it
+will result in duplicate keys emitting from a BatchScan if the scanned table has more than
one tablet.
+Best practice is to never emit entries outside the seek range.
 
 ==== `next`
 
@@ -145,8 +146,21 @@ alter the internal state of the Iterator.
 
 These methods simply return the current Key-Value pair for this iterator. If `hasTop` returns
true,
 both of these methods should return non-null objects. If `hasTop` returns false, it is undefined
-what these methods should return. Multiple calls to these methods should not alter the state
-of the Iterator like `hasTop`.
+what these methods should return. Like `hasTop`, multiple calls to these methods should not
alter
+the state of the Iterator.
+
+Users should take caution when either
+
+1. caching the Key/Value from `getTopKey`/`getTopValue`, for use after calling `next` on
the source iterator.
+In this case, the cached Key/Value object is aliased to the reference returned by the source
iterator.
+Iterators may reuse the same Key/Value object in a `next` call for performance reasons, changing
the data
+that the cached Key/Value object references and resulting in a logic bug.
+2. modifying the Key/Value from `getTopKey`/`getTopValue`. If the source iterator reuses
data stored in the Key/Value,
+then the source iterator may use the modified data that the Key/Value references. This may/may
not result in a logic bug.
+
+In both cases, copying the Key/Value's data into a new object ensures iterator correctness.
If neither case applies,
+it is safe to not copy the Key/Value.  The general guideline is to be aware of who else may
use Key/Value objects
+returned from `getTopKey`/`getTopValue`.
 
 ==== `deepCopy`
 
@@ -154,8 +168,9 @@ The `deepCopy` method is similar to the `clone` method from the Java `Cloneable`
 Implementations of this method should return a new object of the same type as the Accumulo
Iterator
 instance it was called on. Any internal state from the instance `deepCopy` was called
 on should be carried over to the returned copy. The returned copy should be ready to have
-`seek` called on it. It is unspecified if `init` will be called on the original object or
the
-copy before or after `deepCopy` is called.
+`seek` called on it. The SortedKeyValueIterator interface guarantees that `init` will be
called on
+an iterator before `deepCopy` and that `init` will not be called on the iterator returned
by
+`deepCopy`.
 
 Typically, implementations of `deepCopy` call a copy-constructor which will initialize
 internal data structures. As with `seek`, it is common for the `IteratorEnvironment`


Mime
View raw message