accumulo-commits mailing list archives

From ibe...@apache.org
Subject [accumulo-website] branch asf-site updated: Jekyll build from master:915b78b
Date Tue, 11 Jun 2019 21:07:35 GMT
This is an automated email from the ASF dual-hosted git repository.

ibella pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new cb67d19  Jekyll build from master:915b78b
cb67d19 is described below

commit cb67d19452608db6a47f2ad25f5bcbf06fe14af8
Author: Ivan Bella <ivan@bella.name>
AuthorDate: Tue Jun 11 17:06:55 2019 -0400

    Jekyll build from master:915b78b
    
    fixes #183: Added a chapter on yielding and fixed pseudocode (#185)
    
    * fixes #183: Added a chapter on yielding and fixed pseudocode
---
 docs/2.x/development/iterators.html | 46 +++++++++++++++++++++++++++++--------
 feed.xml                            |  4 ++--
 redirects.json                      |  2 +-
 search_data.json                    |  2 +-
 4 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/docs/2.x/development/iterators.html b/docs/2.x/development/iterators.html
index af17359..81d4bbe 100644
--- a/docs/2.x/development/iterators.html
+++ b/docs/2.x/development/iterators.html
@@ -424,7 +424,7 @@
 that allow users to implement custom retrieval or computational purpose within Accumulo TabletServers.
 The name rightly
 brings forward similarities to the Java Iterator interface; however, Accumulo Iterators are
more complex than Java
 Iterators. Notably, in addition to the expected methods to retrieve the current element and
advance to the next element
-in the iteration, Accumulo Iterators must also support the ability to “move” (<code
class="highlighter-rouge">seek</code>) to an specified point in the
+in the iteration, Accumulo Iterators must also support the ability to “move” (<code
class="highlighter-rouge">seek</code>) to a specified point in the
 iteration (the Accumulo table). Accumulo Iterators are designed to be concatenated together,
similar to applying a
 series of transformations to a list of elements. Accumulo Iterators can duplicate their underlying
source to create
 multiple “pointers” over the same underlying data (which is extremely powerful since
each stream is sorted) or they can
@@ -434,7 +434,7 @@ are not designed to act as triggers nor are they designed to operate outside
of
 
 <p>Understanding how TabletServers invoke the methods on a <a href="https://static.javadoc.io/org.apache.accumulo/accumulo-core/2.0.0-alpha-2/org/apache/accumulo/core/iterators/SortedKeyValueIterator.html">SortedKeyValueIterator</a>
can be obtuse as the actual code is
 buried within the implementation of the TabletServer; however, it is generally unnecessary
to have a strong
-understanding of this as the interface provides clear definitions about what each action
each method should take. This
+understanding of this as the interface provides clear definitions about what each method
should take. This
 chapter aims to provide a more detailed description of how Iterators are invoked, some best
practices and some common
 pitfalls.</p>
 
@@ -587,6 +587,24 @@ early programming assignments which implement their own tree data structures.
<c
 copy on its sources (the children), copies itself, attaches the copies of the children, and
 then returns itself.</p>
 
+<h2 id="yielding-interface">Yielding Interface</h2>
+
+<p>If you have implemented an iterator with a next or seek call that can take a very
long time
+resulting in starving out other scans within the same thread pool, try implementing the
+optional YieldingKeyValueIterator interface which SortedKeyValueIterator extends.</p>
+
+<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span
class="k">default</span> <span class="kt">void</span> <span class="nf">enableYielding</span><span
class="o">(</span><span class="n">YieldCallback</span> <span class="n">callback</span><span
class="o">)</span> <span class="o">{</span> <span class="o">}</span>
+</code></pre></div></div>
+
+<h3 id="enableyielding">enableYielding</h3>
+
+<p>The implementation of this method should simply cache the supplied callback as a
member of
+the iterator. Then one can call the yield(Key key) method on the callback within a next or
+seek call when the iterator is to yield control.  The supplied key will be used as the
+start key in a follow-on seek call’s range allowing the iterator to continue where it left
+off. Note when an iterator yields, the hasTop() method must return false.  Also note that
+the enableYielding method will not be called in isolation mode.</p>
+
 <h2 id="tabletserver-invocation-of-iterators">TabletServer invocation of Iterators</h2>
 
 <p>The following code is a general outline for how TabletServers invoke Iterators.</p>
@@ -603,21 +621,34 @@ then returns itself.</p>
         <span class="n">source</span> <span class="o">=</span> <span
class="n">iter</span><span class="o">;</span>
     <span class="o">}</span>
 
-    <span class="c1">// read a batch of data to return to client</span>
+    <span class="c1">// read a batch of data to return to client from</span>
     <span class="c1">// the last iterator, the "top"</span>
     <span class="n">SortedKeyValueIterator</span> <span class="n">topIter</span>
<span class="o">=</span> <span class="n">source</span><span class="o">;</span>
-    <span class="n">topIter</span><span class="o">.</span><span
class="na">seek</span><span class="o">(</span><span class="n">getRangeFromUser</span><span
class="o">(),</span> <span class="o">...)</span>
+
+    <span class="n">YieldCallback</span> <span class="n">cb</span>
<span class="o">=</span> <span class="k">new</span> <span class="n">YieldCallback</span><span
class="o">();</span>
+    <span class="n">topIter</span><span class="o">.</span><span
class="na">enableYielding</span><span class="o">(</span><span class="n">cb</span><span
class="o">)</span>
+
+    <span class="n">topIter</span><span class="o">.</span><span
class="na">seek</span><span class="o">(</span><span class="n">range</span><span
class="o">,</span> <span class="o">...)</span>
 
     <span class="k">while</span> <span class="o">(</span><span
class="n">topIter</span><span class="o">.</span><span class="na">hasTop</span><span
class="o">()</span> <span class="o">&amp;&amp;</span> <span
class="o">!</span><span class="n">overSizeLimit</span><span class="o">(</span><span
class="n">batch</span><span class="o">))</span> <span class="o">{</span>
         <span class="n">key</span> <span class="o">=</span> <span
class="n">topIter</span><span class="o">.</span><span class="na">getTopKey</span><span
class="o">()</span>
         <span class="n">val</span> <span class="o">=</span> <span
class="n">topIter</span><span class="o">.</span><span class="na">getTopValue</span><span
class="o">()</span>
         <span class="n">batch</span><span class="o">.</span><span
class="na">add</span><span class="o">(</span><span class="k">new</span>
<span class="n">KeyValue</span><span class="o">(</span><span class="n">key</span><span
class="o">,</span> <span class="n">val</span><span class="o">)</span>
+        <span class="c1">// remember the last key returned</span>
+        <span class="n">setLastKeyReturned</span><span class="o">(</span><span
class="n">key</span><span class="o">);</span>
         <span class="k">if</span> <span class="o">(</span><span
class="n">systemDataSourcesChanged</span><span class="o">())</span> <span
class="o">{</span>
             <span class="c1">// code does not show isolation case, which will</span>
             <span class="c1">// keep using same data sources until a row boundary is
hit</span>
             <span class="n">range</span> <span class="o">=</span>
<span class="k">new</span> <span class="n">Range</span><span class="o">(</span><span
class="n">key</span><span class="o">,</span> <span class="kc">false</span><span
class="o">,</span> <span class="n">range</span><span class="o">.</span><span
class="na">endKey</span><span class="o">(),</span> <span class="n">range</span><span
class="o">.</span><span class="na">endKeyInclusive</span><span class="o">());</span>
             <span class="k">break</span><span class="o">;</span>
         <span class="o">}</span>
+        <span class="n">topIter</span><span class="o">.</span><span
class="na">next</span><span class="o">()</span>
+    <span class="o">}</span>
+
+    <span class="k">if</span> <span class="o">(</span><span class="n">cb</span><span
class="o">.</span><span class="na">hasYielded</span><span class="o">())</span>
<span class="o">{</span>
+        <span class="c1">// remember the yield key as the last key returned</span>
+        <span class="n">setLastKeyReturned</span><span class="o">(</span><span
class="n">cb</span><span class="o">.</span><span class="na">getKey</span><span
class="o">());</span>
+        <span class="k">break</span><span class="o">;</span>
     <span class="o">}</span>
 <span class="o">}</span>
 <span class="c1">//return batch of key values to client</span>
@@ -628,15 +659,12 @@ then returns itself.</p>
 <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span
class="c1">// Given the above</span>
 <span class="n">List</span><span class="o">&lt;</span><span
class="n">KeyValue</span><span class="o">&gt;</span> <span class="n">batch</span>
<span class="o">=</span> <span class="n">getNextBatch</span><span
class="o">();</span>
 
-<span class="c1">// Store off lastKeyReturned for this client</span>
-<span class="n">lastKeyReturned</span> <span class="o">=</span> <span
class="n">batch</span><span class="o">.</span><span class="na">get</span><span
class="o">(</span><span class="n">batch</span><span class="o">.</span><span
class="na">size</span><span class="o">()</span> <span class="o">-</span>
<span class="mi">1</span><span class="o">).</span><span class="na">getKey</span><span
class="o">();</span>
-
 <span class="c1">// thread goes away (client stops asking for the next batch).</span>
 
 <span class="c1">// Eventually client comes back</span>
 <span class="c1">// Setup as before...</span>
-<span class="n">Range</span> <span class="n">userRange</span> <span
class="o">=</span> <span class="n">getRangeFromUser</span><span class="o">();</span>
-<span class="n">Range</span> <span class="n">actualRange</span> <span
class="o">=</span> <span class="k">new</span> <span class="n">Range</span><span
class="o">(</span><span class="n">lastKeyReturned</span><span class="o">,</span>
<span class="kc">false</span><span class="o">,</span> <span class="n">userRange</span><span
class="o">.</span><span class="na">getEndKey</span><span class="o">(),</span>
<span class="n">userRange</span><span class="o">.</span><span class="na">isEndKeyInclusive<
[...]
+<span class="n">Range</span> <span class="n">userRange</span> <span
class="o">=</span> <span class="n">getRangeFromClient</span><span
class="o">();</span>
+<span class="n">Range</span> <span class="n">actualRange</span> <span
class="o">=</span> <span class="k">new</span> <span class="n">Range</span><span
class="o">(</span><span class="n">getLastKeyReturned</span><span class="o">(),</span>
<span class="kc">false</span><span class="o">,</span> <span class="n">userRange</span><span
class="o">.</span><span class="na">getEndKey</span><span class="o">(),</span>
<span class="n">userRange</span><span class="o">.</span><span class="na">isEndKeyInclu
[...]
 
 <span class="c1">// Use the actualRange, not the user provided one</span>
 <span class="n">topIter</span><span class="o">.</span><span class="na">seek</span><span
class="o">(</span><span class="n">actualRange</span><span class="o">);</span>
diff --git a/feed.xml b/feed.xml
index 684e00f..33c67e7 100644
--- a/feed.xml
+++ b/feed.xml
@@ -6,8 +6,8 @@
 </description>
     <link>https://accumulo.apache.org/</link>
     <atom:link href="https://accumulo.apache.org/feed.xml" rel="self" type="application/rss+xml"/>
-    <pubDate>Tue, 11 Jun 2019 16:34:36 -0400</pubDate>
-    <lastBuildDate>Tue, 11 Jun 2019 16:34:36 -0400</lastBuildDate>
+    <pubDate>Tue, 11 Jun 2019 17:06:47 -0400</pubDate>
+    <lastBuildDate>Tue, 11 Jun 2019 17:06:47 -0400</lastBuildDate>
     <generator>Jekyll v3.8.5</generator>
     
     
diff --git a/redirects.json b/redirects.json
index 9c19363..2d3a54c 100644
--- a/redirects.json
+++ b/redirects.json
@@ -1 +1 @@
-{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
+{"/release_notes/1.5.1.html":"https://accumulo.apache.org/release/accumulo-1.5.1/","/release_notes/1.6.0.html":"https://accumulo.apache.org/release/accumulo-1.6.0/","/release_notes/1.6.1.html":"https://accumulo.apache.org/release/accumulo-1.6.1/","/release_notes/1.6.2.html":"https://accumulo.apache.org/release/accumulo-1.6.2/","/release_notes/1.7.0.html":"https://accumulo.apache.org/release/accumulo-1.7.0/","/release_notes/1.5.3.html":"https://accumulo.apache.org/release/accumulo-1.5.3/"
[...]
\ No newline at end of file
diff --git a/search_data.json b/search_data.json
index aee0418..10370fe 100644
--- a/search_data.json
+++ b/search_data.json
@@ -100,7 +100,7 @@
   
     "docs-2-x-development-iterators": {
       "title": "Iterators",
-      "content"	 : "Accumulo SortedKeyValueIterators, commonly referred to as Iterators for
short, are server-side programming constructsthat allow users to implement custom retrieval
or computational purpose within Accumulo TabletServers.  The name rightlybrings forward similarities
to the Java Iterator interface; however, Accumulo Iterators are more complex than JavaIterators.
Notably, in addition to the expected methods to retrieve the current element and advance to
the next elementin [...]
+      "content"	 : "Accumulo SortedKeyValueIterators, commonly referred to as Iterators for
short, are server-side programming constructsthat allow users to implement custom retrieval
or computational purpose within Accumulo TabletServers.  The name rightlybrings forward similarities
to the Java Iterator interface; however, Accumulo Iterators are more complex than JavaIterators.
Notably, in addition to the expected methods to retrieve the current element and advance to
the next elementin [...]
       "url": " /docs/2.x/development/iterators",
       "categories": "development"
     },
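
Editorial sketch (not part of the commit): the "Yielding Interface" text added above says an iterator should cache the callback passed to enableYielding, call yield(key) from inside a long-running next or seek, and report hasTop() as false after yielding, with the yield key becoming the exclusive start of the follow-on seek. The self-contained Java below imitates that contract with plain strings as keys. Everything in it is an assumption made for illustration: SimpleYieldCallback and BudgetedIterator are hypothetical stand-ins rather than Accumulo classes, the method names only mirror the pseudocode in the diff (yield, hasYielded, getKey), and the "budget" of keys examined per call is an invented way to decide when to yield.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class YieldSketch {

    /** Hypothetical stand-in for a yield callback; NOT the Accumulo class. */
    public static final class SimpleYieldCallback {
        private String key; // null means the iterator has not yielded

        public void yield(String key) { this.key = key; }
        public boolean hasYielded() { return key != null; }
        public String getKey() { return key; }
    }

    /** Toy filtering iterator over sorted keys; keeps keys ending in ":keep". */
    public static final class BudgetedIterator {
        private final List<String> sortedKeys;
        private final int budget;             // assumed limit: keys rejected per call before yielding
        private SimpleYieldCallback callback; // cached by enableYielding
        private int pos;
        private String top;

        public BudgetedIterator(List<String> sortedKeys, int budget) {
            this.sortedKeys = sortedKeys;
            this.budget = budget;
        }

        /** Simply cache the callback, as the documentation text recommends. */
        public void enableYielding(SimpleYieldCallback cb) { this.callback = cb; }

        /** Position after startExclusive (null means start from the first key). */
        public void seek(String startExclusive) {
            pos = 0;
            while (startExclusive != null && pos < sortedKeys.size()
                    && sortedKeys.get(pos).compareTo(startExclusive) <= 0) {
                pos++;
            }
            findTop();
        }

        public void next() { pos++; findTop(); }
        public boolean hasTop() { return top != null; }
        public String getTopKey() { return top; }

        /** Skip rejected keys; yield once the budget for this call is used up. */
        private void findTop() {
            top = null;
            int rejected = 0;
            while (pos < sortedKeys.size()) {
                String k = sortedKeys.get(pos);
                if (k.endsWith(":keep")) { top = k; return; }
                rejected++;
                if (callback != null && rejected >= budget) {
                    callback.yield(k); // k was already rejected, so resuming after it is safe
                    return;            // top stays null, so hasTop() is false after a yield
                }
                pos++;
            }
        }
    }

    /** Caller loop modeled loosely on the pseudocode in the diff; yieldCount[0] counts yields. */
    public static List<String> scan(List<String> keys, int budget, int[] yieldCount) {
        BudgetedIterator iter = new BudgetedIterator(keys, budget);
        List<String> results = new ArrayList<>();
        String lastKey = null;
        while (true) {
            SimpleYieldCallback cb = new SimpleYieldCallback();
            iter.enableYielding(cb);
            iter.seek(lastKey); // lastKey is the exclusive start of the follow-on seek
            while (iter.hasTop()) {
                lastKey = iter.getTopKey();
                results.add(lastKey);
                iter.next();
            }
            if (!cb.hasYielded()) {
                return results;
            }
            lastKey = cb.getKey(); // remember the yield key as the last key returned
            yieldCount[0]++;
        }
    }

    public static void main(String[] args) {
        int[] yields = new int[1];
        List<String> keys = Arrays.asList(
            "r1:keep", "r2:skip", "r3:skip", "r4:skip", "r5:keep", "r6:skip");
        System.out.println(scan(keys, 2, yields) + " after " + yields[0] + " yield(s)");
        // prints "[r1:keep, r5:keep] after 1 yield(s)"
    }
}
```

In this sketch the caller resumes immediately after a yield; a real server would presumably hand the thread to another scan first, which is the point of yielding. The iterator yields on a key it has already rejected, so restarting the range just after that key skips nothing that should have been returned.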

