accumulo-commits mailing list archives

From build...@apache.org
Subject svn commit: r904956 - in /websites/staging/accumulo/trunk/content: ./ release_notes/1.6.0.html
Date Fri, 04 Apr 2014 21:08:52 GMT
Author: buildbot
Date: Fri Apr  4 21:08:52 2014
New Revision: 904956

Log:
Staging update by buildbot for accumulo

Modified:
    websites/staging/accumulo/trunk/content/   (props changed)
    websites/staging/accumulo/trunk/content/release_notes/1.6.0.html

Propchange: websites/staging/accumulo/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Apr  4 21:08:52 2014
@@ -1 +1 @@
-1584908
+1584912

Modified: websites/staging/accumulo/trunk/content/release_notes/1.6.0.html
==============================================================================
--- websites/staging/accumulo/trunk/content/release_notes/1.6.0.html (original)
+++ websites/staging/accumulo/trunk/content/release_notes/1.6.0.html Fri Apr  4 21:08:52 2014
@@ -114,15 +114,17 @@
 <p>One of the key elements of the Big Table design is use of the Log Structured Merge
Tree (LSMT) concept.  This entails sorting data in memory, writing out sorted files, and then
later merging multiple sorted files into a single file.   These automatic merges happen in
the background and Accumulo decides when to merge files based on comparing relative sizes of
files to a compaction ratio.  Adjusting the compaction ratio is the only way a user can control
this process.  <a href="https://issues.apache.org/jira/browse/ACCUMULO-1451" title="Make
Compaction triggers extensible">ACCUMULO-1451</a> introduces pluggable compaction
strategies which allow users to choose when and what files to compact.  <a href="https://issues.apache.org/jira/browse/ACCUMULO-1808"
title="Create compaction strategy that has size limit">ACCUMULO-1808</a> adds a compaction
strategy that prevents compaction of files over a configurable size.</p>
 <h3 id="lexicoders">Lexicoders</h3>
 <p>Accumulo only sorts data lexicographically.  Getting something like a pair of (<string>,<integer>)
to sort correctly in Accumulo is tricky.  It's tricky because you only want to compare the
integers if the strings are equal.  It's possible to make this sort properly in Accumulo if
the data is encoded properly, but that's the tricky part.  To make this easier <a href="https://issues.apache.org/jira/browse/ACCUMULO-1336"
title="Add lexicoders from Typo to Accumulo">ACCUMULO-1336</a> added Lexicoders to
the Accumulo API.  Lexicoders provide an easy way to serialize data so that it sorts properly
lexicographically.  Below is a simple example.</p>
-<blockquote>
-<p>PairLexicoder plex = new PairLexicoder(new StringLexicoder(), new IntegerLexicoder());
-byte[] ba1 = plex.encode(new ComparablePair<String, Integer>("b",1));
-byte[] ba2 = plex.encode(new ComparablePair<String, Integer>("aa",1));
-byte[] ba3 = plex.encode(new ComparablePair<String, Integer>("a",2));
-byte[] ba4 = plex.encode(new ComparablePair<String, Integer>("a",1)); 
-byte[] ba5 = plex.encode(new ComparablePair<String, Integer>("aa",-3));</p>
-<p>//sorting ba1,ba2,ba3,ba4, and ba5 lexicographically will result in the same order
as sorting the ComparablePairs</p>
-</blockquote>
+<div class="codehilite"><pre>   <span class="n">PairLexicoder</span>
<span class="n">plex</span> <span class="p">=</span> <span class="n">new</span>
<span class="n">PairLexicoder</span><span class="p">(</span><span
class="n">new</span> <span class="n">StringLexicoder</span><span class="p">(),</span>
<span class="n">new</span> <span class="n">IntegerLexicoder</span><span
class="p">());</span>
+   <span class="n">byte</span><span class="p">[]</span> <span
class="n">ba1</span> <span class="p">=</span> <span class="n">plex</span><span
class="p">.</span><span class="n">encode</span><span class="p">(</span><span
class="n">new</span> <span class="n">ComparablePair</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span
class="o">&gt;</span><span class="p">(</span>&quot;<span class="n">b</span>&quot;<span
class="p">,</span>1<span class="p">));</span>
+   <span class="n">byte</span><span class="p">[]</span> <span
class="n">ba2</span> <span class="p">=</span> <span class="n">plex</span><span
class="p">.</span><span class="n">encode</span><span class="p">(</span><span
class="n">new</span> <span class="n">ComparablePair</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span
class="o">&gt;</span><span class="p">(</span>&quot;<span class="n">aa</span>&quot;<span
class="p">,</span>1<span class="p">));</span>
+   <span class="n">byte</span><span class="p">[]</span> <span
class="n">ba3</span> <span class="p">=</span> <span class="n">plex</span><span
class="p">.</span><span class="n">encode</span><span class="p">(</span><span
class="n">new</span> <span class="n">ComparablePair</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span
class="o">&gt;</span><span class="p">(</span>&quot;<span class="n">a</span>&quot;<span
class="p">,</span>2<span class="p">));</span>
+   <span class="n">byte</span><span class="p">[]</span> <span
class="n">ba4</span> <span class="p">=</span> <span class="n">plex</span><span
class="p">.</span><span class="n">encode</span><span class="p">(</span><span
class="n">new</span> <span class="n">ComparablePair</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span
class="o">&gt;</span><span class="p">(</span>&quot;<span class="n">a</span>&quot;<span
class="p">,</span>1<span class="p">));</span> 
+   <span class="n">byte</span><span class="p">[]</span> <span
class="n">ba5</span> <span class="p">=</span> <span class="n">plex</span><span
class="p">.</span><span class="n">encode</span><span class="p">(</span><span
class="n">new</span> <span class="n">ComparablePair</span><span class="o">&lt;</span><span
class="n">String</span><span class="p">,</span> <span class="n">Integer</span><span
class="o">&gt;</span><span class="p">(</span>&quot;<span class="n">aa</span>&quot;<span
class="p">,</span><span class="o">-</span>3<span class="p">));</span>
+
+   <span class="o">//</span><span class="n">sorting</span> <span
class="n">ba1</span><span class="p">,</span><span class="n">ba2</span><span
class="p">,</span><span class="n">ba3</span><span class="p">,</span><span
class="n">ba4</span><span class="p">,</span> <span class="n">and</span>
<span class="n">ba5</span> <span class="n">lexicographically</span>
<span class="n">will</span> <span class="n">result</span> <span
class="n">in</span> <span class="n">the</span> <span class="n">same</span>
<span class="n">order</span> <span class="n">as</span> <span class="n">sorting</span>
<span class="n">the</span> <span class="n">ComparablePairs</span>
+</pre></div>
+
+
 <h3 id="multi-table-accumulo-input-format">Multi-table Accumulo input format</h3>
 <p><a href="https://issues.apache.org/jira/browse/ACCUMULO-391" title="Multi-table
input format">ACCUMULO-391</a> makes it possible to easily read from multiple tables
in a Map Reduce job.  TODO is there more to say about this, if not maybe move to one-liners.</p>
 <h3 id="locality-groups-in-memory">Locality groups in memory</h3>
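
Editor's illustration of the Lexicoders paragraph in the hunk above: a minimal, self-contained Java sketch of one way a (String, Integer) pair can be encoded so that plain byte-wise sorting matches pair ordering. It does not use the Accumulo API. The class and method names (PairEncodingSketch, encodePair, compareUnsigned) are hypothetical, and the encoding shown (escaped UTF-8 bytes, a 0x00 terminator, then a big-endian int with its sign bit flipped) is an assumption chosen for illustration, not necessarily what PairLexicoder does internally. It also assumes that comparing strings by their UTF-8 bytes is the desired string order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: these helpers are hypothetical and are NOT the Accumulo Lexicoder API.
final class PairEncodingSketch {

    // Encode (s, i) so that unsigned byte-wise comparison orders by s first, then by i.
    static byte[] encodePair(String s, int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (b == 0x00 || b == 0x01) {
                // Escape 0x00 -> 0x01 0x01 and 0x01 -> 0x01 0x02 so that a raw 0x00
                // can only ever mean "end of string".
                out.write(0x01);
                out.write(b + 1);
            } else {
                out.write(b);
            }
        }
        // Terminator: sorts below every other byte, so "a" comes before "aa".
        out.write(0x00);
        // Flipping the sign bit makes negative ints sort before positive ones
        // when the four big-endian bytes are compared as unsigned values.
        int flipped = i ^ 0x80000000;
        out.write(flipped >>> 24);
        out.write(flipped >>> 16);
        out.write(flipped >>> 8);
        out.write(flipped);
        return out.toByteArray();
    }

    // Lexicographic comparison treating each byte as an unsigned value.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            int diff = (a[k] & 0xff) - (b[k] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Same five pairs as the release-notes example.
        List<byte[]> encoded = new ArrayList<>();
        encoded.add(encodePair("b", 1));
        encoded.add(encodePair("aa", 1));
        encoded.add(encodePair("a", 2));
        encoded.add(encodePair("a", 1));
        encoded.add(encodePair("aa", -3));
        encoded.sort(PairEncodingSketch::compareUnsigned);
        for (byte[] e : encoded) {
            System.out.println(Arrays.toString(e));
        }
    }
}
```

The terminator byte is what keeps the integer from being compared unless the strings are equal: any difference between the strings is decided before the comparison reaches the integer bytes.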


