accumulo-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bil...@apache.org
Subject svn commit: r1208095 - in /incubator/accumulo/branches/1.4: docs/examples/ src/core/src/main/java/org/apache/accumulo/core/iterators/ src/core/src/main/java/org/apache/accumulo/core/iterators/user/ src/core/src/test/java/org/apache/accumulo/core/iterat...
Date Tue, 29 Nov 2011 21:54:24 GMT
Author: billie
Date: Tue Nov 29 21:54:22 2011
New Revision: 1208095

URL: http://svn.apache.org/viewvc?rev=1208095&view=rev
Log:
ACCUMULO-167 additional work on intersecting iterators and combiners

Added:
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IndexedDocIterator.java
      - copied, changed from r1207970, incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IndexedDocIteratorTest.java
      - copied, changed from r1208023, incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/FamilyIntersectingIteratorTest.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest.java
      - copied, changed from r1208023, incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/IntersectingIteratorTest.java
Removed:
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/FamilyIntersectingIteratorTest.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/IntersectingIteratorTest.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest2.java
    incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/FileCountMR.java
    incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/StringArraySummation.java
Modified:
    incubator/accumulo/branches/1.4/docs/examples/README.dirlist
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java
    incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java
    incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/VersioningIteratorTest.java
    incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java
    incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java
    incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/shard/Query.java
    incubator/accumulo/branches/1.4/src/examples/src/test/java/org/apache/accumulo/examples/dirlist/CountTest.java

Modified: incubator/accumulo/branches/1.4/docs/examples/README.dirlist
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/docs/examples/README.dirlist?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/docs/examples/README.dirlist (original)
+++ incubator/accumulo/branches/1.4/docs/examples/README.dirlist Tue Nov 29 21:54:22 2011
@@ -24,12 +24,10 @@ This example stores filesystem informati
 
 This example shows how to use Accumulo to store a file system history.  It has the following classes:
 
- * Ingest.java - Recursively lists the files and directories under a given path, ingests their names and file info (not the file data!) into a Accumulo table, and indexes the file names in a separate table.
+ * Ingest.java - Recursively lists the files and directories under a given path, ingests their names and file info (not the file data!) into an Accumulo table, and indexes the file names in a separate table.
  * QueryUtil.java - Provides utility methods for getting the info for a file, listing the contents of a directory, and performing single wild card searches on file or directory names.
  * Viewer.java - Provides a GUI for browsing the file system information stored in Accumulo.
- * FileCountMR.java - Runs MR over the file system information and writes out recursive counts to a Accumulo table.
- * FileCount.java - Accomplishes the same thing as FileCountMR, but in a different way.  Computes recursive counts and stores them back into table.
- * StringArraySummation.java - Aggregates counts for the FileCountMR reducer.
+ * FileCount.java - Computes recursive counts over file system information and stores them back into the same Accumulo table.
  
 To begin, ingest some data with Ingest.java.
 
@@ -56,12 +54,10 @@ To perform searches on file or directory
     $ ./bin/accumulo org.apache.accumulo.examples.dirlist.QueryUtil instance zookeepers username password indexTable exampleVis '*jar' -search
     $ ./bin/accumulo org.apache.accumulo.examples.dirlist.QueryUtil instance zookeepers username password indexTable exampleVis filename*jar -search
 
-To count the number of direct children (directories and files) and descendants (children and children's descendents, directories and files), run the FileCountMR over the dirTable table.
-The results can be written back to the same table.
+To count the number of direct children (directories and files) and descendants (children and children's descendents, directories and files), run the FileCount over the dirTable table.
+The results are written back to the same table.
 
-    $ ./bin/tool.sh lib/accumulo-examples-*[^c].jar org.apache.accumulo.examples.dirlist.FileCountMR instance zookeepers username password dirTable dirTable exampleVis exampleVis
-
-Alternatively, you can run FileCount.java which performs the same counts but is not a MapReduce.  FileCount will be faster for small data sets.
+    $ ./bin/accumulo org.apache.accumulo.examples.dirlist.FileCount instance zookeepers username password dirTable exampleVis exampleVis
 
 ## Directory Table
 

Modified: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java Tue Nov 29 21:54:22 2011
@@ -1,4 +1,4 @@
-/*
+/**
  * Licensed to the Apache Software Foundation (ASF) under one or more
  * contributor license agreements.  See the NOTICE file distributed with
  * this work for additional information regarding copyright ownership.
@@ -16,160 +16,14 @@
  */
 package org.apache.accumulo.core.iterators;
 
-import java.io.IOException;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Map;
-import java.util.Set;
-
-import org.apache.accumulo.core.data.ArrayByteSequence;
-import org.apache.accumulo.core.data.ByteSequence;
-import org.apache.accumulo.core.data.Key;
-import org.apache.accumulo.core.data.PartialKey;
-import org.apache.accumulo.core.data.Range;
-import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.core.iterators.user.IntersectingIterator;
-import org.apache.hadoop.io.Text;
+import org.apache.accumulo.core.iterators.user.IndexedDocIterator;
 
 /**
- * This iterator facilitates document-partitioned indexing. It is an example of extending the IntersectingIterator to customize the placement of the term and
- * docID. It expects a table structure of the following form:
- * 
- * row: shardID, colfam: docColf\0type, colqual: docID, value: doc
- * 
- * row: shardID, colfam: indexColf, colqual: term\0type\0docID\0info, value: (empty)
- * 
- * When you configure this iterator with a set of terms, it will return only the docIDs and docs that appear with all of the specified terms. The result will
- * have the following form:
+ * This class remains here for backwards compatibility.
  * 
- * row: shardID, colfam: indexColf, colqual: type\0docID\0info, value: doc
- * 
- * This iterator is commonly used with BatchScanner or AccumuloInputFormat, to parallelize the search over all shardIDs.
+ * @deprecated since 1.4
+ * @see org.apache.accumulo.core.iterators.user.IndexedDocIterator
  */
-public class FamilyIntersectingIterator extends IntersectingIterator {
-  public static final Text DEFAULT_INDEX_COLF = new Text("i");
-  public static final Text DEFAULT_DOC_COLF = new Text("e");
-  
-  public static final String indexFamilyOptionName = "indexFamily";
-  public static final String docFamilyOptionName = "docFamily";
-  
-  private static Text indexColf = DEFAULT_INDEX_COLF;
-  private static Text docColf = DEFAULT_DOC_COLF;
-  private static Set<ByteSequence> indexColfSet;
-  private static Set<ByteSequence> docColfSet;
-  
-  private static final byte[] nullByte = {0};
-  
-  public SortedKeyValueIterator<Key,Value> docSource;
-  
-  @Override
-  protected Key buildKey(Text partition, Text term, Text docID) {
-    Text colq = new Text(term);
-    colq.append(nullByte, 0, 1);
-    colq.append(docID.getBytes(), 0, docID.getLength());
-    colq.append(nullByte, 0, 1);
-    return new Key(partition, indexColf, colq);
-  }
-  
-  @Override
-  protected Key buildKey(Text partition, Text term) {
-    Text colq = new Text(term);
-    return new Key(partition, indexColf, colq);
-  }
-  
-  @Override
-  protected Text getDocID(Key key) {
-    Text colq = key.getColumnQualifier();
-    int firstZeroIndex = colq.find("\0");
-    if (firstZeroIndex < 0) {
-      throw new IllegalArgumentException("bad docid: " + key.toString());
-    }
-    int secondZeroIndex = colq.find("\0", firstZeroIndex + 1);
-    if (secondZeroIndex < 0) {
-      throw new IllegalArgumentException("bad docid: " + key.toString());
-    }
-    int thirdZeroIndex = colq.find("\0", secondZeroIndex + 1);
-    if (thirdZeroIndex < 0) {
-      throw new IllegalArgumentException("bad docid: " + key.toString());
-    }
-    Text docID = new Text();
-    try {
-      docID.set(colq.getBytes(), firstZeroIndex + 1, thirdZeroIndex - 1 - firstZeroIndex);
-    } catch (ArrayIndexOutOfBoundsException e) {
-      throw new IllegalArgumentException("bad indices for docid: " + key.toString() + " " + firstZeroIndex + " " + secondZeroIndex + " " + thirdZeroIndex);
-    }
-    return docID;
-  }
-  
-  @Override
-  protected Text getTerm(Key key) {
-    if (indexColf.compareTo(key.getColumnFamily().getBytes(), 0, indexColf.getLength()) < 0) {
-      // We're past the index column family, so return a term that will sort lexicographically last.
-      // The last unicode character should suffice
-      return new Text("\uFFFD");
-    }
-    Text colq = key.getColumnQualifier();
-    int zeroIndex = colq.find("\0");
-    Text term = new Text();
-    term.set(colq.getBytes(), 0, zeroIndex);
-    return term;
-  }
-  
-  @Override
-  public void init(SortedKeyValueIterator<Key,Value> source, Map<String,String> options, IteratorEnvironment env) throws IOException {
-    super.init(source, options, env);
-    if (options.containsKey(indexFamilyOptionName))
-      indexColf = new Text(options.get(indexFamilyOptionName));
-    if (options.containsKey(docFamilyOptionName))
-      docColf = new Text(options.get(docFamilyOptionName));
-    docSource = source.deepCopy(env);
-    indexColfSet = Collections.singleton((ByteSequence) new ArrayByteSequence(indexColf.getBytes(), 0, indexColf.getLength()));
-  }
-  
-  @Override
-  public SortedKeyValueIterator<Key,Value> deepCopy(IteratorEnvironment env) {
-    throw new UnsupportedOperationException();
-  }
-  
-  @Override
-  public void seek(Range range, Collection<ByteSequence> seekColumnFamilies, boolean inclusive) throws IOException {
-    super.seek(range, indexColfSet, true);
-    
-  }
-  
-  @Override
-  protected void advanceToIntersection() throws IOException {
-    super.advanceToIntersection();
-    if (topKey == null)
-      return;
-    if (log.isTraceEnabled())
-      log.trace("using top key to seek for doc: " + topKey.toString());
-    Key docKey = buildDocKey();
-    docSource.seek(new Range(docKey, true, null, false), docColfSet, true);
-    log.debug("got doc key: " + docSource.getTopKey().toString());
-    if (docSource.hasTop() && docKey.compareTo(docSource.getTopKey(), PartialKey.ROW_COLFAM_COLQUAL) == 0) {
-      value = docSource.getTopValue();
-    }
-    log.debug("got doc value: " + value.toString());
-  }
+public class FamilyIntersectingIterator extends IndexedDocIterator {
   
-  protected Key buildDocKey() {
-    if (log.isTraceEnabled())
-      log.trace("building doc key for " + currentPartition + " " + currentDocID);
-    int zeroIndex = currentDocID.find("\0");
-    if (zeroIndex < 0)
-      throw new IllegalArgumentException("bad current docID");
-    Text colf = new Text(docColf);
-    colf.append(nullByte, 0, 1);
-    colf.append(currentDocID.getBytes(), 0, zeroIndex);
-    docColfSet = Collections.singleton((ByteSequence) new ArrayByteSequence(colf.getBytes(), 0, colf.getLength()));
-    if (log.isTraceEnabled())
-      log.trace(zeroIndex + " " + currentDocID.getLength());
-    Text colq = new Text();
-    colq.set(currentDocID.getBytes(), zeroIndex + 1, currentDocID.getLength() - zeroIndex - 1);
-    Key k = new Key(currentPartition, colf, colq);
-    if (log.isTraceEnabled())
-      log.trace("built doc key for seek: " + k.toString());
-    return k;
-  }
 }

Modified: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java Tue Nov 29 21:54:22 2011
@@ -40,6 +40,10 @@ import org.apache.hadoop.io.WritableUtil
  * VARNUM, LONG, and STRING which indicate the VarNumEncoder, LongEncoder, and StringEncoder respectively.
  */
 public abstract class LongCombiner extends TypedValueCombiner<Long> {
+  public static final Encoder<Long> FIXED_LEN_ENCODER = new FixedLenEncoder();
+  public static final Encoder<Long> VAR_LEN_ENCODER = new VarLenEncoder();
+  public static final Encoder<Long> STRING_ENCODER = new StringEncoder();
+  
   protected static final String TYPE = "type";
   protected static final String CLASS_PREFIX = "class:";
   
@@ -73,17 +77,16 @@ public abstract class LongCombiner exten
       } catch (IllegalAccessException e) {
         throw new IllegalArgumentException(e);
       }
-    }
-    else {
+    } else {
       switch (Type.valueOf(type)) {
         case VARNUM:
-          encoder = new VarNumEncoder();
+          encoder = VAR_LEN_ENCODER;
           return;
         case LONG:
-          encoder = new LongEncoder();
+          encoder = FIXED_LEN_ENCODER;
           return;
         case STRING:
-          encoder = new StringEncoder();
+          encoder = STRING_ENCODER;
           return;
         default:
           throw new IllegalArgumentException();
@@ -110,7 +113,7 @@ public abstract class LongCombiner exten
   /**
    * An Encoder that uses a variable-length encoding for Longs. It uses WritableUtils.writeVLong and WritableUtils.readVLong for encoding and decoding.
    */
-  public static class VarNumEncoder implements Encoder<Long> {
+  public static class VarLenEncoder implements Encoder<Long> {
     @Override
     public byte[] encode(Long v) {
       ByteArrayOutputStream baos = new ByteArrayOutputStream();
@@ -139,7 +142,7 @@ public abstract class LongCombiner exten
   /**
    * An Encoder that uses an 8-byte encoding for Longs.
    */
-  public static class LongEncoder implements Encoder<Long> {
+  public static class FixedLenEncoder implements Encoder<Long> {
     @Override
     public byte[] encode(Long l) {
       byte[] b = new byte[8];

Copied: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IndexedDocIterator.java (from r1207970, incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java)
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IndexedDocIterator.java?p2=incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IndexedDocIterator.java&p1=incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java&r1=1207970&r2=1208095&rev=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/FamilyIntersectingIterator.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IndexedDocIterator.java Tue Nov 29 21:54:22 2011
@@ -14,7 +14,7 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.accumulo.core.iterators;
+package org.apache.accumulo.core.iterators.user;
 
 import java.io.IOException;
 import java.util.Collection;
@@ -22,36 +22,42 @@ import java.util.Collections;
 import java.util.Map;
 import java.util.Set;
 
+import org.apache.accumulo.core.client.IteratorSetting;
 import org.apache.accumulo.core.data.ArrayByteSequence;
 import org.apache.accumulo.core.data.ByteSequence;
 import org.apache.accumulo.core.data.Key;
 import org.apache.accumulo.core.data.PartialKey;
 import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.core.iterators.user.IntersectingIterator;
+import org.apache.accumulo.core.iterators.IteratorEnvironment;
+import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
 import org.apache.hadoop.io.Text;
 
 /**
  * This iterator facilitates document-partitioned indexing. It is an example of extending the IntersectingIterator to customize the placement of the term and
- * docID. It expects a table structure of the following form:
+ * docID. As with the IntersectingIterator, documents are grouped together and indexed into a single row of an Accumulo table. This allows a tablet server to
+ * perform boolean AND operations on terms in the index. This iterator also stores the document contents in a separate column family in the same row so that the
+ * full document can be returned with each query.
  * 
- * row: shardID, colfam: docColf\0type, colqual: docID, value: doc
+ * The table structure should have the following form:
  * 
- * row: shardID, colfam: indexColf, colqual: term\0type\0docID\0info, value: (empty)
+ * row: shardID, colfam: docColf\0doctype, colqual: docID, value: doc
+ * 
+ * row: shardID, colfam: indexColf, colqual: term\0doctype\0docID\0info, value: (empty)
  * 
  * When you configure this iterator with a set of terms, it will return only the docIDs and docs that appear with all of the specified terms. The result will
  * have the following form:
  * 
- * row: shardID, colfam: indexColf, colqual: type\0docID\0info, value: doc
+ * row: shardID, colfam: indexColf, colqual: doctype\0docID\0info, value: doc
  * 
  * This iterator is commonly used with BatchScanner or AccumuloInputFormat, to parallelize the search over all shardIDs.
  */
-public class FamilyIntersectingIterator extends IntersectingIterator {
+public class IndexedDocIterator extends IntersectingIterator {
   public static final Text DEFAULT_INDEX_COLF = new Text("i");
   public static final Text DEFAULT_DOC_COLF = new Text("e");
   
-  public static final String indexFamilyOptionName = "indexFamily";
-  public static final String docFamilyOptionName = "docFamily";
+  private static final String indexFamilyOptionName = "indexFamily";
+  private static final String docFamilyOptionName = "docFamily";
   
   private static Text indexColf = DEFAULT_INDEX_COLF;
   private static Text docColf = DEFAULT_DOC_COLF;
@@ -79,6 +85,10 @@ public class FamilyIntersectingIterator 
   
   @Override
   protected Text getDocID(Key key) {
+    return parseDocID(key);
+  }
+  
+  public static Text parseDocID(Key key) {
     Text colq = key.getColumnQualifier();
     int firstZeroIndex = colq.find("\0");
     if (firstZeroIndex < 0) {
@@ -172,4 +182,43 @@ public class FamilyIntersectingIterator 
       log.trace("built doc key for seek: " + k.toString());
     return k;
   }
+  
+  /**
+   * A convenience method for setting the index column family.
+   * 
+   * @param is
+   *          IteratorSetting object to configure.
+   * @param indexColf
+   *          the index column family
+   */
+  public static void setIndexColf(IteratorSetting is, String indexColf) {
+    is.addOption(indexFamilyOptionName, indexColf);
+  }
+  
+  /**
+   * A convenience method for setting the document column family prefix.
+   * 
+   * @param is
+   *          IteratorSetting object to configure.
+   * @param docColfPrefix
+   *          the prefix of the document column family (colf will be of the form docColfPrefix\0doctype)
+   */
+  public static void setDocColfPrefix(IteratorSetting is, String docColfPrefix) {
+    is.addOption(docFamilyOptionName, docColfPrefix);
+  }
+  
+  /**
+   * A convenience method for setting the index column family and document column family prefix.
+   * 
+   * @param is
+   *          IteratorSetting object to configure.
+   * @param indexColf
+   *          the index column family
+   * @param docColfPrefix
+   *          the prefix of the document column family (colf will be of the form docColfPrefix\0doctype)
+   */
+  public static void setColfs(IteratorSetting is, String indexColf, String docColfPrefix) {
+    setIndexColf(is, indexColf);
+    setDocColfPrefix(is, docColfPrefix);
+  }
 }

Modified: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java Tue Nov 29 21:54:22 2011
@@ -34,7 +34,10 @@ import org.apache.hadoop.io.Text;
 import org.apache.log4j.Logger;
 
 /**
- * This iterator facilitates document-partitioned indexing. It expects a table structure of the following form:
+ * This iterator facilitates document-partitioned indexing. It involves grouping a set of documents together and indexing those documents into a single row of
+ * an Accumulo table. This allows a tablet server to perform boolean AND operations on terms in the index.
+ * 
+ * The table structure should have the following form:
  * 
  * row: shardID, colfam: term, colqual: docID
  * 
@@ -375,8 +378,8 @@ public class IntersectingIterator implem
     return "";
   }
   
-  public static final String columnFamiliesOptionName = "columnFamilies";
-  public static final String notFlagOptionName = "notFlag";
+  private static final String columnFamiliesOptionName = "columnFamilies";
+  private static final String notFlagOptionName = "notFlag";
   
   /**
    * to be made protected

Modified: incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java Tue Nov 29 21:54:22 2011
@@ -41,6 +41,10 @@ import org.apache.hadoop.io.WritableUtil
  * A Combiner that interprets Values as arrays of Longs and returns an array of element-wise sums.
  */
 public class SummingArrayCombiner extends TypedValueCombiner<List<Long>> {
+  public static final Encoder<List<Long>> FIXED_LONG_ARRAY_ENCODER = new FixedLongArrayEncoder();
+  public static final Encoder<List<Long>> VAR_LONG_ARRAY_ENCODER = new VarLongArrayEncoder();
+  public static final Encoder<List<Long>> STRING_ARRAY_ENCODER = new StringArrayEncoder();
+  
   private static final String TYPE = "type";
   private static final String CLASS_PREFIX = "class:";
   
@@ -101,10 +105,10 @@ public class SummingArrayCombiner extend
     } else {
       switch (Type.valueOf(options.get(TYPE))) {
         case VARNUM:
-          encoder = new VarNumArrayEncoder();
+          encoder = new VarLongArrayEncoder();
           return;
         case LONG:
-          encoder = new LongArrayEncoder();
+          encoder = new FixedLongArrayEncoder();
           return;
         case STRING:
           encoder = new StringArrayEncoder();
@@ -167,7 +171,7 @@ public class SummingArrayCombiner extend
     }
   }
   
-  public static class VarNumArrayEncoder extends DOSArrayEncoder<Long> {
+  public static class VarLongArrayEncoder extends DOSArrayEncoder<Long> {
     @Override
     public void write(DataOutputStream dos, Long v) throws IOException {
       WritableUtils.writeVLong(dos, v);
@@ -179,7 +183,7 @@ public class SummingArrayCombiner extend
     }
   }
   
-  public static class LongArrayEncoder extends DOSArrayEncoder<Long> {
+  public static class FixedLongArrayEncoder extends DOSArrayEncoder<Long> {
     @Override
     public void write(DataOutputStream dos, Long v) throws IOException {
       dos.writeLong(v);

Modified: incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/CombinerTest.java Tue Nov 29 21:54:22 2011
@@ -36,9 +36,9 @@ import org.apache.accumulo.core.iterator
 import org.apache.accumulo.core.iterators.Combiner.ValueIterator;
 import org.apache.accumulo.core.iterators.DefaultIteratorEnvironment;
 import org.apache.accumulo.core.iterators.LongCombiner;
-import org.apache.accumulo.core.iterators.LongCombiner.LongEncoder;
+import org.apache.accumulo.core.iterators.LongCombiner.FixedLenEncoder;
 import org.apache.accumulo.core.iterators.LongCombiner.StringEncoder;
-import org.apache.accumulo.core.iterators.LongCombiner.VarNumEncoder;
+import org.apache.accumulo.core.iterators.LongCombiner.VarLenEncoder;
 import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
 import org.apache.accumulo.core.iterators.SortedMapIterator;
 import org.apache.accumulo.core.iterators.TypedValueCombiner.Encoder;
@@ -78,13 +78,9 @@ public class CombinerTest {
     return new Text(String.format("r%03d", row));
   }
   
-  Encoder<Long> varNumEncoder = new LongCombiner.VarNumEncoder();
-  Encoder<Long> longEncoder = new LongCombiner.LongEncoder();
-  Encoder<Long> stringEncoder = new LongCombiner.StringEncoder();
-  
   @Test
   public void test1() throws IOException {
-    Encoder<Long> encoder = varNumEncoder;
+    Encoder<Long> encoder = LongCombiner.VAR_LEN_ENCODER;
     
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
@@ -149,7 +145,7 @@ public class CombinerTest {
   
   @Test
   public void test2() throws IOException {
-    Encoder<Long> encoder = varNumEncoder;
+    Encoder<Long> encoder = LongCombiner.VAR_LEN_ENCODER;
     
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
@@ -161,7 +157,7 @@ public class CombinerTest {
     Combiner ai = new SummingCombiner();
     
     IteratorSetting is = new IteratorSetting(1, SummingCombiner.class);
-    LongCombiner.setEncodingType(is, VarNumEncoder.class);
+    LongCombiner.setEncodingType(is, VarLenEncoder.class);
     Combiner.setColumns(is, Collections.singletonList(new IteratorSetting.Column("cf001")));
     
     ai.init(new SortedMapIterator(tm1), is.getProperties(), null);
@@ -211,7 +207,7 @@ public class CombinerTest {
   
   @Test
   public void test3() throws IOException {
-    Encoder<Long> encoder = longEncoder;
+    Encoder<Long> encoder = LongCombiner.FIXED_LEN_ENCODER;
     
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
@@ -227,7 +223,7 @@ public class CombinerTest {
     Combiner ai = new SummingCombiner();
     
     IteratorSetting is = new IteratorSetting(1, SummingCombiner.class);
-    LongCombiner.setEncodingType(is, LongEncoder.class.getName());
+    LongCombiner.setEncodingType(is, FixedLenEncoder.class.getName());
     Combiner.setColumns(is, Collections.singletonList(new IteratorSetting.Column("cf001")));
     
     ai.init(new SortedMapIterator(tm1), is.getProperties(), null);
@@ -277,7 +273,7 @@ public class CombinerTest {
   
   @Test
   public void test4() throws IOException {
-    Encoder<Long> encoder = stringEncoder;
+    Encoder<Long> encoder = LongCombiner.STRING_ENCODER;
     
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
@@ -352,7 +348,7 @@ public class CombinerTest {
   
   @Test
   public void test5() throws IOException {
-    Encoder<Long> encoder = stringEncoder;
+    Encoder<Long> encoder = LongCombiner.STRING_ENCODER;
     // try aggregating across multiple data sets that contain
     // the exact same keys w/ different values
     
@@ -387,7 +383,7 @@ public class CombinerTest {
   
   @Test
   public void test6() throws IOException {
-    Encoder<Long> encoder = varNumEncoder;
+    Encoder<Long> encoder = LongCombiner.VAR_LEN_ENCODER;
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
     // keys that aggregate
@@ -398,7 +394,7 @@ public class CombinerTest {
     Combiner ai = new SummingCombiner();
     
     IteratorSetting is = new IteratorSetting(1, SummingCombiner.class);
-    LongCombiner.setEncodingType(is, VarNumEncoder.class.getName());
+    LongCombiner.setEncodingType(is, VarLenEncoder.class.getName());
     Combiner.setColumns(is, Collections.singletonList(new IteratorSetting.Column("cf001")));
     
     ai.init(new SortedMapIterator(tm1), is.getProperties(), new DefaultIteratorEnvironment());
@@ -413,7 +409,7 @@ public class CombinerTest {
   
   @Test
   public void test7() throws IOException {
-    Encoder<Long> encoder = longEncoder;
+    Encoder<Long> encoder = LongCombiner.FIXED_LEN_ENCODER;
     
     // test that delete is not aggregated
     
@@ -475,7 +471,7 @@ public class CombinerTest {
   
   @Test
   public void maxMinTest() throws IOException {
-    Encoder<Long> encoder = varNumEncoder;
+    Encoder<Long> encoder = LongCombiner.VAR_LEN_ENCODER;
     
     TreeMap<Key,Value> tm1 = new TreeMap<Key,Value>();
     
@@ -590,8 +586,8 @@ public class CombinerTest {
   
   @Test
   public void sumArrayTest() throws IOException, InstantiationException, IllegalAccessException {
-    sumArray(SummingArrayCombiner.VarNumArrayEncoder.class, "VARNUM");
-    sumArray(SummingArrayCombiner.LongArrayEncoder.class, "LONG");
+    sumArray(SummingArrayCombiner.VarLongArrayEncoder.class, "VARNUM");
+    sumArray(SummingArrayCombiner.FixedLongArrayEncoder.class, "LONG");
     sumArray(SummingArrayCombiner.StringArrayEncoder.class, "STRING");
   }
 }

Copied: incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IndexedDocIteratorTest.java (from r1208023, incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/FamilyIntersectingIteratorTest.java)
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IndexedDocIteratorTest.java?p2=incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IndexedDocIteratorTest.java&p1=incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/FamilyIntersectingIteratorTest.java&r1=1208023&r2=1208095&rev=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/FamilyIntersectingIteratorTest.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IndexedDocIteratorTest.java Tue Nov 29 21:54:22 2011
@@ -14,7 +14,7 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.accumulo.core.iterators;
+package org.apache.accumulo.core.iterators.user;
 
 import java.io.IOException;
 import java.util.ArrayList;
@@ -33,13 +33,15 @@ import org.apache.accumulo.core.data.Ran
 import org.apache.accumulo.core.data.Value;
 import org.apache.accumulo.core.file.rfile.RFileTest;
 import org.apache.accumulo.core.file.rfile.RFileTest.TestRFile;
+import org.apache.accumulo.core.iterators.DefaultIteratorEnvironment;
+import org.apache.accumulo.core.iterators.IteratorEnvironment;
+import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
 import org.apache.accumulo.core.iterators.system.MultiIterator;
-import org.apache.accumulo.core.iterators.user.IntersectingIterator;
 import org.apache.hadoop.io.Text;
 import org.apache.log4j.Level;
 import org.apache.log4j.Logger;
 
-public class FamilyIntersectingIteratorTest extends TestCase {
+public class IndexedDocIteratorTest extends TestCase {
   
   private static final Logger log = Logger.getLogger(IntersectingIterator.class);
   
@@ -53,7 +55,9 @@ public class FamilyIntersectingIteratorT
   Text[] otherColumnFamilies;
   
   static int docid = 0;
-  static Text docColf = new Text(FamilyIntersectingIterator.DEFAULT_DOC_COLF);
+  static String docColfPrefix = "doc";
+  static Text indexColf = new Text("index");
+  static Text docColf = new Text(docColfPrefix);
   
   static {
     log.setLevel(Level.OFF);
@@ -94,7 +98,7 @@ public class FamilyIntersectingIteratorT
             colq.append(doc.getBytes(), 0, doc.getLength());
             colq.append(nullByte, 0, 1);
             colq.append("stuff".getBytes(), 0, "stuff".length());
-            Key k = new Key(row, FamilyIntersectingIterator.DEFAULT_INDEX_COLF, colq);
+            Key k = new Key(row, indexColf, colq);
             map.put(k, v);
             sb.append(" ");
             sb.append(columnFamilies[j]);
@@ -115,7 +119,7 @@ public class FamilyIntersectingIteratorT
             colq.append(doc.getBytes(), 0, doc.getLength());
             colq.append(nullByte, 0, 1);
             colq.append("stuff".getBytes(), 0, "stuff".length());
-            Key k = new Key(row, FamilyIntersectingIterator.DEFAULT_INDEX_COLF, colq);
+            Key k = new Key(row, indexColf, colq);
             map.put(k, v);
             sb.append(" ");
             sb.append(cf);
@@ -148,9 +152,9 @@ public class FamilyIntersectingIteratorT
       if (entry.getKey().getColumnFamily().equals(docColf))
         trf.writer.append(entry.getKey(), entry.getValue());
     }
-    trf.writer.startNewLocalityGroup("terms", RFileTest.ncfs(FamilyIntersectingIterator.DEFAULT_INDEX_COLF.toString()));
+    trf.writer.startNewLocalityGroup("terms", RFileTest.ncfs(indexColf.toString()));
     for (Entry<Key,Value> entry : inMemoryMap.entrySet()) {
-      if (entry.getKey().getColumnFamily().equals(FamilyIntersectingIterator.DEFAULT_INDEX_COLF))
+      if (entry.getKey().getColumnFamily().equals(indexColf))
         trf.writer.append(entry.getKey(), entry.getValue());
     }
     
@@ -188,9 +192,10 @@ public class FamilyIntersectingIteratorT
     hitRatio = 0.5f;
     HashSet<Text> docs = new HashSet<Text>();
     SortedKeyValueIterator<Key,Value> source = createIteratorStack(NUM_ROWS, NUM_DOCIDS, columnFamilies, otherColumnFamilies, docs);
-    IteratorSetting is = new IteratorSetting(1, FamilyIntersectingIterator.class);
-    FamilyIntersectingIterator.setColumnFamilies(is, columnFamilies);
-    FamilyIntersectingIterator iter = new FamilyIntersectingIterator();
+    IteratorSetting is = new IteratorSetting(1, IndexedDocIterator.class);
+    IndexedDocIterator.setColumnFamilies(is, columnFamilies);
+    IndexedDocIterator.setColfs(is, indexColf.toString(), docColfPrefix);
+    IndexedDocIterator iter = new IndexedDocIterator();
     iter.init(source, is.getProperties(), env);
     iter.seek(new Range(), EMPTY_COL_FAMS, false);
     int hitCount = 0;
@@ -201,8 +206,9 @@ public class FamilyIntersectingIteratorT
       // System.out.println(k.toString());
       // System.out.println(iter.getDocID(k));
       
-      assertTrue(docs.contains(iter.getDocID(k)));
-      assertTrue(new String(v.get()).endsWith(" docID=" + iter.getDocID(k)));
+      Text d = IndexedDocIterator.parseDocID(k);
+      assertTrue(docs.contains(d));
+      assertTrue(new String(v.get()).endsWith(" docID=" + d));
       
       iter.next();
     }
@@ -224,9 +230,10 @@ public class FamilyIntersectingIteratorT
     hitRatio = 0.5f;
     HashSet<Text> docs = new HashSet<Text>();
     SortedKeyValueIterator<Key,Value> source = createIteratorStack(NUM_ROWS, NUM_DOCIDS, columnFamilies, otherColumnFamilies, docs);
-    IteratorSetting is = new IteratorSetting(1, FamilyIntersectingIterator.class);
-    FamilyIntersectingIterator.setColumnFamilies(is, columnFamilies);
-    FamilyIntersectingIterator iter = new FamilyIntersectingIterator();
+    IteratorSetting is = new IteratorSetting(1, IndexedDocIterator.class);
+    IndexedDocIterator.setColumnFamilies(is, columnFamilies);
+    IndexedDocIterator.setColfs(is, indexColf.toString(), docColfPrefix);
+    IndexedDocIterator iter = new IndexedDocIterator();
     iter.init(source, is.getProperties(), env);
     iter.seek(new Range(), EMPTY_COL_FAMS, false);
     int hitCount = 0;
@@ -234,8 +241,9 @@ public class FamilyIntersectingIteratorT
       hitCount++;
       Key k = iter.getTopKey();
       Value v = iter.getTopValue();
-      assertTrue(docs.contains(iter.getDocID(k)));
-      assertTrue(new String(v.get()).endsWith(" docID=" + iter.getDocID(k)));
+      Text d = IndexedDocIterator.parseDocID(k);
+      assertTrue(docs.contains(d));
+      assertTrue(new String(v.get()).endsWith(" docID=" + d));
       iter.next();
     }
     assertEquals(hitCount, docs.size());
@@ -264,9 +272,10 @@ public class FamilyIntersectingIteratorT
     sourceIters.add(source);
     sourceIters.add(source2);
     MultiIterator mi = new MultiIterator(sourceIters, false);
-    IteratorSetting is = new IteratorSetting(1, FamilyIntersectingIterator.class);
-    FamilyIntersectingIterator.setColumnFamilies(is, columnFamilies);
-    FamilyIntersectingIterator iter = new FamilyIntersectingIterator();
+    IteratorSetting is = new IteratorSetting(1, IndexedDocIterator.class);
+    IndexedDocIterator.setColumnFamilies(is, columnFamilies);
+    IndexedDocIterator.setColfs(is, indexColf.toString(), docColfPrefix);
+    IndexedDocIterator iter = new IndexedDocIterator();
     iter.init(mi, is.getProperties(), env);
     iter.seek(new Range(), EMPTY_COL_FAMS, false);
     int hitCount = 0;
@@ -274,8 +283,9 @@ public class FamilyIntersectingIteratorT
       hitCount++;
       Key k = iter.getTopKey();
       Value v = iter.getTopValue();
-      assertTrue(docs.contains(iter.getDocID(k)));
-      assertTrue(new String(v.get()).endsWith(" docID=" + iter.getDocID(k)));
+      Text d = IndexedDocIterator.parseDocID(k);
+      assertTrue(docs.contains(d));
+      assertTrue(new String(v.get()).endsWith(" docID=" + d));
       iter.next();
     }
     assertEquals(hitCount, docs.size());
@@ -303,9 +313,10 @@ public class FamilyIntersectingIteratorT
     hitRatio = 0.5f;
     HashSet<Text> docs = new HashSet<Text>();
     SortedKeyValueIterator<Key,Value> source = createIteratorStack(NUM_ROWS, NUM_DOCIDS, columnFamilies, otherColumnFamilies, docs, negatedColumns);
-    IteratorSetting is = new IteratorSetting(1, FamilyIntersectingIterator.class);
-    FamilyIntersectingIterator.setColumnFamilies(is, columnFamilies, notFlags);
-    FamilyIntersectingIterator iter = new FamilyIntersectingIterator();
+    IteratorSetting is = new IteratorSetting(1, IndexedDocIterator.class);
+    IndexedDocIterator.setColumnFamilies(is, columnFamilies, notFlags);
+    IndexedDocIterator.setColfs(is, indexColf.toString(), docColfPrefix);
+    IndexedDocIterator iter = new IndexedDocIterator();
     iter.init(source, is.getProperties(), env);
     iter.seek(new Range(), EMPTY_COL_FAMS, false);
     int hitCount = 0;
@@ -313,8 +324,9 @@ public class FamilyIntersectingIteratorT
       hitCount++;
       Key k = iter.getTopKey();
       Value v = iter.getTopValue();
-      assertTrue(docs.contains(iter.getDocID(k)));
-      assertTrue(new String(v.get()).endsWith(" docID=" + iter.getDocID(k)));
+      Text d = IndexedDocIterator.parseDocID(k);
+      assertTrue(docs.contains(d));
+      assertTrue(new String(v.get()).endsWith(" docID=" + d));
       iter.next();
     }
     assertTrue(hitCount == docs.size());

Copied: incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest.java (from r1208023, incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/IntersectingIteratorTest.java)
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest.java?p2=incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest.java&p1=incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/IntersectingIteratorTest.java&r1=1208023&r2=1208095&rev=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/IntersectingIteratorTest.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/IntersectingIteratorTest.java Tue Nov 29 21:54:22 2011
@@ -14,24 +14,37 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.accumulo.core.iterators;
+package org.apache.accumulo.core.iterators.user;
 
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collection;
+import java.util.Collections;
 import java.util.HashSet;
+import java.util.Iterator;
+import java.util.Map.Entry;
 import java.util.Random;
 import java.util.TreeMap;
 
+import junit.framework.Assert;
 import junit.framework.TestCase;
 
+import org.apache.accumulo.core.Constants;
+import org.apache.accumulo.core.client.BatchScanner;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.Connector;
 import org.apache.accumulo.core.client.IteratorSetting;
+import org.apache.accumulo.core.client.mock.MockInstance;
 import org.apache.accumulo.core.data.ByteSequence;
 import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
 import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.iterators.DefaultIteratorEnvironment;
+import org.apache.accumulo.core.iterators.IteratorEnvironment;
+import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
+import org.apache.accumulo.core.iterators.SortedMapIterator;
 import org.apache.accumulo.core.iterators.system.MultiIterator;
-import org.apache.accumulo.core.iterators.user.IntersectingIterator;
 import org.apache.hadoop.io.Text;
 import org.apache.log4j.Level;
 import org.apache.log4j.Logger;
@@ -257,4 +270,29 @@ public class IntersectingIteratorTest ex
     assertTrue(hitCount == docs.size());
     cleanup();
   }
+  
+  public void testWithBatchScanner() throws Exception {
+    Value empty = new Value(new byte[] {});
+    MockInstance inst = new MockInstance("mockabye");
+    Connector connector = inst.getConnector("user", "pass");
+    connector.tableOperations().create("index");
+    BatchWriter bw = connector.createBatchWriter("index", 1000, 1000, 1);
+    Mutation m = new Mutation("000012");
+    m.put("rvy", "5000000000000000", empty);
+    m.put("15qh", "5000000000000000", empty);
+    bw.addMutation(m);
+    bw.close();
+    
+    BatchScanner bs = connector.createBatchScanner("index", Constants.NO_AUTHS, 10);
+    IteratorSetting ii = new IteratorSetting(20, IntersectingIterator.class);
+    IntersectingIterator.setColumnFamilies(ii, new Text[] {new Text("rvy"), new Text("15qh")});
+    bs.addScanIterator(ii);
+    bs.setRanges(Collections.singleton(new Range()));
+    Iterator<Entry<Key,Value>> iterator = bs.iterator();
+    Assert.assertTrue(iterator.hasNext());
+    Entry<Key,Value> next = iterator.next();
+    Key key = next.getKey();
+    Assert.assertEquals(key.getColumnQualifier(), new Text("5000000000000000"));
+    Assert.assertFalse(iterator.hasNext());
+  }
 }

Modified: incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/RowDeletingIteratorTest.java Tue Nov 29 21:54:22 2011
@@ -30,12 +30,10 @@ import org.apache.accumulo.core.data.Key
 import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
 import org.apache.accumulo.core.iterators.IteratorEnvironment;
-import org.apache.accumulo.core.iterators.IteratorUtil;
+import org.apache.accumulo.core.iterators.IteratorUtil.IteratorScope;
 import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
 import org.apache.accumulo.core.iterators.SortedMapIterator;
-import org.apache.accumulo.core.iterators.IteratorUtil.IteratorScope;
 import org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator;
-import org.apache.accumulo.core.iterators.user.RowDeletingIterator;
 import org.apache.hadoop.io.Text;
 
 public class RowDeletingIteratorTest extends TestCase {

Modified: incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/VersioningIteratorTest.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/VersioningIteratorTest.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/VersioningIteratorTest.java (original)
+++ incubator/accumulo/branches/1.4/src/core/src/test/java/org/apache/accumulo/core/iterators/user/VersioningIteratorTest.java Tue Nov 29 21:54:22 2011
@@ -37,7 +37,7 @@ import org.apache.hadoop.io.Text;
 public class VersioningIteratorTest extends TestCase {
   // add test for seek function
   private static final Collection<ByteSequence> EMPTY_COL_FAMS = new ArrayList<ByteSequence>();
-  private static final Encoder<Long> encoder = new LongCombiner.LongEncoder();
+  private static final Encoder<Long> encoder = LongCombiner.FIXED_LEN_ENCODER;
   
   void createTestData(TreeMap<Key,Value> tm, Text colf, Text colq) {
     for (int i = 0; i < 2; i++) {

Modified: incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java (original)
+++ incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java Tue Nov 29 21:54:22 2011
@@ -26,7 +26,8 @@ import org.apache.accumulo.core.client.C
 import org.apache.accumulo.core.client.ZooKeeperInstance;
 import org.apache.accumulo.core.data.Mutation;
 import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.core.iterators.aggregation.LongSummation;
+import org.apache.accumulo.core.iterators.LongCombiner;
+import org.apache.accumulo.core.iterators.TypedValueCombiner.Encoder;
 import org.apache.accumulo.core.security.Authorizations;
 import org.apache.accumulo.core.security.ColumnVisibility;
 import org.apache.accumulo.examples.filedata.FileDataIngest;
@@ -39,6 +40,7 @@ public class Ingest {
   public static final String EXEC_CQ = "exec";
   public static final String LASTMOD_CQ = "lastmod";
   public static final String HASH_CQ = "md5";
+  public static final Encoder<Long> encoder = LongCombiner.FIXED_LEN_ENCODER;
   
   public static Mutation buildMutation(ColumnVisibility cv, String path, boolean isDir, boolean isHidden, boolean canExec, long length, long lastmod,
       String hash) {
@@ -49,7 +51,7 @@ public class Ingest {
     if (isDir)
       colf = QueryUtil.DIR_COLF;
     else
-      colf = new Text(LongSummation.longToBytes(Long.MAX_VALUE - lastmod));
+      colf = new Text(encoder.encode(Long.MAX_VALUE - lastmod));
     m.put(colf, new Text(LENGTH_CQ), cv, new Value(Long.toString(length).getBytes()));
     m.put(colf, new Text(HIDDEN_CQ), cv, new Value(Boolean.toString(isHidden).getBytes()));
     m.put(colf, new Text(EXEC_CQ), cv, new Value(Boolean.toString(canExec).getBytes()));

Modified: incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java (original)
+++ incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java Tue Nov 29 21:54:22 2011
@@ -16,7 +16,6 @@
  */
 package org.apache.accumulo.examples.dirlist;
 
-import java.io.IOException;
 import java.util.Map;
 import java.util.Map.Entry;
 import java.util.TreeMap;
@@ -31,7 +30,6 @@ import org.apache.accumulo.core.client.Z
 import org.apache.accumulo.core.data.Key;
 import org.apache.accumulo.core.data.Range;
 import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.core.iterators.aggregation.LongSummation;
 import org.apache.accumulo.core.iterators.user.RegExFilter;
 import org.apache.accumulo.core.security.Authorizations;
 import org.apache.hadoop.io.Text;
@@ -93,11 +91,7 @@ public class QueryUtil {
   public static String getType(Text colf) {
     if (colf.equals(DIR_COLF))
       return colf.toString() + ":";
-    try {
-      return Long.toString(LongSummation.bytesToLong(colf.getBytes())) + ":";
-    } catch (IOException e) {
-      return colf.toString() + ":";
-    }
+    return Long.toString(Ingest.encoder.decode(colf.getBytes())) + ":";
   }
   
   public Map<String,String> getData(String path) throws TableNotFoundException {

Modified: incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/shard/Query.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/shard/Query.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/shard/Query.java (original)
+++ incubator/accumulo/branches/1.4/src/examples/src/main/java/org/apache/accumulo/examples/shard/Query.java Tue Nov 29 21:54:22 2011
@@ -63,7 +63,7 @@ public class Query {
       columns[i - 5] = new Text(args[i]);
     }
     IteratorSetting ii = new IteratorSetting(20, "ii", IntersectingIterator.class);
-    ii.addOption(IntersectingIterator.columnFamiliesOptionName, IntersectingIterator.encodeColumns(columns));
+    IntersectingIterator.setColumnFamilies(ii, columns);
     bs.addScanIterator(ii);
     bs.setRanges(Collections.singleton(new Range()));
     for (Entry<Key,Value> entry : bs) {

Modified: incubator/accumulo/branches/1.4/src/examples/src/test/java/org/apache/accumulo/examples/dirlist/CountTest.java
URL: http://svn.apache.org/viewvc/incubator/accumulo/branches/1.4/src/examples/src/test/java/org/apache/accumulo/examples/dirlist/CountTest.java?rev=1208095&r1=1208094&r2=1208095&view=diff
==============================================================================
--- incubator/accumulo/branches/1.4/src/examples/src/test/java/org/apache/accumulo/examples/dirlist/CountTest.java (original)
+++ incubator/accumulo/branches/1.4/src/examples/src/test/java/org/apache/accumulo/examples/dirlist/CountTest.java Tue Nov 29 21:54:22 2011
@@ -16,40 +16,21 @@
  */
 package org.apache.accumulo.examples.dirlist;
 
-import java.io.IOException;
 import java.util.ArrayList;
-import java.util.Map;
 import java.util.Map.Entry;
-import java.util.TreeMap;
 
 import junit.framework.TestCase;
 
 import org.apache.accumulo.core.client.BatchWriter;
 import org.apache.accumulo.core.client.Connector;
 import org.apache.accumulo.core.client.Scanner;
-import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
-import org.apache.accumulo.core.client.mapreduce.InputFormatBase.RangeInputSplit;
 import org.apache.accumulo.core.client.mock.MockInstance;
 import org.apache.accumulo.core.data.Key;
 import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.core.iterators.aggregation.Aggregator;
 import org.apache.accumulo.core.security.Authorizations;
 import org.apache.accumulo.core.security.ColumnVisibility;
 import org.apache.accumulo.core.util.Pair;
-import org.apache.accumulo.examples.dirlist.FileCount;
-import org.apache.accumulo.examples.dirlist.FileCountMR;
-import org.apache.accumulo.examples.dirlist.Ingest;
-import org.apache.accumulo.examples.dirlist.QueryUtil;
-import org.apache.accumulo.examples.dirlist.StringArraySummation;
-import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.io.Text;
-import org.apache.hadoop.mapreduce.JobContext;
-import org.apache.hadoop.mapreduce.JobID;
-import org.apache.hadoop.mapreduce.Mapper;
-import org.apache.hadoop.mapreduce.RecordReader;
-import org.apache.hadoop.mapreduce.RecordWriter;
-import org.apache.hadoop.mapreduce.TaskAttemptContext;
-import org.apache.hadoop.mapreduce.TaskAttemptID;
 
 public class CountTest extends TestCase {
   {
@@ -74,65 +55,7 @@ public class CountTest extends TestCase 
     }
   }
   
-  public static class AggregatingMap extends TreeMap<Key,Value> {
-    private static final long serialVersionUID = -6644406149713336633L;
-    private Aggregator agg;
-    
-    public AggregatingMap(Aggregator agg) {
-      this.agg = agg;
-    }
-    
-    @Override
-    public Value put(Key key, Value value) {
-      if (!this.containsKey(key))
-        return super.put(key, value);
-      agg.reset();
-      agg.collect(value);
-      agg.collect(this.get(key));
-      return super.put(key, agg.aggregate());
-    }
-  }
-  
   public void test() throws Exception {
-    JobContext job = new JobContext(new Configuration(), new JobID());
-    AccumuloInputFormat.setInputInfo(job, "root", "".getBytes(), "dirlisttable", new Authorizations());
-    AccumuloInputFormat.setMockInstance(job, "counttest");
-    AccumuloInputFormat cif = new AccumuloInputFormat();
-    RangeInputSplit ris = new RangeInputSplit();
-    TaskAttemptContext tac = new TaskAttemptContext(job.getConfiguration(), new TaskAttemptID());
-    RecordReader<Key,Value> rr = cif.createRecordReader(ris, tac);
-    rr.initialize(ris, tac);
-    FileCountMR.FileCountMapper mapper = new FileCountMR.FileCountMapper();
-    RecordWriter<Key,Value> rw = new RecordWriter<Key,Value>() {
-      Map<Key,Value> aggmap = new AggregatingMap(new StringArraySummation());
-      
-      @Override
-      public void write(Key key, Value value) throws IOException, InterruptedException {
-        aggmap.put(key, value);
-      }
-      
-      @Override
-      public void close(TaskAttemptContext context) throws IOException, InterruptedException {
-        ArrayList<Pair<String,String>> expected = new ArrayList<Pair<String,String>>();
-        expected.add(new Pair<String,String>("", "1,0,3,3"));
-        expected.add(new Pair<String,String>("/local", "2,1,2,3"));
-        expected.add(new Pair<String,String>("/local/user1", "0,2,0,2"));
-        
-        int i = 0;
-        for (Entry<Key,Value> e : aggmap.entrySet()) {
-          assertEquals(e.getKey().getRow().toString(), expected.get(i).getFirst());
-          assertEquals(e.getValue().toString(), expected.get(i).getSecond());
-          i++;
-        }
-        assertEquals(aggmap.entrySet().size(), expected.size());
-      }
-    };
-    Mapper<Key,Value,Key,Value>.Context context = mapper.new Context(job.getConfiguration(), new TaskAttemptID(), rr, rw, null, null, ris);
-    mapper.run(context);
-    rw.close(context);
-  }
-  
-  public void test2() throws Exception {
     Scanner scanner = new MockInstance("counttest").getConnector("root", "".getBytes()).createScanner("dirlisttable", new Authorizations());
     scanner.fetchColumn(new Text("dir"), new Text("counts"));
     assertFalse(scanner.iterator().hasNext());



Mime
View raw message