lucene-dev mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Re: svn commit: r413201 [1/2] - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/swing/src/java/org/apache/lucene/swing/models/ src/java/org/apache/lucene/analysis/ src/java/org/apache/lucene/document/ src/java/org/...
Date Sat, 10 Jun 2006 10:23:40 GMT
oops, will take care of it today!  It wouldn't be right if my first real 
commit went through with no problems!  :-)

Otis Gospodnetic wrote:
> Nice and meaty :)
> I saw CNLP copyright notices in javadocs.  I think those are from your IntelliJ templates and have to be removed or some peeps will get mad at us...
>
> Otis
>
> ----- Original Message ----
> From: gsingers@apache.org
> To: java-commits@lucene.apache.org
> Sent: Friday, June 9, 2006 9:23:24 PM
> Subject: svn commit: r413201 [1/2] - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/swing/src/java/org/apache/lucene/swing/models/ src/java/org/apache/lucene/analysis/ src/java/org/apache/lucene/document/ src/java/org/...
>
> Author: gsingers
> Date: Fri Jun  9 18:23:22 2006
> New Revision: 413201
>
> URL: http://svn.apache.org/viewvc?rev=413201&view=rev
> Log:
> Implementation of Issue 545.  Introduction of new Fieldable interface (extracted from Field) which is now used where Field used to be used.  Field now implements Fieldable.
> Added new method to IndexReader and derived classes for working with the new FieldSelector interface.  The FieldSelector interface defines a mechanism for doing lazy loading, amongst other things.  Implemented Lazy loading of fields in the FieldsReader class.  Added test case to TestFieldsReader.java
>
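The lazy loading described in the log message can be pictured with a small, self-contained sketch. It does not use the Lucene classes from this commit: `Decision`, `Selector`, `StoredField`, `FakeStore` and `load` are hypothetical stand-ins, invented here only to show the idea of a selector that decides per field name whether a stored value is read immediately, read on first access, or skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
final class LazyLoadingSketch {

  /** What a selector may answer for a field name (names are assumptions). */
  enum Decision { LOAD, LAZY_LOAD, NO_LOAD }

  /** Decides, per field name, how a stored field should be loaded. */
  interface Selector {
    Decision accept(String fieldName);
  }

  /** A stored field whose value is read on first access when it is lazy. */
  static final class StoredField {
    private final String name;
    private final boolean lazy;
    private Supplier<String> source; // dropped once the value has been read
    private String value;

    StoredField(String name, boolean lazy, Supplier<String> source) {
      this.name = name;
      this.lazy = lazy;
      this.source = source;
      if (!lazy) {
        stringValue(); // eager fields read their value right away
      }
    }

    String name() { return name; }

    boolean isLazy() { return lazy; }

    String stringValue() {
      if (source != null) {
        value = source.get();
        source = null;
      }
      return value;
    }
  }

  /** Pretend storage that counts reads, so the effect of laziness is visible. */
  static final class FakeStore {
    int reads = 0;

    String read(String fieldName) {
      reads++;
      return "value-of-" + fieldName;
    }
  }

  /** Builds the fields of one "document", honouring the selector's decisions. */
  static List<StoredField> load(FakeStore store, List<String> names, Selector selector) {
    List<StoredField> fields = new ArrayList<>();
    for (String fieldName : names) {
      Decision decision = selector.accept(fieldName);
      if (decision == Decision.NO_LOAD) {
        continue; // field is skipped entirely
      }
      boolean lazy = decision == Decision.LAZY_LOAD;
      fields.add(new StoredField(fieldName, lazy, () -> store.read(fieldName)));
    }
    return fields;
  }
}
```

In this sketch a caller that only needs a title can select it with `LOAD`, mark a large body as `LAZY_LOAD`, and pay for reading the body only if `stringValue()` is actually called on it.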
> Added:
>     lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java
>     lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java   (with props)
>     lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java   (with props)
> Modified:
>     lucene/java/trunk/CHANGES.txt
>     lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java
>     lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java
>     lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java
>     lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java
>     lucene/java/trunk/src/java/org/apache/lucene/document/Document.java
>     lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/ParallelReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/SegmentReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsReader.java
>     lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsWriter.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/FieldCacheImpl.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/FieldDocSortedHitQueue.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/FieldSortedHitQueue.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/Similarity.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/Sort.java
>     lucene/java/trunk/src/java/org/apache/lucene/search/SortComparatorSource.java
>     lucene/java/trunk/src/java/org/apache/lucene/store/IndexInput.java
>     lucene/java/trunk/src/test/org/apache/lucene/document/TestBinaryDocument.java
>     lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/DocHelper.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestDocumentWriter.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestFieldsReader.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexInput.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexModifier.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestParallelReader.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestSegmentMerger.java
>     lucene/java/trunk/src/test/org/apache/lucene/index/TestSegmentReader.java
>     lucene/java/trunk/src/test/org/apache/lucene/search/TestDocBoost.java
>     lucene/java/trunk/src/test/org/apache/lucene/search/TestMultiThreadTermVectors.java
>     lucene/java/trunk/src/test/org/apache/lucene/search/TestPhraseQuery.java
>     lucene/java/trunk/src/test/org/apache/lucene/search/TestSetNorm.java
>
> Modified: lucene/java/trunk/CHANGES.txt
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/CHANGES.txt (original)
> +++ lucene/java/trunk/CHANGES.txt Fri Jun  9 18:23:22 2006
> @@ -9,6 +9,9 @@
>   1. LUCENE-503: New ThaiAnalyzer and ThaiWordFilter in contrib/analyzers
>      (Samphan Raruenrom va Chris Hostetter)
>  
> + 2. LUCENE-545: New FieldSelector API and associated changes to IndexReader and implementations.
> +    New Fieldable interface for use with the lazy field loading mechanism. (Grant Ingersoll and Chuck Williams via Grant Ingersoll)
> +
>  API Changes
>  
>   1. LUCENE-438: Remove "final" from Token, implement Cloneable, allow
>
> Modified: lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java (original)
> +++ lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java Fri Jun  9 18:23:22 2006
> @@ -16,20 +16,11 @@
>   * limitations under the License.
>   */
>  
> -import java.io.IOException;
> -import java.io.Serializable;
> -import java.util.Arrays;
> -import java.util.Collection;
> -import java.util.Collections;
> -import java.util.Comparator;
> -import java.util.HashMap;
> -import java.util.Iterator;
> -import java.util.Map;
> -
>  import org.apache.lucene.analysis.Analyzer;
>  import org.apache.lucene.analysis.Token;
>  import org.apache.lucene.analysis.TokenStream;
>  import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.FieldSelector;
>  import org.apache.lucene.index.IndexReader;
>  import org.apache.lucene.index.Term;
>  import org.apache.lucene.index.TermDocs;
> @@ -43,6 +34,16 @@
>  import org.apache.lucene.search.Searcher;
>  import org.apache.lucene.search.Similarity;
>  
> +import java.io.IOException;
> +import java.io.Serializable;
> +import java.util.Arrays;
> +import java.util.Collection;
> +import java.util.Collections;
> +import java.util.Comparator;
> +import java.util.HashMap;
> +import java.util.Iterator;
> +import java.util.Map;
> +
>  /**
>   * High-performance single-document main memory Apache Lucene fulltext search index. 
>   * 
> @@ -1004,8 +1005,14 @@
>              if (DEBUG) System.err.println("MemoryIndexReader.document");
>              return new Document(); // there are no stored fields
>          }
> -    
> -        public boolean isDeleted(int n) {
> +
> +    //When we convert to JDK 1.5 make this Set<String>
> +    public Document document(int n, FieldSelector fieldSelector) throws IOException {
> +      if (DEBUG) System.err.println("MemoryIndexReader.document");
> +            return new Document(); // there are no stored fields
> +    }
> +
> +    public boolean isDeleted(int n) {
>              if (DEBUG) System.err.println("MemoryIndexReader.isDeleted");
>              return false;
>          }
>
> Modified: lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java (original)
> +++ lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java Fri Jun  9 18:23:22 2006
> @@ -22,6 +22,7 @@
>  import org.apache.lucene.index.IndexWriter;
>  import org.apache.lucene.document.Document;
>  import org.apache.lucene.document.Field;
> +import org.apache.lucene.document.Fieldable;
>  import org.apache.lucene.search.IndexSearcher;
>  import org.apache.lucene.search.Query;
>  import org.apache.lucene.search.Hits;
> @@ -190,7 +191,7 @@
>              //tabble model row that we are mapping to
>              for (int t=0; t<hits.length(); t++){
>                  Document document = hits.doc(t);
> -                Field field = document.getField(ROW_NUMBER);
> +                Fieldable field = document.getField(ROW_NUMBER);
>                  rowToModelIndex.add(new Integer(field.stringValue()));
>              }
>          } catch (Exception e){
>
> Modified: lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java (original)
> +++ lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java Fri Jun  9 18:23:22 2006
> @@ -16,26 +16,23 @@
>   * limitations under the License.
>   */
>  
> -import org.apache.lucene.store.RAMDirectory;
> +import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.analysis.WhitespaceAnalyzer;
>  import org.apache.lucene.document.Document;
>  import org.apache.lucene.document.Field;
> -import org.apache.lucene.analysis.WhitespaceAnalyzer;
> -import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.document.Fieldable;
>  import org.apache.lucene.index.IndexWriter;
> +import org.apache.lucene.queryParser.MultiFieldQueryParser;
> +import org.apache.lucene.search.Hits;
>  import org.apache.lucene.search.IndexSearcher;
>  import org.apache.lucene.search.Query;
> -import org.apache.lucene.search.Hits;
> -import org.apache.lucene.queryParser.MultiFieldQueryParser;
> -
> -import java.awt.*;
> -import java.awt.event.*;
> -import java.util.*;
> -import java.util.List;
> +import org.apache.lucene.store.RAMDirectory;
>  
> -import javax.swing.*;
>  import javax.swing.event.TableModelEvent;
>  import javax.swing.event.TableModelListener;
> -import javax.swing.table.*;
> +import javax.swing.table.AbstractTableModel;
> +import javax.swing.table.TableModel;
> +import java.util.ArrayList;
>  
>  
>  /**
> @@ -275,7 +272,7 @@
>              //tabble model row that we are mapping to
>              for (int t=0; t<hits.length(); t++){
>                  Document document = hits.doc(t);
> -                Field field = document.getField(ROW_NUMBER);
> +                Fieldable field = document.getField(ROW_NUMBER);
>                  rowToModelIndex.add(new Integer(field.stringValue()));
>              }
>          } catch (Exception e){
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java Fri Jun  9 18:23:22 2006
> @@ -38,16 +38,16 @@
>  
>  
>    /**
> -   * Invoked before indexing a Field instance if
> +   * Invoked before indexing a Fieldable instance if
>     * terms have already been added to that field.  This allows custom
>     * analyzers to place an automatic position increment gap between
> -   * Field instances using the same field name.  The default value
> +   * Fieldable instances using the same field name.  The default value
>     * position increment gap is 0.  With a 0 position increment gap and
>     * the typical default token position increment of 1, all terms in a field,
> -   * including across Field instances, are in successive positions, allowing
> -   * exact PhraseQuery matches, for instance, across Field instance boundaries.
> +   * including across Fieldable instances, are in successive positions, allowing
> +   * exact PhraseQuery matches, for instance, across Fieldable instance boundaries.
>     *
> -   * @param fieldName Field name being indexed.
> +   * @param fieldName Fieldable name being indexed.
>     * @return position increment gap, added to the next token emitted from {@link #tokenStream(String,Reader)}
>     */
>    public int getPositionIncrementGap(String fieldName)
>
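The Javadoc above describes how the position increment gap interacts with the default token increment of 1. A minimal sketch of that arithmetic follows; `PositionGapSketch` and `positions` are hypothetical names used only for illustration, and each token is assumed to have a position increment of exactly 1.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a simplified model of position assignment, not Lucene code.
final class PositionGapSketch {

  /**
   * Assigns a position to every token of several field instances that share
   * one field name. The gap is added before an instance only if tokens have
   * already been added for that field, as the Javadoc describes.
   */
  static List<Integer> positions(List<String[]> instances, int gap) {
    List<Integer> out = new ArrayList<>();
    int position = -1;          // so that the very first token lands on 0
    boolean seenTokens = false; // true once any token has been added
    for (String[] tokens : instances) {
      if (seenTokens) {
        position += gap;
      }
      for (String token : tokens) {
        position += 1;          // assumed default token position increment
        out.add(position);
        seenTokens = true;
      }
    }
    return out;
  }
}
```

For two instances holding the tokens `quick brown` and `fox`, a gap of 0 gives positions 0, 1, 2, so an exact phrase query for "brown fox" could match across the instance boundary. A gap of 100 gives 0, 1, 102, which keeps the two instances apart.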
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,274 @@
> +package org.apache.lucene.document;
> +/**
> + * Copyright 2006 The Apache Software Foundation
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +
> +/**
> + *
> + *
> + **/
> +public abstract class AbstractField implements Fieldable {
> +
> +  protected String name = "body";
> +  protected boolean storeTermVector = false;
> +  protected boolean storeOffsetWithTermVector = false;
> +  protected boolean storePositionWithTermVector = false;
> +  protected boolean omitNorms = false;
> +  protected boolean isStored = false;
> +  protected boolean isIndexed = true;
> +  protected boolean isTokenized = true;
> +  protected boolean isBinary = false;
> +  protected boolean isCompressed = false;
> +  protected boolean lazy = false;
> +  protected float boost = 1.0f;
> +  // the one and only data object for all different kind of field values
> +  protected Object fieldsData = null;
> +
> +  protected AbstractField()
> +  {
> +    
> +  }
> +
> +  protected AbstractField(String name, Field.Store store, Field.Index index, Field.TermVector termVector) {
> +    if (name == null)
> +      throw new NullPointerException("name cannot be null");
> +    this.name = name.intern();        // field names are interned
> +
> +    if (store == Field.Store.YES){
> +      this.isStored = true;
> +      this.isCompressed = false;
> +    }
> +    else if (store == Field.Store.COMPRESS) {
> +      this.isStored = true;
> +      this.isCompressed = true;
> +    }
> +    else if (store == Field.Store.NO){
> +      this.isStored = false;
> +      this.isCompressed = false;
> +    }
> +    else
> +      throw new IllegalArgumentException("unknown store parameter " + store);
> +
> +    if (index == Field.Index.NO) {
> +      this.isIndexed = false;
> +      this.isTokenized = false;
> +    } else if (index == Field.Index.TOKENIZED) {
> +      this.isIndexed = true;
> +      this.isTokenized = true;
> +    } else if (index == Field.Index.UN_TOKENIZED) {
> +      this.isIndexed = true;
> +      this.isTokenized = false;
> +    } else if (index == Field.Index.NO_NORMS) {
> +      this.isIndexed = true;
> +      this.isTokenized = false;
> +      this.omitNorms = true;
> +    } else {
> +      throw new IllegalArgumentException("unknown index parameter " + index);
> +    }
> +
> +    this.isBinary = false;
> +
> +    setStoreTermVector(termVector);
> +  }
> +
> +  /** Sets the boost factor hits on this field.  This value will be
> +   * multiplied into the score of all hits on this this field of this
> +   * document.
> +   *
> +   * <p>The boost is multiplied by {@link org.apache.lucene.document.Document#getBoost()} of the document
> +   * containing this field.  If a document has multiple fields with the same
> +   * name, all such values are multiplied together.  This product is then
> +   * multipled by the value {@link org.apache.lucene.search.Similarity#lengthNorm(String,int)}, and
> +   * rounded by {@link org.apache.lucene.search.Similarity#encodeNorm(float)} before it is stored in the
> +   * index.  One should attempt to ensure that this product does not overflow
> +   * the range of that encoding.
> +   *
> +   * @see org.apache.lucene.document.Document#setBoost(float)
> +   * @see org.apache.lucene.search.Similarity#lengthNorm(String, int)
> +   * @see org.apache.lucene.search.Similarity#encodeNorm(float)
> +   */
> +  public void setBoost(float boost) {
> +    this.boost = boost;
> +  }
> +
> +  /** Returns the boost factor for hits for this field.
> +   *
> +   * <p>The default value is 1.0.
> +   *
> +   * <p>Note: this value is not stored directly with the document in the index.
> +   * Documents returned from {@link org.apache.lucene.index.IndexReader#document(int)} and
> +   * {@link org.apache.lucene.search.Hits#doc(int)} may thus not have the same value present as when
> +   * this field was indexed.
> +   *
> +   * @see #setBoost(float)
> +   */
> +  public float getBoost() {
> +    return boost;
> +  }
> +
> +  /** Returns the name of the field as an interned string.
> +   * For example "date", "title", "body", ...
> +   */
> +  public String name()    { return name; }
> +
> +  protected void setStoreTermVector(Field.TermVector termVector) {
> +    if (termVector == Field.TermVector.NO) {
> +      this.storeTermVector = false;
> +      this.storePositionWithTermVector = false;
> +      this.storeOffsetWithTermVector = false;
> +    }
> +    else if (termVector == Field.TermVector.YES) {
> +      this.storeTermVector = true;
> +      this.storePositionWithTermVector = false;
> +      this.storeOffsetWithTermVector = false;
> +    }
> +    else if (termVector == Field.TermVector.WITH_POSITIONS) {
> +      this.storeTermVector = true;
> +      this.storePositionWithTermVector = true;
> +      this.storeOffsetWithTermVector = false;
> +    }
> +    else if (termVector == Field.TermVector.WITH_OFFSETS) {
> +      this.storeTermVector = true;
> +      this.storePositionWithTermVector = false;
> +      this.storeOffsetWithTermVector = true;
> +    }
> +    else if (termVector == Field.TermVector.WITH_POSITIONS_OFFSETS) {
> +      this.storeTermVector = true;
> +      this.storePositionWithTermVector = true;
> +      this.storeOffsetWithTermVector = true;
> +    }
> +    else {
> +      throw new IllegalArgumentException("unknown termVector parameter " + termVector);
> +    }
> +  }
> +
> +  /** True iff the value of the field is to be stored in the index for return
> +    with search hits.  It is an error for this to be true if a field is
> +    Reader-valued. */
> +  public final boolean  isStored()  { return isStored; }
> +
> +  /** True iff the value of the field is to be indexed, so that it may be
> +    searched on. */
> +  public final boolean  isIndexed()   { return isIndexed; }
> +
> +  /** True iff the value of the field should be tokenized as text prior to
> +    indexing.  Un-tokenized fields are indexed as a single word and may not be
> +    Reader-valued. */
> +  public final boolean  isTokenized()   { return isTokenized; }
> +
> +  /** True if the value of the field is stored and compressed within the index */
> +  public final boolean  isCompressed()   { return isCompressed; }
> +
> +  /** True iff the term or terms used to index this field are stored as a term
> +   *  vector, available from {@link org.apache.lucene.index.IndexReader#getTermFreqVector(int,String)}.
> +   *  These methods do not provide access to the original content of the field,
> +   *  only to terms used to index it. If the original content must be
> +   *  preserved, use the <code>stored</code> attribute instead.
> +   *
> +   * @see org.apache.lucene.index.IndexReader#getTermFreqVector(int, String)
> +   */
> +  public final boolean isTermVectorStored() { return storeTermVector; }
> +
> +  /**
> +   * True iff terms are stored as term vector together with their offsets 
> +   * (start and end positon in source text).
> +   */
> +  public boolean isStoreOffsetWithTermVector(){
> +    return storeOffsetWithTermVector;
> +  }
> +
> +  /**
> +   * True iff terms are stored as term vector together with their token positions.
> +   */
> +  public boolean isStorePositionWithTermVector(){
> +    return storePositionWithTermVector;
> +  }
> +
> +  /** True iff the value of the filed is stored as binary */
> +  public final boolean  isBinary()      { return isBinary; }
> +
> +  /** True if norms are omitted for this indexed field */
> +  public boolean getOmitNorms() { return omitNorms; }
> +
> +  /** Expert:
> +   *
> +   * If set, omit normalization factors associated with this indexed field.
> +   * This effectively disables indexing boosts and length normalization for this field.
> +   */
> +  public void setOmitNorms(boolean omitNorms) { this.omitNorms=omitNorms; }
> +
> +  public boolean isLazy() {
> +    return lazy;
> +  }
> +
> +  /** Prints a Field for human consumption. */
> +  public final String toString() {
> +    StringBuffer result = new StringBuffer();
> +    if (isStored) {
> +      result.append("stored");
> +      if (isCompressed)
> +        result.append("/compressed");
> +      else
> +        result.append("/uncompressed");
> +    }
> +    if (isIndexed) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("indexed");
> +    }
> +    if (isTokenized) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("tokenized");
> +    }
> +    if (storeTermVector) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("termVector");
> +    }
> +    if (storeOffsetWithTermVector) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("termVectorOffsets");
> +    }
> +    if (storePositionWithTermVector) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("termVectorPosition");
> +    }
> +    if (isBinary) {
> +      if (result.length() > 0)
> +        result.append(",");
> +      result.append("binary");
> +    }
> +    if (omitNorms) {
> +      result.append(",omitNorms");
> +    }
> +    if (lazy){
> +      result.append(",lazy");
> +    }
> +    result.append('<');
> +    result.append(name);
> +    result.append(':');
> +
> +    if (fieldsData != null && lazy == false) {
> +      result.append(fieldsData);
> +    }
> +
> +    result.append('>');
> +    return result.toString();
> +  }
> +}
>
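The `setBoost` Javadoc in AbstractField.java above says that the field boosts of all same-named fields are multiplied together with the document boost and the length norm before the result is encoded. A small worked example of that product is sketched below. `BoostSketch` and `unencodedNorm` are hypothetical names, the `1/sqrt(numTerms)` length norm is an assumption about the Similarity in use, and the lossy rounding done by `encodeNorm` is left out.

```java
// Illustration only: the product described in the setBoost Javadoc, before encoding.
final class BoostSketch {

  /** Assumed length norm of the form 1/sqrt(numTerms). */
  static float lengthNorm(int numTerms) {
    return (float) (1.0 / Math.sqrt(numTerms));
  }

  /**
   * Document boost times the boosts of all fields sharing one name,
   * times the length norm for the number of terms in that field.
   */
  static float unencodedNorm(float docBoost, float[] fieldBoosts, int numTerms) {
    float product = docBoost;
    for (float fieldBoost : fieldBoosts) {
      product *= fieldBoost;
    }
    return product * lengthNorm(numTerms);
  }
}
```

With a document boost of 2.0, two same-named fields boosted 1.5 and 2.0, and 4 terms in total, the product is 2.0 * 1.5 * 2.0 * 0.5 = 3.0. Values like this are what the Javadoc warns should stay within the range of the norm encoding.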
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Document.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Document.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Document.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Document.java Fri Jun  9 18:23:22 2006
> @@ -16,24 +16,21 @@
>   * limitations under the License.
>   */
>  
> -import java.util.Enumeration;
> -import java.util.Iterator;
> -import java.util.List;
> -import java.util.ArrayList;
> -import java.util.Vector;
> -import org.apache.lucene.index.IndexReader;       // for javadoc
> -import org.apache.lucene.search.Searcher;         // for javadoc
> -import org.apache.lucene.search.Hits;             // for javadoc
> +import org.apache.lucene.index.IndexReader;
> +import org.apache.lucene.search.Hits;
> +import org.apache.lucene.search.Searcher;
> +
> +import java.util.*;             // for javadoc
>  
>  /** Documents are the unit of indexing and search.
>   *
>   * A Document is a set of fields.  Each field has a name and a textual value.
> - * A field may be {@link Field#isStored() stored} with the document, in which
> + * A field may be {@link Fieldable#isStored() stored} with the document, in which
>   * case it is returned with search hits on the document.  Thus each document
>   * should typically contain one or more stored fields which uniquely identify
>   * it.
>   *
> - * <p>Note that fields which are <i>not</i> {@link Field#isStored() stored} are
> + * <p>Note that fields which are <i>not</i> {@link Fieldable#isStored() stored} are
>   * <i>not</i> available in documents retrieved from the index, e.g. with {@link
>   * Hits#doc(int)}, {@link Searcher#doc(int)} or {@link
>   * IndexReader#document(int)}.
> @@ -50,11 +47,11 @@
>    /** Sets a boost factor for hits on any field of this document.  This value
>     * will be multiplied into the score of all hits on this document.
>     *
> -   * <p>Values are multiplied into the value of {@link Field#getBoost()} of
> +   * <p>Values are multiplied into the value of {@link Fieldable#getBoost()} of
>     * each field in this document.  Thus, this method in effect sets a default
>     * boost for the fields of this document.
>     *
> -   * @see Field#setBoost(float)
> +   * @see Fieldable#setBoost(float)
>     */
>    public void setBoost(float boost) {
>      this.boost = boost;
> @@ -85,7 +82,7 @@
>     * a document has to be deleted from an index and a new changed version of that
>     * document has to be added.</p>
>     */
> -  public final void add(Field field) {
> +  public final void add(Fieldable field) {
>      fields.add(field);
>    }
>    
> @@ -102,7 +99,7 @@
>    public final void removeField(String name) {
>      Iterator it = fields.iterator();
>      while (it.hasNext()) {
> -      Field field = (Field)it.next();
> +      Fieldable field = (Fieldable)it.next();
>        if (field.name().equals(name)) {
>          it.remove();
>          return;
> @@ -122,7 +119,7 @@
>    public final void removeFields(String name) {
>      Iterator it = fields.iterator();
>      while (it.hasNext()) {
> -      Field field = (Field)it.next();
> +      Fieldable field = (Fieldable)it.next();
>        if (field.name().equals(name)) {
>          it.remove();
>        }
> @@ -133,9 +130,9 @@
>     * null.  If multiple fields exists with this name, this method returns the
>     * first value added.
>     */
> -  public final Field getField(String name) {
> +  public final Fieldable getField(String name) {
>      for (int i = 0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        if (field.name().equals(name))
>      return field;
>      }
> @@ -149,7 +146,7 @@
>     */
>    public final String get(String name) {
>      for (int i = 0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        if (field.name().equals(name) && (!field.isBinary()))
>          return field.stringValue();
>      }
> @@ -162,16 +159,16 @@
>    }
>  
>    /**
> -   * Returns an array of {@link Field}s with the given name.
> +   * Returns an array of {@link Fieldable}s with the given name.
>     * This method can return <code>null</code>.
>     *
>     * @param name the name of the field
> -   * @return a <code>Field[]</code> array
> +   * @return a <code>Fieldable[]</code> array
>     */
> -   public final Field[] getFields(String name) {
> +   public final Fieldable[] getFields(String name) {
>       List result = new ArrayList();
>       for (int i = 0; i < fields.size(); i++) {
> -       Field field = (Field)fields.get(i);
> +       Fieldable field = (Fieldable)fields.get(i);
>         if (field.name().equals(name)) {
>           result.add(field);
>         }
> @@ -180,7 +177,7 @@
>       if (result.size() == 0)
>         return null;
>  
> -     return (Field[])result.toArray(new Field[result.size()]);
> +     return (Fieldable[])result.toArray(new Fieldable[result.size()]);
>     }
>  
>    /**
> @@ -193,7 +190,7 @@
>    public final String[] getValues(String name) {
>      List result = new ArrayList();
>      for (int i = 0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        if (field.name().equals(name) && (!field.isBinary()))
>          result.add(field.stringValue());
>      }
> @@ -215,7 +212,7 @@
>    public final byte[][] getBinaryValues(String name) {
>      List result = new ArrayList();
>      for (int i = 0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        if (field.name().equals(name) && (field.isBinary()))
>          result.add(field.binaryValue());
>      }
> @@ -237,7 +234,7 @@
>    */
>    public final byte[] getBinaryValue(String name) {
>      for (int i=0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        if (field.name().equals(name) && (field.isBinary()))
>          return field.binaryValue();
>      }
> @@ -249,7 +246,7 @@
>      StringBuffer buffer = new StringBuffer();
>      buffer.append("Document<");
>      for (int i = 0; i < fields.size(); i++) {
> -      Field field = (Field)fields.get(i);
> +      Fieldable field = (Fieldable)fields.get(i);
>        buffer.append(field.toString());
>        if (i != fields.size()-1)
>          buffer.append(" ");
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Field.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Field.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Field.java Fri Jun  9 18:23:22 2006
> @@ -16,9 +16,6 @@
>   * limitations under the License.
>   */
>  
> -import org.apache.lucene.index.IndexReader;
> -import org.apache.lucene.search.Hits;
> -import org.apache.lucene.search.Similarity;
>  import org.apache.lucene.util.Parameter;
>  
>  import java.io.Reader;
> @@ -32,23 +29,7 @@
>    index, so that they may be returned with hits on the document.
>    */
>  
> -public final class Field implements Serializable {
> -  private String name = "body";
> -  
> -  // the one and only data object for all different kind of field values
> -  private Object fieldsData = null;
> -  
> -  private boolean storeTermVector = false;
> -  private boolean storeOffsetWithTermVector = false; 
> -  private boolean storePositionWithTermVector = false;
> -  private boolean omitNorms = false;
> -  private boolean isStored = false;
> -  private boolean isIndexed = true;
> -  private boolean isTokenized = true;
> -  private boolean isBinary = false;
> -  private boolean isCompressed = false;
> -  
> -  private float boost = 1.0f;
> +public final class Field extends AbstractField implements Fieldable, Serializable {
>    
>    /** Specifies whether and how a field should be stored. */
>    public static final class Store extends Parameter implements Serializable {
> @@ -146,45 +127,7 @@
>      public static final TermVector WITH_POSITIONS_OFFSETS = new TermVector("WITH_POSITIONS_OFFSETS");
>    }
>    
> -  /** Sets the boost factor hits on this field.  This value will be
> -   * multiplied into the score of all hits on this this field of this
> -   * document.
> -   *
> -   * <p>The boost is multiplied by {@link Document#getBoost()} of the document
> -   * containing this field.  If a document has multiple fields with the same
> -   * name, all such values are multiplied together.  This product is then
> -   * multipled by the value {@link Similarity#lengthNorm(String,int)}, and
> -   * rounded by {@link Similarity#encodeNorm(float)} before it is stored in the
> -   * index.  One should attempt to ensure that this product does not overflow
> -   * the range of that encoding.
> -   *
> -   * @see Document#setBoost(float)
> -   * @see Similarity#lengthNorm(String, int)
> -   * @see Similarity#encodeNorm(float)
> -   */
> -  public void setBoost(float boost) {
> -    this.boost = boost;
> -  }
> -
> -  /** Returns the boost factor for hits for this field.
> -   *
> -   * <p>The default value is 1.0.
> -   *
> -   * <p>Note: this value is not stored directly with the document in the index.
> -   * Documents returned from {@link IndexReader#document(int)} and
> -   * {@link Hits#doc(int)} may thus not have the same value present as when
> -   * this field was indexed.
> -   *
> -   * @see #setBoost(float)
> -   */
> -  public float getBoost() {
> -    return boost;
> -  }
> -  /** Returns the name of the field as an interned string.
> -   * For example "date", "title", "body", ...
> -   */
> -  public String name()    { return name; }
> -
> +  
>    /** The value of the field as a String, or null.  If null, the Reader value
>     * or binary value is used.  Exactly one of stringValue(), readerValue(), and
>     * binaryValue() must be set. */
> @@ -365,146 +308,6 @@
>      
>      setStoreTermVector(TermVector.NO);
>    }
> -  
> -  private void setStoreTermVector(TermVector termVector) {
> -    if (termVector == TermVector.NO) {
> -      this.storeTermVector = false;
> -      this.storePositionWithTermVector = false;
> -      this.storeOffsetWithTermVector = false;
> -    } 
> -    else if (termVector == TermVector.YES) {
> -      this.storeTermVector = true;
> -      this.storePositionWithTermVector = false;
> -      this.storeOffsetWithTermVector = false;
> -    }
> -    else if (termVector == TermVector.WITH_POSITIONS) {
> -      this.storeTermVector = true;
> -      this.storePositionWithTermVector = true;
> -      this.storeOffsetWithTermVector = false;
> -    } 
> -    else if (termVector == TermVector.WITH_OFFSETS) {
> -      this.storeTermVector = true;
> -      this.storePositionWithTermVector = false;
> -      this.storeOffsetWithTermVector = true;
> -    } 
> -    else if (termVector == TermVector.WITH_POSITIONS_OFFSETS) {
> -      this.storeTermVector = true;
> -      this.storePositionWithTermVector = true;
> -      this.storeOffsetWithTermVector = true;
> -    } 
> -    else {
> -      throw new IllegalArgumentException("unknown termVector parameter " + termVector);
> -    }
> -  }
> -  
> -  /** True iff the value of the field is to be stored in the index for return
> -    with search hits.  It is an error for this to be true if a field is
> -    Reader-valued. */
> -  public final boolean  isStored()  { return isStored; }
> -
> -  /** True iff the value of the field is to be indexed, so that it may be
> -    searched on. */
> -  public final boolean  isIndexed()   { return isIndexed; }
> -
> -  /** True iff the value of the field should be tokenized as text prior to
> -    indexing.  Un-tokenized fields are indexed as a single word and may not be
> -    Reader-valued. */
> -  public final boolean  isTokenized()   { return isTokenized; }
> -  
> -  /** True if the value of the field is stored and compressed within the index */
> -  public final boolean  isCompressed()   { return isCompressed; }
>  
> -  /** True iff the term or terms used to index this field are stored as a term
> -   *  vector, available from {@link IndexReader#getTermFreqVector(int,String)}.
> -   *  These methods do not provide access to the original content of the field,
> -   *  only to terms used to index it. If the original content must be
> -   *  preserved, use the <code>stored</code> attribute instead.
> -   *
> -   * @see IndexReader#getTermFreqVector(int, String)
> -   */
> -  public final boolean isTermVectorStored() { return storeTermVector; }
> -  
> -  /**
> -   * True iff terms are stored as term vector together with their offsets 
> -   * (start and end positon in source text).
> -   */
> -  public boolean isStoreOffsetWithTermVector(){ 
> -    return storeOffsetWithTermVector; 
> -  } 
> -  
> -  /**
> -   * True iff terms are stored as term vector together with their token positions.
> -   */
> -  public boolean isStorePositionWithTermVector(){ 
> -    return storePositionWithTermVector; 
> -  }
> -      
> -  /** True iff the value of the filed is stored as binary */
> -  public final boolean  isBinary()      { return isBinary; }
> -  
> -  /** True if norms are omitted for this indexed field */
> -  public boolean getOmitNorms() { return omitNorms; }
> -
> -  /** Expert:
> -   *
> -   * If set, omit normalization factors associated with this indexed field.
> -   * This effectively disables indexing boosts and length normalization for this field.
> -   */
> -  public void setOmitNorms(boolean omitNorms) { this.omitNorms=omitNorms; }
> -  
> -  /** Prints a Field for human consumption. */
> -  public final String toString() {
> -    StringBuffer result = new StringBuffer();
> -    if (isStored) {
> -      result.append("stored");
> -      if (isCompressed)
> -        result.append("/compressed");
> -      else
> -        result.append("/uncompressed");
> -    }
> -    if (isIndexed) {
> -      if (result.length() > 0)
> -        result.append(",");
> -      result.append("indexed");
> -    }
> -    if (isTokenized) {
> -      if (result.length() > 0)
> -        result.append(",");
> -      result.append("tokenized");
> -    }
> -    if (storeTermVector) {
> -      if (result.length() > 0)
> -        result.append(",");
> -      result.append("termVector");
> -    }
> -    if (storeOffsetWithTermVector) { 
> -      if (result.length() > 0) 
> -        result.append(","); 
> -      result.append("termVectorOffsets"); 
> -    } 
> -    if (storePositionWithTermVector) { 
> -      if (result.length() > 0) 
> -        result.append(","); 
> -      result.append("termVectorPosition"); 
> -    } 
> -    if (isBinary) {
> -      if (result.length() > 0)
> -        result.append(",");
> -      result.append("binary");
> -    }
> -    if (omitNorms) {
> -      result.append(",omitNorms");
> -    }
> -    result.append('<');
> -    result.append(name);
> -    result.append(':');
> -    
> -    if (fieldsData != null) {
> -      result.append(fieldsData);
> -    }
> -    
> -    result.append('>');
> -    return result.toString();
> -  }
>  
>  }
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,24 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 5:29:26 PM
> + * $Id:$
> + * Copyright 2005.  Center For Natural Language Processing
> + */
> +
> +/**
> + * Similar to a {@link java.io.FileFilter}, the FieldSelector allows one to make decisions about
> + * what Fields get loaded on a {@link Document} by {@link org.apache.lucene.index.IndexReader#document(int,org.apache.lucene.document.FieldSelector)}
> + *
> + **/
> +public interface FieldSelector {
> +
> +  /**
> +   * 
> +   * @param fieldName the name of the field to test
> +   * @return the {@link FieldSelectorResult} specifying whether and how the {@link Field} named <code>fieldName</code> should be loaded
> +   */
> +  FieldSelectorResult accept(String fieldName);
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,44 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 5:40:17 PM
> + * $Id:$
> + * Copyright 2005.  Center For Natural Language Processing
> + */
> +
> +/**
> + *  Provides information about what should be done with this Field 
> + *
> + **/
> +//Replace with an enumerated type in 1.5
> +public final class FieldSelectorResult {
> +
> +  public static final FieldSelectorResult LOAD = new FieldSelectorResult(0);
> +  public static final FieldSelectorResult LAZY_LOAD = new FieldSelectorResult(1);
> +  public static final FieldSelectorResult NO_LOAD = new FieldSelectorResult(2);
> +  public static final FieldSelectorResult LOAD_AND_BREAK = new FieldSelectorResult(3);
> +  
> +  private int id;
> +
> +  private FieldSelectorResult(int id)
> +  {
> +    this.id = id;
> +  }
> +
> +  public boolean equals(Object o) {
> +    if (this == o) return true;
> +    if (o == null || getClass() != o.getClass()) return false;
> +
> +    final FieldSelectorResult that = (FieldSelectorResult) o;
> +
> +    if (id != that.id) return false;
> +
> +    return true;
> +  }
> +
> +  public int hashCode() {
> +    return id;
> +  }
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
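
A note on the "Replace with an enumerated type in 1.5" comment above: the following is a minimal sketch of what the enum form could look like once Java 5 is available, and of how a caller might act on each constant. The names SelectorResultSketch, Result, Selector and plan are hypothetical stand-ins, not part of this commit, and the reading of LOAD_AND_BREAK (load the field, then stop) is an assumption based on the LoadFirstFieldSelector javadoc rather than on the FieldsReader code.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectorResultSketch {

  /** Hypothetical enum counterpart of the FieldSelectorResult constants. */
  public enum Result { LOAD, LAZY_LOAD, NO_LOAD, LOAD_AND_BREAK }

  /** Hypothetical stand-in for the FieldSelector interface. */
  public interface Selector {
    Result accept(String fieldName);
  }

  /**
   * Walks the field names in order and records the action taken for each.
   * Assumed semantics: NO_LOAD skips the field, LOAD_AND_BREAK loads it and
   * stops looking at any further fields.
   */
  public static List<String> plan(String[] fieldNames, Selector selector) {
    List<String> actions = new ArrayList<String>();
    for (String name : fieldNames) {
      switch (selector.accept(name)) {
        case LOAD:
          actions.add(name + ":load");
          break;
        case LAZY_LOAD:
          actions.add(name + ":lazy");
          break;
        case LOAD_AND_BREAK:
          actions.add(name + ":load");
          return actions; // stop after this field
        default:
          break; // NO_LOAD: nothing recorded
      }
    }
    return actions;
  }

  /** Runs plan() over a fixed list of made-up field names. */
  public static List<String> demo() {
    Selector selector = new Selector() {
      public Result accept(String fieldName) {
        if (fieldName.equals("body")) return Result.LAZY_LOAD;
        if (fieldName.equals("notes")) return Result.NO_LOAD;
        if (fieldName.equals("title")) return Result.LOAD_AND_BREAK;
        return Result.LOAD;
      }
    };
    return plan(new String[] {"id", "body", "notes", "title", "extra"}, selector);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [id:load, body:lazy, title:load]
  }
}
```

With an enum, the hand-written equals() and hashCode() in the committed class would no longer be needed, since each constant is a singleton.
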
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,137 @@
> +package org.apache.lucene.document;
> +
> +/**
> + * Copyright 2004 The Apache Software Foundation
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +import java.io.Reader;
> +import java.io.Serializable;
> +
> +/**
> + * Synonymous with {@link Field}.
> + *
> + **/
> +public interface Fieldable extends Serializable {
> +  /** Sets the boost factor for hits on this field.  This value will be
> +   * multiplied into the score of all hits on this field of this
> +   * document.
> +   *
> +   * <p>The boost is multiplied by {@link org.apache.lucene.document.Document#getBoost()} of the document
> +   * containing this field.  If a document has multiple fields with the same
> +   * name, all such values are multiplied together.  This product is then
> +   * multiplied by the value {@link org.apache.lucene.search.Similarity#lengthNorm(String,int)}, and
> +   * rounded by {@link org.apache.lucene.search.Similarity#encodeNorm(float)} before it is stored in the
> +   * index.  One should attempt to ensure that this product does not overflow
> +   * the range of that encoding.
> +   *
> +   * @see org.apache.lucene.document.Document#setBoost(float)
> +   * @see org.apache.lucene.search.Similarity#lengthNorm(String, int)
> +   * @see org.apache.lucene.search.Similarity#encodeNorm(float)
> +   */
> +  void setBoost(float boost);
> +
> +  /** Returns the boost factor for hits for this field.
> +   *
> +   * <p>The default value is 1.0.
> +   *
> +   * <p>Note: this value is not stored directly with the document in the index.
> +   * Documents returned from {@link org.apache.lucene.index.IndexReader#document(int)} and
> +   * {@link org.apache.lucene.search.Hits#doc(int)} may thus not have the same value present as when
> +   * this field was indexed.
> +   *
> +   * @see #setBoost(float)
> +   */
> +  float getBoost();
> +
> +  /** Returns the name of the field as an interned string.
> +   * For example "date", "title", "body", ...
> +   */
> +  String name();
> +
> +  /** The value of the field as a String, or null.  If null, the Reader value
> +   * or binary value is used.  Exactly one of stringValue(), readerValue(), and
> +   * binaryValue() must be set. */
> +  String stringValue();
> +
> +  /** The value of the field as a Reader, or null.  If null, the String value
> +   * or binary value is  used.  Exactly one of stringValue(), readerValue(),
> +   * and binaryValue() must be set. */
> +  Reader readerValue();
> +
> +  /** The value of the field in Binary, or null.  If null, the Reader or
> +   * String value is used.  Exactly one of stringValue(), readerValue() and
> +   * binaryValue() must be set. */
> +  byte[] binaryValue();
> +
> +  /** True iff the value of the field is to be stored in the index for return
> +    with search hits.  It is an error for this to be true if a field is
> +    Reader-valued. */
> +  boolean  isStored();
> +
> +  /** True iff the value of the field is to be indexed, so that it may be
> +    searched on. */
> +  boolean  isIndexed();
> +
> +  /** True iff the value of the field should be tokenized as text prior to
> +    indexing.  Un-tokenized fields are indexed as a single word and may not be
> +    Reader-valued. */
> +  boolean  isTokenized();
> +
> +  /** True if the value of the field is stored and compressed within the index */
> +  boolean  isCompressed();
> +
> +  /** True iff the term or terms used to index this field are stored as a term
> +   *  vector, available from {@link org.apache.lucene.index.IndexReader#getTermFreqVector(int,String)}.
> +   *  These methods do not provide access to the original content of the field,
> +   *  only to terms used to index it. If the original content must be
> +   *  preserved, use the <code>stored</code> attribute instead.
> +   *
> +   * @see org.apache.lucene.index.IndexReader#getTermFreqVector(int, String)
> +   */
> +  boolean isTermVectorStored();
> +
> +  /**
> +   * True iff terms are stored as term vector together with their offsets 
> +   * (start and end position in source text).
> +   */
> +  boolean isStoreOffsetWithTermVector();
> +
> +  /**
> +   * True iff terms are stored as term vector together with their token positions.
> +   */
> +  boolean isStorePositionWithTermVector();
> +
> +  /** True iff the value of the field is stored as binary */
> +  boolean  isBinary();
> +
> +  /** True if norms are omitted for this indexed field */
> +  boolean getOmitNorms();
> +
> +  /** Expert:
> +   *
> +   * If set, omit normalization factors associated with this indexed field.
> +   * This effectively disables indexing boosts and length normalization for this field.
> +   */
> +  void setOmitNorms(boolean omitNorms);
> +
> +  /**
> +   * Indicates whether a Field is Lazy or not.  The semantics of Lazy loading are such that if a Field is lazily loaded, retrieving
> +   * its values via {@link #stringValue()} or {@link #binaryValue()} is only valid as long as the {@link org.apache.lucene.index.IndexReader} that
> +   * retrieved the {@link Document} is still open.
> +   *  
> +   * @return true if this field can be loaded lazily
> +   */
> +  boolean isLazy();
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,22 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 15, 2006
> + * Time: 10:13:07 AM
> + * $Id:$
> + * Copyright 2005.  Center For Natural Language Processing
> + */
> +
> +
> +/**
> + * Load the First field and break.
> + * <p/>
> + * See {@link FieldSelectorResult#LOAD_AND_BREAK}
> + */
> +public class LoadFirstFieldSelector implements FieldSelector {
> +
> +  public FieldSelectorResult accept(String fieldName) {
> +    return FieldSelectorResult.LOAD_AND_BREAK;
> +  }
> +}
> \ No newline at end of file
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,57 @@
> +/*
> + * MapFieldSelector.java
> + *
> + * Created on May 2, 2006, 6:49 PM
> + *
> + */
> +
> +package org.apache.lucene.document;
> +
> +import java.util.HashMap;
> +import java.util.List;
> +import java.util.Map;
> +
> +/**
> + * A FieldSelector based on a Map of field names to FieldSelectorResults
> + *
> + * @author Chuck Williams
> + */
> +public class MapFieldSelector implements FieldSelector {
> +    
> +    Map fieldSelections;
> +    
> +    /** Create a MapFieldSelector
> +     * @param fieldSelections maps from field names to FieldSelectorResults
> +     */
> +    public MapFieldSelector(Map fieldSelections) {
> +        this.fieldSelections = fieldSelections;
> +    }
> +    
> +    /** Create a MapFieldSelector
> +     * @param fields fields to LOAD.  All other fields are NO_LOAD.
> +     */
> +    public MapFieldSelector(List fields) {
> +        fieldSelections = new HashMap(fields.size()*5/3);
> +        for (int i=0; i<fields.size(); i++)
> +            fieldSelections.put(fields.get(i), FieldSelectorResult.LOAD);
> +    }
> +    
> +    /** Create a MapFieldSelector
> +     * @param fields fields to LOAD.  All other fields are NO_LOAD.
> +     */
> +    public MapFieldSelector(String[] fields) {
> +        fieldSelections = new HashMap(fields.length*5/3);
> +        for (int i=0; i<fields.length; i++)
> +            fieldSelections.put(fields[i], FieldSelectorResult.LOAD);
> +    }
> +    
> +    /** Load field according to its associated value in fieldSelections
> +     * @param field a field name
> +     * @return the fieldSelections value that field maps to or NO_LOAD if none.
> +     */
> +    public FieldSelectorResult accept(String field) {
> +        FieldSelectorResult selection = (FieldSelectorResult) fieldSelections.get(field);
> +        return selection!=null ? selection : FieldSelectorResult.NO_LOAD;
> +    }
> +    
> +}
>
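
To make the MapFieldSelector defaults concrete, here is a small stand-alone sketch of the same lookup: names passed in as an array map to LOAD, and anything not in the map falls back to NO_LOAD. MapSelectorSketch and its nested Result enum are hypothetical stand-ins so that the example runs without Lucene on the classpath; they are not the committed classes.

```java
import java.util.HashMap;
import java.util.Map;

public class MapSelectorSketch {

  /** Hypothetical stand-in for FieldSelectorResult. */
  public enum Result { LOAD, LAZY_LOAD, NO_LOAD }

  private final Map<String, Result> fieldSelections;

  /** Explicit mapping from field names to results. */
  public MapSelectorSketch(Map<String, Result> fieldSelections) {
    this.fieldSelections = fieldSelections;
  }

  /** Every listed field is LOAD; all other fields are NO_LOAD. */
  public MapSelectorSketch(String[] fields) {
    fieldSelections = new HashMap<String, Result>();
    for (int i = 0; i < fields.length; i++) {
      fieldSelections.put(fields[i], Result.LOAD);
    }
  }

  /** Returns the mapped result, or NO_LOAD when the name is unknown. */
  public Result accept(String field) {
    Result selection = fieldSelections.get(field);
    return selection != null ? selection : Result.NO_LOAD;
  }

  public static void main(String[] args) {
    MapSelectorSketch byName = new MapSelectorSketch(new String[] {"title", "date"});
    System.out.println(byName.accept("title")); // prints LOAD
    System.out.println(byName.accept("body"));  // prints NO_LOAD

    // The Map constructor allows mixing eager and lazy loading.
    Map<String, Result> mixed = new HashMap<String, Result>();
    mixed.put("title", Result.LOAD);
    mixed.put("body", Result.LAZY_LOAD);
    System.out.println(new MapSelectorSketch(mixed).accept("body")); // prints LAZY_LOAD
  }
}
```
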
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,53 @@
> +package org.apache.lucene.document;
> +
> +import java.util.Set;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 6:53:07 PM
> + * $Id:$
> + * Copyright 2005.  Center For Natural Language Processing
> + */
> +
> +/**
> + * Declare what fields to load normally and what fields to load lazily
> + *
> + **/
> +public class SetBasedFieldSelector implements FieldSelector {
> +  
> +  private Set fieldsToLoad;
> +  private Set lazyFieldsToLoad;
> +  
> +  
> +
> +  /**
> +   * Pass in the Set of {@link Field} names to load and the Set of {@link Field} names to load lazily.  If both are empty, the
> +   * Document will not have any {@link Field} on it.  
> +   * @param fieldsToLoad A Set of {@link String} field names to load.  May be empty, but not null
> +   * @param lazyFieldsToLoad A Set of {@link String} field names to load lazily.  May be empty, but not null  
> +   */
> +  public SetBasedFieldSelector(Set fieldsToLoad, Set lazyFieldsToLoad) {
> +    this.fieldsToLoad = fieldsToLoad;
> +    this.lazyFieldsToLoad = lazyFieldsToLoad;
> +  }
> +
> +  /**
> +   * Indicate whether to load the field with the given name or not. If the {@link Field#name()} is not in either of the 
> +   * initializing Sets, then {@link org.apache.lucene.document.FieldSelectorResult#NO_LOAD} is returned.  If a Field name
> +   * is in both <code>fieldsToLoad</code> and <code>lazyFieldsToLoad</code>, lazy has precedence.
> +   * 
> +   * @param fieldName The {@link Field} name to check
> +   * @return The {@link FieldSelectorResult}
> +   */
> +  public FieldSelectorResult accept(String fieldName) {
> +    FieldSelectorResult result = FieldSelectorResult.NO_LOAD;
> +    if (fieldsToLoad.contains(fieldName) == true){
> +      result = FieldSelectorResult.LOAD;
> +    }
> +    if (lazyFieldsToLoad.contains(fieldName) == true){
> +      result = FieldSelectorResult.LAZY_LOAD;
> +    }                                           
> +    return result;
> +  }
> +}
> \ No newline at end of file
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
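
The precedence rule in the SetBasedFieldSelector javadoc (lazy wins when a name is in both sets) follows from the order of the two contains() checks. A minimal stand-alone sketch, assuming the same two-check structure; SetSelectorSketch and Result are hypothetical names used only so the example compiles without Lucene:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SetSelectorSketch {

  /** Hypothetical stand-in for FieldSelectorResult. */
  public enum Result { LOAD, LAZY_LOAD, NO_LOAD }

  private final Set<String> fieldsToLoad;
  private final Set<String> lazyFieldsToLoad;

  /** Both sets may be empty, but a null set would fail in accept(). */
  public SetSelectorSketch(Set<String> fieldsToLoad, Set<String> lazyFieldsToLoad) {
    this.fieldsToLoad = fieldsToLoad;
    this.lazyFieldsToLoad = lazyFieldsToLoad;
  }

  /** The lazy check runs last, so it overrides LOAD for names in both sets. */
  public Result accept(String fieldName) {
    Result result = Result.NO_LOAD;
    if (fieldsToLoad.contains(fieldName)) {
      result = Result.LOAD;
    }
    if (lazyFieldsToLoad.contains(fieldName)) {
      result = Result.LAZY_LOAD;
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> eager = new HashSet<String>(Arrays.asList("title", "body"));
    Set<String> lazy = Collections.singleton("body");
    SetSelectorSketch selector = new SetSelectorSketch(eager, lazy);
    System.out.println(selector.accept("title")); // prints LOAD
    System.out.println(selector.accept("body"));  // prints LAZY_LOAD
    System.out.println(selector.accept("date"));  // prints NO_LOAD
  }
}
```

Passing null for either set would raise a NullPointerException on the first call to accept(), which is why the parameter descriptions say "may be empty, but not null".
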
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java Fri Jun  9 18:23:22 2006
> @@ -16,22 +16,22 @@
>   * limitations under the License.
>   */
>  
> +import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.analysis.Token;
> +import org.apache.lucene.analysis.TokenStream;
> +import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.Fieldable;
> +import org.apache.lucene.search.Similarity;
> +import org.apache.lucene.store.Directory;
> +import org.apache.lucene.store.IndexOutput;
> +
>  import java.io.IOException;
>  import java.io.PrintStream;
>  import java.io.Reader;
>  import java.io.StringReader;
> -import java.util.Hashtable;
> -import java.util.Enumeration;
>  import java.util.Arrays;
> -
> -import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -import org.apache.lucene.analysis.Analyzer;
> -import org.apache.lucene.analysis.TokenStream;
> -import org.apache.lucene.analysis.Token;
> -import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexOutput;
> -import org.apache.lucene.search.Similarity;
> +import java.util.Enumeration;
> +import java.util.Hashtable;
>  
>  final class DocumentWriter {
>    private Analyzer analyzer;
> @@ -129,7 +129,7 @@
>            throws IOException {
>      Enumeration fields = doc.fields();
>      while (fields.hasMoreElements()) {
> -      Field field = (Field) fields.nextElement();
> +      Fieldable field = (Fieldable) fields.nextElement();
>        String fieldName = field.name();
>        int fieldNumber = fieldInfos.fieldNumber(fieldName);
>  
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java Fri Jun  9 18:23:22 2006
> @@ -16,18 +16,17 @@
>   * limitations under the License.
>   */
>  
> -import java.util.*;
> -import java.io.IOException;
> -
>  import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -
> +import org.apache.lucene.document.Fieldable;
>  import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexOutput;
>  import org.apache.lucene.store.IndexInput;
> +import org.apache.lucene.store.IndexOutput;
> +
> +import java.io.IOException;
> +import java.util.*;
>  
> -/** Access to the Field Info file that describes document fields and whether or
> - *  not they are indexed. Each segment has a separate Field Info file. Objects
> +/** Access to the Fieldable Info file that describes document fields and whether or
> + *  not they are indexed. Each segment has a separate Fieldable Info file. Objects
>   *  of this class are thread-safe for multiple readers, but only one thread can
>   *  be adding documents at a time, with no other reader or writer threads
>   *  accessing this object.
> @@ -65,7 +64,7 @@
>    public void add(Document doc) {
>      Enumeration fields = doc.fields();
>      while (fields.hasMoreElements()) {
> -      Field field = (Field) fields.nextElement();
> +      Fieldable field = (Fieldable) fields.nextElement();
>        add(field.name(), field.isIndexed(), field.isTermVectorStored(), field.isStorePositionWithTermVector(),
>                field.isStoreOffsetWithTermVector(), field.getOmitNorms());
>      }
> @@ -105,7 +104,7 @@
>    /**
>     * Calls 5 parameter add with false for all TermVector parameters.
>     * 
> -   * @param name The name of the Field
> +   * @param name The name of the Fieldable
>     * @param isIndexed true if the field is indexed
>     * @see #add(String, boolean, boolean, boolean, boolean)
>     */
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java Fri Jun  9 18:23:22 2006
> @@ -0,0 +1,70 @@
> +package org.apache.lucene.index;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Jan 12, 2006
> + * Time: 9:37:43 AM
> + * $Id:$
> + * Copyright 2005.  Center For Natural Language Processing
> + */
> +
> +/**
> + *
> + *
> + **/
> +public class FieldReaderException extends RuntimeException{
> +  /**
> +   * Constructs a new runtime exception with <code>null</code> as its
> +   * detail message.  The cause is not initialized, and may subsequently be
> +   * initialized by a call to {@link #initCause}.
> +   */
> +  public FieldReaderException() {
> +  }
> +
> +  /**
> +   * Constructs a new runtime exception with the specified cause and a
> +   * detail message of <tt>(cause==null ? null : cause.toString())</tt>
> +   * (which typically contains the class and detail message of
> +   * <tt>cause</tt>).  This constructor is useful for runtime exceptions
> +   * that are little more than wrappers for other throwables.
> +   *
> +   * @param cause the cause (which is saved for later retrieval by the
> +   *              {@link #getCause()} method).  (A <tt>null</tt> value is
> +   *              permitted, and indicates that the cause is nonexistent or
> +   *              unknown.)
> +   * @since 1.4
> +   */
> +  public FieldReaderException(Throwable cause) {
> +    super(cause);
> +  }
> +
> +  /**
> +   * Constructs a new runtime exception with the specified detail message.
> +   * The cause is not initialized, and may subsequently be initialized by a
> +   * call to {@link #initCause}.
> +   *
> +   * @param message the detail message. The detail message is saved for
> +   *                later retrieval by the {@link #getMessage()} method.
> +   */
> +  public FieldReaderException(String message) {
> +    super(message);
> +  }
> +
> +  /**
> +   * Constructs a new runtime exception with the specified detail message and
> +   * cause.  <p>Note that the detail message associated with
> +   * <code>cause</code> is <i>not</i> automatically incorporated in
> +   * this runtime exception's detail message.
> +   *
> +   * @param message the detail message (which is saved for later retrieval
> +   *                by the {@link #getMessage()} method).
> +   * @param cause   the cause (which is saved for later retrieval by the
> +   *                {@link #getCause()} method).  (A <tt>null</tt> value is
> +   *                permitted, and indicates that the cause is nonexistent or
> +   *                unknown.)
> +   * @since 1.4
> +   */
> +  public FieldReaderException(String message, Throwable cause) {
> +    super(message, cause);
> +  }
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java
> ------------------------------------------------------------------------------
>     svn:executable = *
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java Fri Jun  9 18:23:22 2006
> @@ -16,19 +16,19 @@
>   * limitations under the License.
>   */
>  
> +import org.apache.lucene.document.*;
> +import org.apache.lucene.store.Directory;
> +import org.apache.lucene.store.IndexInput;
> +
>  import java.io.ByteArrayOutputStream;
>  import java.io.IOException;
> +import java.io.Reader;
>  import java.util.zip.DataFormatException;
>  import java.util.zip.Inflater;
>  
> -import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexInput;
> -
>  /**
>   * Class responsible for access to stored document fields.
> - *
> + * <p/>
>   * It uses &lt;segment&gt;.fdt and &lt;segment&gt;.fdx; files.
>   *
>   * @version $Id$
> @@ -39,25 +39,37 @@
>    private IndexInput indexStream;
>    private int size;
>  
> +  private static ThreadLocal fieldsStreamTL = new ThreadLocal();
> +
>    FieldsReader(Directory d, String segment, FieldInfos fn) throws IOException {
>      fieldInfos = fn;
>  
>      fieldsStream = d.openInput(segment + ".fdt");
>      indexStream = d.openInput(segment + ".fdx");
> -
> -    size = (int)(indexStream.length() / 8);
> +    size = (int) (indexStream.length() / 8);
>    }
>  
> +  /**
> +   * Closes the underlying {@link org.apache.lucene.store.IndexInput} streams, including any associated with a
> +   * lazy implementation of a Field.  This means that the Field's values will not be accessible.
> +   *
> +   * @throws IOException
> +   */
>    final void close() throws IOException {
>      fieldsStream.close();
>      indexStream.close();
> +    IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> +    if (localFieldsStream != null) {
> +      localFieldsStream.close();
> +      fieldsStreamTL.set(null);
> +    }
>    }
>  
>    final int size() {
>      return size;
>    }
>  
> -  final Document doc(int n) throws IOException {
> +  final Document doc(int n, FieldSelector fieldSelector) throws IOException {
>      indexStream.seek(n * 8L);
>      long position = indexStream.readLong();
>      fieldsStream.seek(position);
> @@ -67,89 +79,277 @@
>      for (int i = 0; i < numFields; i++) {
>        int fieldNumber = fieldsStream.readVInt();
>        FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
> -
> -      byte bits = fieldsStream.readByte();
> +      FieldSelectorResult acceptField = fieldSelector == null ? FieldSelectorResult.LOAD : fieldSelector.accept(fi.name);
> +      boolean lazy = acceptField.equals(FieldSelectorResult.LAZY_LOAD) == true;
>        
> +      byte bits = fieldsStream.readByte();
>        boolean compressed = (bits & FieldsWriter.FIELD_IS_COMPRESSED) != 0;
>        boolean tokenize = (bits & FieldsWriter.FIELD_IS_TOKENIZED) != 0;
> -      
> -      if ((bits & FieldsWriter.FIELD_IS_BINARY) != 0) {
> -        final byte[] b = new byte[fieldsStream.readVInt()];
> -        fieldsStream.readBytes(b, 0, b.length);
> -        if (compressed)
> -          doc.add(new Field(fi.name, uncompress(b), Field.Store.COMPRESS));
> -        else
> -          doc.add(new Field(fi.name, b, Field.Store.YES));
> +      boolean binary = (bits & FieldsWriter.FIELD_IS_BINARY) != 0;
> +      if (acceptField.equals(FieldSelectorResult.LOAD) == true) {
> +        addField(doc, fi, binary, compressed, tokenize);
>        }
> +      else if (acceptField.equals(FieldSelectorResult.LOAD_AND_BREAK) == true){
> +        addField(doc, fi, binary, compressed, tokenize);
> +        break;//Get out of this loop
> +      }
> +      else if (lazy == true){
> +        addFieldLazy(doc, fi, binary, compressed, tokenize);
> +      }       
>        else {
> -        Field.Index index;
> -        Field.Store store = Field.Store.YES;
> -        
> -        if (fi.isIndexed && tokenize)
> -          index = Field.Index.TOKENIZED;
> -        else if (fi.isIndexed && !tokenize)
> -          index = Field.Index.UN_TOKENIZED;
> -        else
> -          index = Field.Index.NO;
> -        
> -        Field.TermVector termVector = null;
> -        if (fi.storeTermVector) {
> -          if (fi.storeOffsetWithTermVector) {
> -            if (fi.storePositionWithTermVector) {
> -              termVector = Field.TermVector.WITH_POSITIONS_OFFSETS;
> -            }
> -            else {
> -              termVector = Field.TermVector.WITH_OFFSETS;
> -            }
> -          }
> -          else if (fi.storePositionWithTermVector) {
> -            termVector = Field.TermVector.WITH_POSITIONS;
> -          }
> -          else {
> -            termVector = Field.TermVector.YES;
> -          }
> -        }
> -        else {
> -          termVector = Field.TermVector.NO;
> -        }
> -        
> -        if (compressed) {
> -          store = Field.Store.COMPRESS;
> -          final byte[] b = new byte[fieldsStream.readVInt()];
> -          fieldsStream.readBytes(b, 0, b.length);
> -          Field f = new Field(fi.name,      // field name
> -              new String(uncompress(b), "UTF-8"), // uncompress the value and add as string
> -              store,
> -              index,
> -              termVector);
> -          f.setOmitNorms(fi.omitNorms);
> -          doc.add(f);
> -        }
> -        else {
> -          Field f = new Field(fi.name,     // name
> +        skipField(binary, compressed);
> +      }
> +    }
> +
> +    return doc;
> +  }
> +
> +  /**
> +   * Skip the field.  We still have to read some of the information about the field, but can skip past the actual content.
> +   * This will have the most payoff on large fields.
> +   */
> +  private void skipField(boolean binary, boolean compressed) throws IOException {
> +
> +    int toRead = fieldsStream.readVInt();
> +
> +    if (binary || compressed) {
> +      long pointer = fieldsStream.getFilePointer();
> +      fieldsStream.seek(pointer + toRead);
> +    } else {
> +      //We need to skip chars.  This will slow us down, but still better
> +      fieldsStream.skipChars(toRead);
> +    }
> +  }
> +
> +  private void addFieldLazy(Document doc, FieldInfo fi, boolean binary, boolean compressed, boolean tokenize) throws IOException {
> +    if (binary == true) {
> +      int toRead = fieldsStream.readVInt();
> +      long pointer = fieldsStream.getFilePointer();
> +      if (compressed) {
> +        //was: doc.add(new Fieldable(fi.name, uncompress(b), Fieldable.Store.COMPRESS));
> +        doc.add(new LazyField(fi.name, Field.Store.COMPRESS, toRead, pointer));
> +      } else {
> +        //was: doc.add(new Fieldable(fi.name, b, Fieldable.Store.YES));
> +        doc.add(new LazyField(fi.name, Field.Store.YES, toRead, pointer));
> +      }
> +      //Need to move the pointer ahead by toRead positions
> +      fieldsStream.seek(pointer + toRead);
> +    } else {
> +      Field.Store store = Field.Store.YES;
> +      Field.Index index = getIndexType(fi, tokenize);
> +      Field.TermVector termVector = getTermVectorType(fi);
> +
> +      Fieldable f;
> +      if (compressed) {
> +        store = Field.Store.COMPRESS;
> +        int toRead = fieldsStream.readVInt();
> +        long pointer = fieldsStream.getFilePointer();
> +        f = new LazyField(fi.name, store, toRead, pointer);
> +        //skip over the part that we aren't loading
> +        fieldsStream.seek(pointer + toRead);
> +        f.setOmitNorms(fi.omitNorms);
> +      } else {
> +        int length = fieldsStream.readVInt();
> +        long pointer = fieldsStream.getFilePointer();
> +        //Skip ahead of where we are by the length of what is stored
> +        fieldsStream.skipChars(length);
> +        f = new LazyField(fi.name, store, index, termVector, length, pointer);
> +        f.setOmitNorms(fi.omitNorms);
> +      }
> +      doc.add(f);
> +    }
> +
> +  }
> +
> +  private void addField(Document doc, FieldInfo fi, boolean binary, boolean compressed, boolean tokenize) throws IOException {
> +
> +    //we have a binary stored field, and it may be compressed
> +    if (binary) {
> +      int toRead = fieldsStream.readVInt();
> +      final byte[] b = new byte[toRead];
> +      fieldsStream.readBytes(b, 0, b.length);
> +      if (compressed)
> +        doc.add(new Field(fi.name, uncompress(b), Field.Store.COMPRESS));
> +      else
> +        doc.add(new Field(fi.name, b, Field.Store.YES));
> +
> +    } else {
> +      Field.Store store = Field.Store.YES;
> +      Field.Index index = getIndexType(fi, tokenize);
> +      Field.TermVector termVector = getTermVectorType(fi);
> +
> +      Fieldable f;
> +      if (compressed) {
> +        store = Field.Store.COMPRESS;
> +        int toRead = fieldsStream.readVInt();
> +
> +        final byte[] b = new byte[toRead];
> +        fieldsStream.readBytes(b, 0, b.length);
> +        f = new Field(fi.name,      // field name
> +                new String(uncompress(b), "UTF-8"), // uncompress the value and add as string
> +                store,
> +                index,
> +                termVector);
> +        f.setOmitNorms(fi.omitNorms);
> +      } else {
> +        f = new Field(fi.name,     // name
>                  fieldsStream.readString(), // read value
>                  store,
>                  index,
>                  termVector);
> -          f.setOmitNorms(fi.omitNorms);
> -          doc.add(f);
> +        f.setOmitNorms(fi.omitNorms);
> +      }
> +      doc.add(f);
> +    }
> +  }
> +
> +  private Field.TermVector getTermVectorType(FieldInfo fi) {
> +    Field.TermVector termVector = null;
> +    if (fi.storeTermVector) {
> +      if (fi.storeOffsetWithTermVector) {
> +        if (fi.storePositionWithTermVector) {
> +          termVector = Field.TermVector.WITH_POSITIONS_OFFSETS;
> +        } else {
> +          termVector = Field.TermVector.WITH_OFFSETS;
>          }
> +      } else if (fi.storePositionWithTermVector) {
> +        termVector = Field.TermVector.WITH_POSITIONS;
> +      } else {
> +        termVector = Field.TermVector.YES;
>        }
> +    } else {
> +      termVector = Field.TermVector.NO;
>      }
> +    return termVector;
> +  }
>  
> -    return doc;
> +  private Field.Index getIndexType(FieldInfo fi, boolean tokenize) {
> +    Field.Index index;
> +    if (fi.isIndexed && tokenize)
> +      index = Field.Index.TOKENIZED;
> +    else if (fi.isIndexed && !tokenize)
> +      index = Field.Index.UN_TOKENIZED;
> +    else
> +      index = Field.Index.NO;
> +    return index;
>    }
> -  
> +
> +  /**
> +   * A Lazy implementation of Fieldable that defers loading of fields until asked for, instead of when the Document is
> +   * loaded.
> +   */
> +  private class LazyField extends AbstractField implements Fieldable {
> +    private int toRead;
> +    private long pointer;
> +    //internal buffer
> +    private char[] chars;
> +
> +
> +    public LazyField(String name, Field.Store store, int toRead, long pointer) {
> +      super(name, store, Field.Index.NO, Field.TermVector.NO);
> +      this.toRead = toRead;
> +      this.pointer = pointer;
> +      lazy = true;
> +    }
> +
> +    public LazyField(String name, Field.Store store, Field.Index index, Field.TermVector termVector, int toRead, long pointer) {
> +      super(name, store, index, termVector);
> +      this.toRead = toRead;
> +      this.pointer = pointer;
> +      lazy = true;
> +    }
> +
> +    /**
> +     * The value of the field in Binary, or null.  If null, the Reader or
> +     * String value is used.  Exactly one of stringValue(), readerValue() and
> +     * binaryValue() must be set.
> +     */
> +    public byte[] binaryValue() {
> +      if (fieldsData == null) {
> +        final byte[] b = new byte[toRead];
> +        IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> +        if (localFieldsStream == null) {
> +          localFieldsStream = (IndexInput) fieldsStream.clone();
> +          fieldsStreamTL.set(localFieldsStream);
> +        }
> +        //Throw this IO Exception since IndexReader.document does so anyway, so probably not that big of a change for people
> +        //since they are already handling this exception when getting the document
> +        try {
> +          localFieldsStream.seek(pointer);
> +          localFieldsStream.readBytes(b, 0, b.length);
> +          if (isCompressed == true) {
> +            fieldsData = uncompress(b);
> +          } else {
> +            fieldsData = b;
> +          }
> +        } catch (IOException e) {
> +          throw new FieldReaderException(e);
> +        }
> +      }
> +      return fieldsData instanceof byte[] ? (byte[]) fieldsData : null;
> +    }
> +
> +    /**
> +     * The value of the field as a Reader, or null.  If null, the String value
> +     * or binary value is  used.  Exactly one of stringValue(), readerValue(),
> +     * and binaryValue() must be set.
> +     */
> +    public Reader readerValue() {
> +      return fieldsData instanceof Reader ? (Reader) fieldsData : null;
> +    }
> +
> +    /**
> +     * The value of the field as a String, or null.  If null, the Reader value
> +     * or binary value is used.  Exactly one of stringValue(), readerValue(), and
> +     * binaryValue() must be set.
> +     */
> +    public String stringValue() {
> +      if (fieldsData == null) {
> +        IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> +        if (localFieldsStream == null) {
> +          localFieldsStream = (IndexInput) fieldsStream.clone();
> +          fieldsStreamTL.set(localFieldsStream);
> +        }
> +        try {
> +          localFieldsStream.seek(pointer);
> +          //read in chars b/c we already know the length we need to read
> +          if (chars == null || toRead > chars.length)
> +            chars = new char[toRead];
> +          localFieldsStream.readChars(chars, 0, toRead);
> +          fieldsData = new String(chars, 0, toRead);//fieldsStream.readString();
> +        } catch (IOException e) {
> +          throw new FieldReaderException(e);
> +        }
> +      }
> +      return fieldsData instanceof String ? (String) fieldsData : null;
> +    }
> +
> +    public long getPointer() {
> +      return pointer;
> +    }
> +
> +    public void setPointer(long pointer) {
> +      this.pointer = pointer;
> +    }
> +
> +    public int getToRead() {
> +      return toRead;
> +    }
> +
> +    public void setToRead(int toRead) {
> +      this.toRead = toRead;
> +    }
> +  }
> +
>    private final byte[] uncompress(final byte[] input)
> -    throws IOException
> -  {
> -  
> +          throws IOException {
> +
>      Inflater decompressor = new Inflater();
>      decompressor.setInput(input);
> -  
> +
>      // Create an expandable byte array to hold the decompressed data
>      ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
> -  
> +
>      // Decompress the data
>      byte[] buf = new byte[1024];
>      while (!decompressor.finished()) {
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java Fri Jun  9 18:23:22 2006
> @@ -17,6 +17,8 @@
>   */
>  
>  import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.FieldSelector;
> +
>  
>  import java.io.IOException;
>  import java.util.Collection;
> @@ -100,7 +102,7 @@
>    public int numDocs() { return in.numDocs(); }
>    public int maxDoc() { return in.maxDoc(); }
>  
> -  public Document document(int n) throws IOException { return in.document(n); }
> +  public Document document(int n, FieldSelector fieldSelector) throws IOException { return in.document(n, fieldSelector); }
>  
>    public boolean isDeleted(int n) { return in.isDeleted(n); }
>    public boolean hasDeletions() { return in.hasDeletions(); }
> @@ -133,7 +135,7 @@
>    protected void doCommit() throws IOException { in.commit(); }
>    protected void doClose() throws IOException { in.close(); }
>  
> -  
> +
>    public Collection getFieldNames(IndexReader.FieldOption fieldNames) {
>      return in.getFieldNames(fieldNames);
>    }
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java Fri Jun  9 18:23:22 2006
> @@ -273,7 +273,7 @@
>      }
>    }
>  
> -  
> +
>    /**
>     * Returns the number of documents currently in this index.
>     * @see IndexWriter#docCount()
> @@ -407,7 +407,7 @@
>     * the number of files open in a FSDirectory.
>     *
>     * <p>The default value is 10.
> -   * 
> +   *
>     * @see IndexWriter#setMaxBufferedDocs(int)
>     * @throws IllegalStateException if the index is closed
>     * @throws IllegalArgumentException if maxBufferedDocs is smaller than 2
> @@ -500,8 +500,8 @@
>      // create an index in /tmp/index, overwriting an existing one:
>      IndexModifier indexModifier = new IndexModifier("/tmp/index", analyzer, true);
>      Document doc = new Document();
> -    doc.add(new Field("id", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));
> -    doc.add(new Field("body", "a simple test", Field.Store.YES, Field.Index.TOKENIZED));
> +    doc.add(new Fieldable("id", "1", Fieldable.Store.YES, Fieldable.Index.UN_TOKENIZED));
> +    doc.add(new Fieldable("body", "a simple test", Fieldable.Store.YES, Fieldable.Index.TOKENIZED));
>      indexModifier.addDocument(doc);
>      int deleted = indexModifier.delete(new Term("id", "1"));
>      System.out.println("Deleted " + deleted + " document");
>
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>   

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 



