Oops, will take care of it today! It wouldn't be right if my first real
commit went through with no problems! :-)
Otis Gospodnetic wrote:
> Nice and meaty :)
> I saw CNLP copyright notices in javadocs. I think those are from your IntelliJ templates and have to be removed or some peeps will get mad at us...
>
> Otis
>
> ----- Original Message ----
> From: gsingers@apache.org
> To: java-commits@lucene.apache.org
> Sent: Friday, June 9, 2006 9:23:24 PM
> Subject: svn commit: r413201 [1/2] - in /lucene/java/trunk: ./ contrib/memory/src/java/org/apache/lucene/index/memory/ contrib/swing/src/java/org/apache/lucene/swing/models/ src/java/org/apache/lucene/analysis/ src/java/org/apache/lucene/document/ src/java/org/...
>
> Author: gsingers
> Date: Fri Jun 9 18:23:22 2006
> New Revision: 413201
>
> URL: http://svn.apache.org/viewvc?rev=413201&view=rev
> Log:
> Implementation of Issue 545. Introduction of new Fieldable interface (extracted from Field) which is now used where Field used to be used. Field now implements Fieldable.
> Added new method to IndexReader and derived classes for working with the new FieldSelector interface. The FieldSelector interface defines a mechanism for doing lazy loading, amongst other things. Implemented Lazy loading of fields in the FieldsReader class. Added test case to TestFieldsReader.java
>
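For anyone skimming the log message above, the lazy-loading idea can be sketched roughly as follows. This is a hypothetical, self-contained illustration and not the code in this commit: LazyFieldSketch, Choice, Selector, StoredValue, load and the reads counter are made-up names, and the real FieldSelector, FieldSelectorResult and FieldsReader signatures are in the files listed below. The sketch only shows the general pattern: a selector decides per field name whether a stored value is read immediately, deferred until first access, or skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration of selector-driven lazy loading; NOT the Lucene API. */
public class LazyFieldSketch {

    /** Illustrative outcomes a selector can return for a field name. */
    public enum Choice { LOAD, LAZY_LOAD, SKIP }

    /** Decides, per field name, how a stored field should be loaded. */
    public interface Selector {
        Choice accept(String fieldName);
    }

    /** A stored value that is read either up front or on first access. */
    public static final class StoredValue {
        private final Supplier<String> loader;
        private String value;
        private boolean loaded;

        StoredValue(Supplier<String> loader, boolean eager) {
            this.loader = loader;
            if (eager) {
                get(); // eager fields are read immediately
            }
        }

        public boolean isLoaded() { return loaded; }

        public String get() {
            if (!loaded) {
                value = loader.get(); // deferred read happens here, once
                loaded = true;
            }
            return value;
        }
    }

    /** Counts simulated reads from the backing store so laziness is observable. */
    public static int reads = 0;

    /** Builds a "document" from stored values according to the selector. */
    public static Map<String, StoredValue> load(Map<String, String> stored, Selector selector) {
        Map<String, StoredValue> doc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            Choice choice = selector.accept(e.getKey());
            if (choice == Choice.SKIP) {
                continue; // field is never read and not added to the document
            }
            final String raw = e.getValue();
            Supplier<String> loader = () -> { reads++; return raw; };
            doc.put(e.getKey(), new StoredValue(loader, choice == Choice.LOAD));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("title", "short title");
        stored.put("body", "a very large body");
        stored.put("debug", "internal");

        Selector selector = name ->
                name.equals("title") ? Choice.LOAD
              : name.equals("body") ? Choice.LAZY_LOAD
              : Choice.SKIP;

        Map<String, StoredValue> doc = load(stored, selector);
        System.out.println("reads after load: " + reads);
        System.out.println("body = " + doc.get("body").get());
        System.out.println("reads after access: " + reads);
    }
}
```

The point of the pattern is that a caller who only needs a small field does not pay for reading a large one until it actually asks for its value.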
> Added:
> lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java
> lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java (with props)
> lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java (with props)
> Modified:
> lucene/java/trunk/CHANGES.txt
> lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java
> lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java
> lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java
> lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java
> lucene/java/trunk/src/java/org/apache/lucene/document/Document.java
> lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
> lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java
> lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java
> lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java
> lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/MultiReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/ParallelReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java
> lucene/java/trunk/src/java/org/apache/lucene/index/SegmentReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsReader.java
> lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsWriter.java
> lucene/java/trunk/src/java/org/apache/lucene/search/FieldCacheImpl.java
> lucene/java/trunk/src/java/org/apache/lucene/search/FieldDocSortedHitQueue.java
> lucene/java/trunk/src/java/org/apache/lucene/search/FieldSortedHitQueue.java
> lucene/java/trunk/src/java/org/apache/lucene/search/Similarity.java
> lucene/java/trunk/src/java/org/apache/lucene/search/Sort.java
> lucene/java/trunk/src/java/org/apache/lucene/search/SortComparatorSource.java
> lucene/java/trunk/src/java/org/apache/lucene/store/IndexInput.java
> lucene/java/trunk/src/test/org/apache/lucene/document/TestBinaryDocument.java
> lucene/java/trunk/src/test/org/apache/lucene/document/TestDocument.java
> lucene/java/trunk/src/test/org/apache/lucene/index/DocHelper.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestDocumentWriter.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestFieldsReader.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexInput.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexModifier.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestParallelReader.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestSegmentMerger.java
> lucene/java/trunk/src/test/org/apache/lucene/index/TestSegmentReader.java
> lucene/java/trunk/src/test/org/apache/lucene/search/TestDocBoost.java
> lucene/java/trunk/src/test/org/apache/lucene/search/TestMultiThreadTermVectors.java
> lucene/java/trunk/src/test/org/apache/lucene/search/TestPhraseQuery.java
> lucene/java/trunk/src/test/org/apache/lucene/search/TestSetNorm.java
>
> Modified: lucene/java/trunk/CHANGES.txt
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/CHANGES.txt?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/CHANGES.txt (original)
> +++ lucene/java/trunk/CHANGES.txt Fri Jun 9 18:23:22 2006
> @@ -9,6 +9,9 @@
> 1. LUCENE-503: New ThaiAnalyzer and ThaiWordFilter in contrib/analyzers
> (Samphan Raruenrom va Chris Hostetter)
>
> + 2. LUCENE-545: New FieldSelector API and associated changes to IndexReader and implementations.
> + New Fieldable interface for use with the lazy field loading mechanism. (Grant Ingersoll and Chuck Williams via Grant Ingersoll)
> +
> API Changes
>
> 1. LUCENE-438: Remove "final" from Token, implement Cloneable, allow
>
> Modified: lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java (original)
> +++ lucene/java/trunk/contrib/memory/src/java/org/apache/lucene/index/memory/MemoryIndex.java Fri Jun 9 18:23:22 2006
> @@ -16,20 +16,11 @@
> * limitations under the License.
> */
>
> -import java.io.IOException;
> -import java.io.Serializable;
> -import java.util.Arrays;
> -import java.util.Collection;
> -import java.util.Collections;
> -import java.util.Comparator;
> -import java.util.HashMap;
> -import java.util.Iterator;
> -import java.util.Map;
> -
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.Token;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.FieldSelector;
> import org.apache.lucene.index.IndexReader;
> import org.apache.lucene.index.Term;
> import org.apache.lucene.index.TermDocs;
> @@ -43,6 +34,16 @@
> import org.apache.lucene.search.Searcher;
> import org.apache.lucene.search.Similarity;
>
> +import java.io.IOException;
> +import java.io.Serializable;
> +import java.util.Arrays;
> +import java.util.Collection;
> +import java.util.Collections;
> +import java.util.Comparator;
> +import java.util.HashMap;
> +import java.util.Iterator;
> +import java.util.Map;
> +
> /**
> * High-performance single-document main memory Apache Lucene fulltext search index.
> *
> @@ -1004,8 +1005,14 @@
> if (DEBUG) System.err.println("MemoryIndexReader.document");
> return new Document(); // there are no stored fields
> }
> -
> - public boolean isDeleted(int n) {
> +
> + //When we convert to JDK 1.5 make this Set<String>
> + public Document document(int n, FieldSelector fieldSelector) throws IOException {
> + if (DEBUG) System.err.println("MemoryIndexReader.document");
> + return new Document(); // there are no stored fields
> + }
> +
> + public boolean isDeleted(int n) {
> if (DEBUG) System.err.println("MemoryIndexReader.isDeleted");
> return false;
> }
>
> Modified: lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java (original)
> +++ lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/ListSearcher.java Fri Jun 9 18:23:22 2006
> @@ -22,6 +22,7 @@
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> +import org.apache.lucene.document.Fieldable;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.Hits;
> @@ -190,7 +191,7 @@
> //tabble model row that we are mapping to
> for (int t=0; t<hits.length(); t++){
> Document document = hits.doc(t);
> - Field field = document.getField(ROW_NUMBER);
> + Fieldable field = document.getField(ROW_NUMBER);
> rowToModelIndex.add(new Integer(field.stringValue()));
> }
> } catch (Exception e){
>
> Modified: lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java (original)
> +++ lucene/java/trunk/contrib/swing/src/java/org/apache/lucene/swing/models/TableSearcher.java Fri Jun 9 18:23:22 2006
> @@ -16,26 +16,23 @@
> * limitations under the License.
> */
>
> -import org.apache.lucene.store.RAMDirectory;
> +import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.analysis.WhitespaceAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> -import org.apache.lucene.analysis.WhitespaceAnalyzer;
> -import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.document.Fieldable;
> import org.apache.lucene.index.IndexWriter;
> +import org.apache.lucene.queryParser.MultiFieldQueryParser;
> +import org.apache.lucene.search.Hits;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.Query;
> -import org.apache.lucene.search.Hits;
> -import org.apache.lucene.queryParser.MultiFieldQueryParser;
> -
> -import java.awt.*;
> -import java.awt.event.*;
> -import java.util.*;
> -import java.util.List;
> +import org.apache.lucene.store.RAMDirectory;
>
> -import javax.swing.*;
> import javax.swing.event.TableModelEvent;
> import javax.swing.event.TableModelListener;
> -import javax.swing.table.*;
> +import javax.swing.table.AbstractTableModel;
> +import javax.swing.table.TableModel;
> +import java.util.ArrayList;
>
>
> /**
> @@ -275,7 +272,7 @@
> //tabble model row that we are mapping to
> for (int t=0; t<hits.length(); t++){
> Document document = hits.doc(t);
> - Field field = document.getField(ROW_NUMBER);
> + Fieldable field = document.getField(ROW_NUMBER);
> rowToModelIndex.add(new Integer(field.stringValue()));
> }
> } catch (Exception e){
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/analysis/Analyzer.java Fri Jun 9 18:23:22 2006
> @@ -38,16 +38,16 @@
>
>
> /**
> - * Invoked before indexing a Field instance if
> + * Invoked before indexing a Fieldable instance if
> * terms have already been added to that field. This allows custom
> * analyzers to place an automatic position increment gap between
> - * Field instances using the same field name. The default value
> + * Fieldable instances using the same field name. The default value
> * position increment gap is 0. With a 0 position increment gap and
> * the typical default token position increment of 1, all terms in a field,
> - * including across Field instances, are in successive positions, allowing
> - * exact PhraseQuery matches, for instance, across Field instance boundaries.
> + * including across Fieldable instances, are in successive positions, allowing
> + * exact PhraseQuery matches, for instance, across Fieldable instance boundaries.
> *
> - * @param fieldName Field name being indexed.
> + * @param fieldName Fieldable name being indexed.
> * @return position increment gap, added to the next token emitted from {@link #tokenStream(String,Reader)}
> */
> public int getPositionIncrementGap(String fieldName)
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,274 @@
> +package org.apache.lucene.document;
> +/**
> + * Copyright 2006 The Apache Software Foundation
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +
> +/**
> + *
> + *
> + **/
> +public abstract class AbstractField implements Fieldable {
> +
> + protected String name = "body";
> + protected boolean storeTermVector = false;
> + protected boolean storeOffsetWithTermVector = false;
> + protected boolean storePositionWithTermVector = false;
> + protected boolean omitNorms = false;
> + protected boolean isStored = false;
> + protected boolean isIndexed = true;
> + protected boolean isTokenized = true;
> + protected boolean isBinary = false;
> + protected boolean isCompressed = false;
> + protected boolean lazy = false;
> + protected float boost = 1.0f;
> + // the one and only data object for all different kind of field values
> + protected Object fieldsData = null;
> +
> + protected AbstractField()
> + {
> +
> + }
> +
> + protected AbstractField(String name, Field.Store store, Field.Index index, Field.TermVector termVector) {
> + if (name == null)
> + throw new NullPointerException("name cannot be null");
> + this.name = name.intern(); // field names are interned
> +
> + if (store == Field.Store.YES){
> + this.isStored = true;
> + this.isCompressed = false;
> + }
> + else if (store == Field.Store.COMPRESS) {
> + this.isStored = true;
> + this.isCompressed = true;
> + }
> + else if (store == Field.Store.NO){
> + this.isStored = false;
> + this.isCompressed = false;
> + }
> + else
> + throw new IllegalArgumentException("unknown store parameter " + store);
> +
> + if (index == Field.Index.NO) {
> + this.isIndexed = false;
> + this.isTokenized = false;
> + } else if (index == Field.Index.TOKENIZED) {
> + this.isIndexed = true;
> + this.isTokenized = true;
> + } else if (index == Field.Index.UN_TOKENIZED) {
> + this.isIndexed = true;
> + this.isTokenized = false;
> + } else if (index == Field.Index.NO_NORMS) {
> + this.isIndexed = true;
> + this.isTokenized = false;
> + this.omitNorms = true;
> + } else {
> + throw new IllegalArgumentException("unknown index parameter " + index);
> + }
> +
> + this.isBinary = false;
> +
> + setStoreTermVector(termVector);
> + }
> +
> + /** Sets the boost factor hits on this field. This value will be
> + * multiplied into the score of all hits on this this field of this
> + * document.
> + *
> + * <p>The boost is multiplied by {@link org.apache.lucene.document.Document#getBoost()} of the document
> + * containing this field. If a document has multiple fields with the same
> + * name, all such values are multiplied together. This product is then
> + * multipled by the value {@link org.apache.lucene.search.Similarity#lengthNorm(String,int)}, and
> + * rounded by {@link org.apache.lucene.search.Similarity#encodeNorm(float)} before it is stored in the
> + * index. One should attempt to ensure that this product does not overflow
> + * the range of that encoding.
> + *
> + * @see org.apache.lucene.document.Document#setBoost(float)
> + * @see org.apache.lucene.search.Similarity#lengthNorm(String, int)
> + * @see org.apache.lucene.search.Similarity#encodeNorm(float)
> + */
> + public void setBoost(float boost) {
> + this.boost = boost;
> + }
> +
> + /** Returns the boost factor for hits for this field.
> + *
> + * <p>The default value is 1.0.
> + *
> + * <p>Note: this value is not stored directly with the document in the index.
> + * Documents returned from {@link org.apache.lucene.index.IndexReader#document(int)} and
> + * {@link org.apache.lucene.search.Hits#doc(int)} may thus not have the same value present as when
> + * this field was indexed.
> + *
> + * @see #setBoost(float)
> + */
> + public float getBoost() {
> + return boost;
> + }
> +
> + /** Returns the name of the field as an interned string.
> + * For example "date", "title", "body", ...
> + */
> + public String name() { return name; }
> +
> + protected void setStoreTermVector(Field.TermVector termVector) {
> + if (termVector == Field.TermVector.NO) {
> + this.storeTermVector = false;
> + this.storePositionWithTermVector = false;
> + this.storeOffsetWithTermVector = false;
> + }
> + else if (termVector == Field.TermVector.YES) {
> + this.storeTermVector = true;
> + this.storePositionWithTermVector = false;
> + this.storeOffsetWithTermVector = false;
> + }
> + else if (termVector == Field.TermVector.WITH_POSITIONS) {
> + this.storeTermVector = true;
> + this.storePositionWithTermVector = true;
> + this.storeOffsetWithTermVector = false;
> + }
> + else if (termVector == Field.TermVector.WITH_OFFSETS) {
> + this.storeTermVector = true;
> + this.storePositionWithTermVector = false;
> + this.storeOffsetWithTermVector = true;
> + }
> + else if (termVector == Field.TermVector.WITH_POSITIONS_OFFSETS) {
> + this.storeTermVector = true;
> + this.storePositionWithTermVector = true;
> + this.storeOffsetWithTermVector = true;
> + }
> + else {
> + throw new IllegalArgumentException("unknown termVector parameter " + termVector);
> + }
> + }
> +
> + /** True iff the value of the field is to be stored in the index for return
> + with search hits. It is an error for this to be true if a field is
> + Reader-valued. */
> + public final boolean isStored() { return isStored; }
> +
> + /** True iff the value of the field is to be indexed, so that it may be
> + searched on. */
> + public final boolean isIndexed() { return isIndexed; }
> +
> + /** True iff the value of the field should be tokenized as text prior to
> + indexing. Un-tokenized fields are indexed as a single word and may not be
> + Reader-valued. */
> + public final boolean isTokenized() { return isTokenized; }
> +
> + /** True if the value of the field is stored and compressed within the index */
> + public final boolean isCompressed() { return isCompressed; }
> +
> + /** True iff the term or terms used to index this field are stored as a term
> + * vector, available from {@link org.apache.lucene.index.IndexReader#getTermFreqVector(int,String)}.
> + * These methods do not provide access to the original content of the field,
> + * only to terms used to index it. If the original content must be
> + * preserved, use the <code>stored</code> attribute instead.
> + *
> + * @see org.apache.lucene.index.IndexReader#getTermFreqVector(int, String)
> + */
> + public final boolean isTermVectorStored() { return storeTermVector; }
> +
> + /**
> + * True iff terms are stored as term vector together with their offsets
> + * (start and end positon in source text).
> + */
> + public boolean isStoreOffsetWithTermVector(){
> + return storeOffsetWithTermVector;
> + }
> +
> + /**
> + * True iff terms are stored as term vector together with their token positions.
> + */
> + public boolean isStorePositionWithTermVector(){
> + return storePositionWithTermVector;
> + }
> +
> + /** True iff the value of the filed is stored as binary */
> + public final boolean isBinary() { return isBinary; }
> +
> + /** True if norms are omitted for this indexed field */
> + public boolean getOmitNorms() { return omitNorms; }
> +
> + /** Expert:
> + *
> + * If set, omit normalization factors associated with this indexed field.
> + * This effectively disables indexing boosts and length normalization for this field.
> + */
> + public void setOmitNorms(boolean omitNorms) { this.omitNorms=omitNorms; }
> +
> + public boolean isLazy() {
> + return lazy;
> + }
> +
> + /** Prints a Field for human consumption. */
> + public final String toString() {
> + StringBuffer result = new StringBuffer();
> + if (isStored) {
> + result.append("stored");
> + if (isCompressed)
> + result.append("/compressed");
> + else
> + result.append("/uncompressed");
> + }
> + if (isIndexed) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("indexed");
> + }
> + if (isTokenized) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("tokenized");
> + }
> + if (storeTermVector) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("termVector");
> + }
> + if (storeOffsetWithTermVector) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("termVectorOffsets");
> + }
> + if (storePositionWithTermVector) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("termVectorPosition");
> + }
> + if (isBinary) {
> + if (result.length() > 0)
> + result.append(",");
> + result.append("binary");
> + }
> + if (omitNorms) {
> + result.append(",omitNorms");
> + }
> + if (lazy){
> + result.append(",lazy");
> + }
> + result.append('<');
> + result.append(name);
> + result.append(':');
> +
> + if (fieldsData != null && lazy == false) {
> + result.append(fieldsData);
> + }
> +
> + result.append('>');
> + return result.toString();
> + }
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/AbstractField.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Document.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Document.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Document.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Document.java Fri Jun 9 18:23:22 2006
> @@ -16,24 +16,21 @@
> * limitations under the License.
> */
>
> -import java.util.Enumeration;
> -import java.util.Iterator;
> -import java.util.List;
> -import java.util.ArrayList;
> -import java.util.Vector;
> -import org.apache.lucene.index.IndexReader; // for javadoc
> -import org.apache.lucene.search.Searcher; // for javadoc
> -import org.apache.lucene.search.Hits; // for javadoc
> +import org.apache.lucene.index.IndexReader;
> +import org.apache.lucene.search.Hits;
> +import org.apache.lucene.search.Searcher;
> +
> +import java.util.*; // for javadoc
>
> /** Documents are the unit of indexing and search.
> *
> * A Document is a set of fields. Each field has a name and a textual value.
> - * A field may be {@link Field#isStored() stored} with the document, in which
> + * A field may be {@link Fieldable#isStored() stored} with the document, in which
> * case it is returned with search hits on the document. Thus each document
> * should typically contain one or more stored fields which uniquely identify
> * it.
> *
> - * <p>Note that fields which are <i>not</i> {@link Field#isStored() stored} are
> + * <p>Note that fields which are <i>not</i> {@link Fieldable#isStored() stored} are
> * <i>not</i> available in documents retrieved from the index, e.g. with {@link
> * Hits#doc(int)}, {@link Searcher#doc(int)} or {@link
> * IndexReader#document(int)}.
> @@ -50,11 +47,11 @@
> /** Sets a boost factor for hits on any field of this document. This value
> * will be multiplied into the score of all hits on this document.
> *
> - * <p>Values are multiplied into the value of {@link Field#getBoost()} of
> + * <p>Values are multiplied into the value of {@link Fieldable#getBoost()} of
> * each field in this document. Thus, this method in effect sets a default
> * boost for the fields of this document.
> *
> - * @see Field#setBoost(float)
> + * @see Fieldable#setBoost(float)
> */
> public void setBoost(float boost) {
> this.boost = boost;
> @@ -85,7 +82,7 @@
> * a document has to be deleted from an index and a new changed version of that
> * document has to be added.</p>
> */
> - public final void add(Field field) {
> + public final void add(Fieldable field) {
> fields.add(field);
> }
>
> @@ -102,7 +99,7 @@
> public final void removeField(String name) {
> Iterator it = fields.iterator();
> while (it.hasNext()) {
> - Field field = (Field)it.next();
> + Fieldable field = (Fieldable)it.next();
> if (field.name().equals(name)) {
> it.remove();
> return;
> @@ -122,7 +119,7 @@
> public final void removeFields(String name) {
> Iterator it = fields.iterator();
> while (it.hasNext()) {
> - Field field = (Field)it.next();
> + Fieldable field = (Fieldable)it.next();
> if (field.name().equals(name)) {
> it.remove();
> }
> @@ -133,9 +130,9 @@
> * null. If multiple fields exists with this name, this method returns the
> * first value added.
> */
> - public final Field getField(String name) {
> + public final Fieldable getField(String name) {
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name))
> return field;
> }
> @@ -149,7 +146,7 @@
> */
> public final String get(String name) {
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name) && (!field.isBinary()))
> return field.stringValue();
> }
> @@ -162,16 +159,16 @@
> }
>
> /**
> - * Returns an array of {@link Field}s with the given name.
> + * Returns an array of {@link Fieldable}s with the given name.
> * This method can return <code>null</code>.
> *
> * @param name the name of the field
> - * @return a <code>Field[]</code> array
> + * @return a <code>Fieldable[]</code> array
> */
> - public final Field[] getFields(String name) {
> + public final Fieldable[] getFields(String name) {
> List result = new ArrayList();
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name)) {
> result.add(field);
> }
> @@ -180,7 +177,7 @@
> if (result.size() == 0)
> return null;
>
> - return (Field[])result.toArray(new Field[result.size()]);
> + return (Fieldable[])result.toArray(new Fieldable[result.size()]);
> }
>
> /**
> @@ -193,7 +190,7 @@
> public final String[] getValues(String name) {
> List result = new ArrayList();
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name) && (!field.isBinary()))
> result.add(field.stringValue());
> }
> @@ -215,7 +212,7 @@
> public final byte[][] getBinaryValues(String name) {
> List result = new ArrayList();
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name) && (field.isBinary()))
> result.add(field.binaryValue());
> }
> @@ -237,7 +234,7 @@
> */
> public final byte[] getBinaryValue(String name) {
> for (int i=0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> if (field.name().equals(name) && (field.isBinary()))
> return field.binaryValue();
> }
> @@ -249,7 +246,7 @@
> StringBuffer buffer = new StringBuffer();
> buffer.append("Document<");
> for (int i = 0; i < fields.size(); i++) {
> - Field field = (Field)fields.get(i);
> + Fieldable field = (Fieldable)fields.get(i);
> buffer.append(field.toString());
> if (i != fields.size()-1)
> buffer.append(" ");
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/document/Field.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Field.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Field.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Field.java Fri Jun 9 18:23:22 2006
> @@ -16,9 +16,6 @@
> * limitations under the License.
> */
>
> -import org.apache.lucene.index.IndexReader;
> -import org.apache.lucene.search.Hits;
> -import org.apache.lucene.search.Similarity;
> import org.apache.lucene.util.Parameter;
>
> import java.io.Reader;
> @@ -32,23 +29,7 @@
> index, so that they may be returned with hits on the document.
> */
>
> -public final class Field implements Serializable {
> - private String name = "body";
> -
> - // the one and only data object for all different kind of field values
> - private Object fieldsData = null;
> -
> - private boolean storeTermVector = false;
> - private boolean storeOffsetWithTermVector = false;
> - private boolean storePositionWithTermVector = false;
> - private boolean omitNorms = false;
> - private boolean isStored = false;
> - private boolean isIndexed = true;
> - private boolean isTokenized = true;
> - private boolean isBinary = false;
> - private boolean isCompressed = false;
> -
> - private float boost = 1.0f;
> +public final class Field extends AbstractField implements Fieldable, Serializable {
>
> /** Specifies whether and how a field should be stored. */
> public static final class Store extends Parameter implements Serializable {
> @@ -146,45 +127,7 @@
> public static final TermVector WITH_POSITIONS_OFFSETS = new TermVector("WITH_POSITIONS_OFFSETS");
> }
>
> - /** Sets the boost factor hits on this field. This value will be
> - * multiplied into the score of all hits on this this field of this
> - * document.
> - *
> - * <p>The boost is multiplied by {@link Document#getBoost()} of the document
> - * containing this field. If a document has multiple fields with the same
> - * name, all such values are multiplied together. This product is then
> - * multipled by the value {@link Similarity#lengthNorm(String,int)}, and
> - * rounded by {@link Similarity#encodeNorm(float)} before it is stored in the
> - * index. One should attempt to ensure that this product does not overflow
> - * the range of that encoding.
> - *
> - * @see Document#setBoost(float)
> - * @see Similarity#lengthNorm(String, int)
> - * @see Similarity#encodeNorm(float)
> - */
> - public void setBoost(float boost) {
> - this.boost = boost;
> - }
> -
> - /** Returns the boost factor for hits for this field.
> - *
> - * <p>The default value is 1.0.
> - *
> - * <p>Note: this value is not stored directly with the document in the index.
> - * Documents returned from {@link IndexReader#document(int)} and
> - * {@link Hits#doc(int)} may thus not have the same value present as when
> - * this field was indexed.
> - *
> - * @see #setBoost(float)
> - */
> - public float getBoost() {
> - return boost;
> - }
> - /** Returns the name of the field as an interned string.
> - * For example "date", "title", "body", ...
> - */
> - public String name() { return name; }
> -
> +
> /** The value of the field as a String, or null. If null, the Reader value
> * or binary value is used. Exactly one of stringValue(), readerValue(), and
> * binaryValue() must be set. */
> @@ -365,146 +308,6 @@
>
> setStoreTermVector(TermVector.NO);
> }
> -
> - private void setStoreTermVector(TermVector termVector) {
> - if (termVector == TermVector.NO) {
> - this.storeTermVector = false;
> - this.storePositionWithTermVector = false;
> - this.storeOffsetWithTermVector = false;
> - }
> - else if (termVector == TermVector.YES) {
> - this.storeTermVector = true;
> - this.storePositionWithTermVector = false;
> - this.storeOffsetWithTermVector = false;
> - }
> - else if (termVector == TermVector.WITH_POSITIONS) {
> - this.storeTermVector = true;
> - this.storePositionWithTermVector = true;
> - this.storeOffsetWithTermVector = false;
> - }
> - else if (termVector == TermVector.WITH_OFFSETS) {
> - this.storeTermVector = true;
> - this.storePositionWithTermVector = false;
> - this.storeOffsetWithTermVector = true;
> - }
> - else if (termVector == TermVector.WITH_POSITIONS_OFFSETS) {
> - this.storeTermVector = true;
> - this.storePositionWithTermVector = true;
> - this.storeOffsetWithTermVector = true;
> - }
> - else {
> - throw new IllegalArgumentException("unknown termVector parameter " + termVector);
> - }
> - }
> -
> - /** True iff the value of the field is to be stored in the index for return
> - with search hits. It is an error for this to be true if a field is
> - Reader-valued. */
> - public final boolean isStored() { return isStored; }
> -
> - /** True iff the value of the field is to be indexed, so that it may be
> - searched on. */
> - public final boolean isIndexed() { return isIndexed; }
> -
> - /** True iff the value of the field should be tokenized as text prior to
> - indexing. Un-tokenized fields are indexed as a single word and may not be
> - Reader-valued. */
> - public final boolean isTokenized() { return isTokenized; }
> -
> - /** True if the value of the field is stored and compressed within the index */
> - public final boolean isCompressed() { return isCompressed; }
>
> - /** True iff the term or terms used to index this field are stored as a term
> - * vector, available from {@link IndexReader#getTermFreqVector(int,String)}.
> - * These methods do not provide access to the original content of the field,
> - * only to terms used to index it. If the original content must be
> - * preserved, use the <code>stored</code> attribute instead.
> - *
> - * @see IndexReader#getTermFreqVector(int, String)
> - */
> - public final boolean isTermVectorStored() { return storeTermVector; }
> -
> - /**
> - * True iff terms are stored as term vector together with their offsets
> - * (start and end positon in source text).
> - */
> - public boolean isStoreOffsetWithTermVector(){
> - return storeOffsetWithTermVector;
> - }
> -
> - /**
> - * True iff terms are stored as term vector together with their token positions.
> - */
> - public boolean isStorePositionWithTermVector(){
> - return storePositionWithTermVector;
> - }
> -
> - /** True iff the value of the filed is stored as binary */
> - public final boolean isBinary() { return isBinary; }
> -
> - /** True if norms are omitted for this indexed field */
> - public boolean getOmitNorms() { return omitNorms; }
> -
> - /** Expert:
> - *
> - * If set, omit normalization factors associated with this indexed field.
> - * This effectively disables indexing boosts and length normalization for this field.
> - */
> - public void setOmitNorms(boolean omitNorms) { this.omitNorms=omitNorms; }
> -
> - /** Prints a Field for human consumption. */
> - public final String toString() {
> - StringBuffer result = new StringBuffer();
> - if (isStored) {
> - result.append("stored");
> - if (isCompressed)
> - result.append("/compressed");
> - else
> - result.append("/uncompressed");
> - }
> - if (isIndexed) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("indexed");
> - }
> - if (isTokenized) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("tokenized");
> - }
> - if (storeTermVector) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("termVector");
> - }
> - if (storeOffsetWithTermVector) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("termVectorOffsets");
> - }
> - if (storePositionWithTermVector) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("termVectorPosition");
> - }
> - if (isBinary) {
> - if (result.length() > 0)
> - result.append(",");
> - result.append("binary");
> - }
> - if (omitNorms) {
> - result.append(",omitNorms");
> - }
> - result.append('<');
> - result.append(name);
> - result.append(':');
> -
> - if (fieldsData != null) {
> - result.append(fieldsData);
> - }
> -
> - result.append('>');
> - return result.toString();
> - }
>
> }
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,24 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 5:29:26 PM
> + * $Id:$
> + * Copyright 2005. Center For Natural Language Processing
> + */
> +
> +/**
> + * Similar to a {@link java.io.FileFilter}, the FieldSelector allows one to make decisions about
> + * what Fields get loaded on a {@link Document} by {@link org.apache.lucene.index.IndexReader#document(int,org.apache.lucene.document.FieldSelector)}
> + *
> + **/
> +public interface FieldSelector {
> +
> + /**
> + *
> + * @param fieldName
> + * @return a {@link FieldSelectorResult} indicating whether and how the {@link Field} with <code>fieldName</code> should be loaded
> + */
> + FieldSelectorResult accept(String fieldName);
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelector.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
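For anyone following along, here is a minimal sketch of what a custom selector could look like. It is not part of the commit: FieldSelectorResult and FieldSelector below are reduced stand-ins (an enum instead of the typesafe-constant class) so the example compiles on its own, and PrefixSkippingSelector and loadedNames are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the types added in this commit, reduced so the sketch compiles alone.
enum FieldSelectorResult { LOAD, LAZY_LOAD, NO_LOAD, LOAD_AND_BREAK }

interface FieldSelector {
    FieldSelectorResult accept(String fieldName);
}

// Hypothetical selector: load every field except those whose name starts with a prefix.
class PrefixSkippingSelector implements FieldSelector {
    private final String skipPrefix;

    PrefixSkippingSelector(String skipPrefix) {
        this.skipPrefix = skipPrefix;
    }

    public FieldSelectorResult accept(String fieldName) {
        return fieldName.startsWith(skipPrefix)
                ? FieldSelectorResult.NO_LOAD
                : FieldSelectorResult.LOAD;
    }
}

public class FieldSelectorSketch {
    // Mimics a reader asking the selector about each stored field name in turn.
    static List<String> loadedNames(String[] storedFields, FieldSelector selector) {
        List<String> loaded = new ArrayList<>();
        for (String name : storedFields) {
            if (selector.accept(name) == FieldSelectorResult.LOAD) {
                loaded.add(name);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        String[] stored = {"title", "body_raw", "date"};
        // prints [title, date]
        System.out.println(loadedNames(stored, new PrefixSkippingSelector("body")));
    }
}
```

The selector only ever sees the field name, so the decision cannot depend on the field's content or size.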
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,44 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 5:40:17 PM
> + * $Id:$
> + * Copyright 2005. Center For Natural Language Processing
> + */
> +
> +/**
> + * Provides information about what should be done with this Field
> + *
> + **/
> +//Replace with an enumerated type in 1.5
> +public final class FieldSelectorResult {
> +
> + public static final FieldSelectorResult LOAD = new FieldSelectorResult(0);
> + public static final FieldSelectorResult LAZY_LOAD = new FieldSelectorResult(1);
> + public static final FieldSelectorResult NO_LOAD = new FieldSelectorResult(2);
> + public static final FieldSelectorResult LOAD_AND_BREAK = new FieldSelectorResult(3);
> +
> + private int id;
> +
> + private FieldSelectorResult(int id)
> + {
> + this.id = id;
> + }
> +
> + public boolean equals(Object o) {
> + if (this == o) return true;
> + if (o == null || getClass() != o.getClass()) return false;
> +
> + final FieldSelectorResult that = (FieldSelectorResult) o;
> +
> + if (id != that.id) return false;
> +
> + return true;
> + }
> +
> + public int hashCode() {
> + return id;
> + }
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/FieldSelectorResult.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
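The "Replace with an enumerated type in 1.5" comment above could, once the code base moves to Java 5, look roughly like the sketch below. This is only an illustration of that comment, not committed code; the wrapper class FieldSelectorResultEnumSketch and the describe helper are hypothetical.

```java
public class FieldSelectorResultEnumSketch {
    // Same four constants as the typesafe-constant class; equals/hashCode come for free.
    enum FieldSelectorResult { LOAD, LAZY_LOAD, NO_LOAD, LOAD_AND_BREAK }

    // Hypothetical helper showing that an enum can be used directly in a switch.
    static String describe(FieldSelectorResult result) {
        switch (result) {
            case LOAD:
                return "read now";
            case LAZY_LOAD:
                return "read on first access";
            case LOAD_AND_BREAK:
                return "read now, then stop";
            default:
                return "skip";
        }
    }

    public static void main(String[] args) {
        for (FieldSelectorResult r : FieldSelectorResult.values()) {
            System.out.println(r + ": " + describe(r));
        }
    }
}
```

With an enum, identity comparison (==) is safe, so the hand-written equals and hashCode would no longer be needed.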
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,137 @@
> +package org.apache.lucene.document;
> +
> +/**
> + * Copyright 2004 The Apache Software Foundation
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +import java.io.Reader;
> +import java.io.Serializable;
> +
> +/**
> + * Synonymous with {@link Field}.
> + *
> + **/
> +public interface Fieldable extends Serializable {
> + /** Sets the boost factor for hits on this field. This value will be
> + * multiplied into the score of all hits on this field of this
> + * document.
> + *
> + * <p>The boost is multiplied by {@link org.apache.lucene.document.Document#getBoost()} of the document
> + * containing this field. If a document has multiple fields with the same
> + * name, all such values are multiplied together. This product is then
> + * multiplied by the value {@link org.apache.lucene.search.Similarity#lengthNorm(String,int)}, and
> + * rounded by {@link org.apache.lucene.search.Similarity#encodeNorm(float)} before it is stored in the
> + * index. One should attempt to ensure that this product does not overflow
> + * the range of that encoding.
> + *
> + * @see org.apache.lucene.document.Document#setBoost(float)
> + * @see org.apache.lucene.search.Similarity#lengthNorm(String, int)
> + * @see org.apache.lucene.search.Similarity#encodeNorm(float)
> + */
> + void setBoost(float boost);
> +
> + /** Returns the boost factor for hits for this field.
> + *
> + * <p>The default value is 1.0.
> + *
> + * <p>Note: this value is not stored directly with the document in the index.
> + * Documents returned from {@link org.apache.lucene.index.IndexReader#document(int)} and
> + * {@link org.apache.lucene.search.Hits#doc(int)} may thus not have the same value present as when
> + * this field was indexed.
> + *
> + * @see #setBoost(float)
> + */
> + float getBoost();
> +
> + /** Returns the name of the field as an interned string.
> + * For example "date", "title", "body", ...
> + */
> + String name();
> +
> + /** The value of the field as a String, or null. If null, the Reader value
> + * or binary value is used. Exactly one of stringValue(), readerValue(), and
> + * binaryValue() must be set. */
> + String stringValue();
> +
> + /** The value of the field as a Reader, or null. If null, the String value
> + * or binary value is used. Exactly one of stringValue(), readerValue(),
> + * and binaryValue() must be set. */
> + Reader readerValue();
> +
> + /** The value of the field in Binary, or null. If null, the Reader or
> + * String value is used. Exactly one of stringValue(), readerValue() and
> + * binaryValue() must be set. */
> + byte[] binaryValue();
> +
> + /** True iff the value of the field is to be stored in the index for return
> + with search hits. It is an error for this to be true if a field is
> + Reader-valued. */
> + boolean isStored();
> +
> + /** True iff the value of the field is to be indexed, so that it may be
> + searched on. */
> + boolean isIndexed();
> +
> + /** True iff the value of the field should be tokenized as text prior to
> + indexing. Un-tokenized fields are indexed as a single word and may not be
> + Reader-valued. */
> + boolean isTokenized();
> +
> + /** True if the value of the field is stored and compressed within the index */
> + boolean isCompressed();
> +
> + /** True iff the term or terms used to index this field are stored as a term
> + * vector, available from {@link org.apache.lucene.index.IndexReader#getTermFreqVector(int,String)}.
> + * These methods do not provide access to the original content of the field,
> + * only to terms used to index it. If the original content must be
> + * preserved, use the <code>stored</code> attribute instead.
> + *
> + * @see org.apache.lucene.index.IndexReader#getTermFreqVector(int, String)
> + */
> + boolean isTermVectorStored();
> +
> + /**
> + * True iff terms are stored as term vector together with their offsets
> + * (start and end position in source text).
> + */
> + boolean isStoreOffsetWithTermVector();
> +
> + /**
> + * True iff terms are stored as term vector together with their token positions.
> + */
> + boolean isStorePositionWithTermVector();
> +
> + /** True iff the value of the field is stored as binary */
> + boolean isBinary();
> +
> + /** True if norms are omitted for this indexed field */
> + boolean getOmitNorms();
> +
> + /** Expert:
> + *
> + * If set, omit normalization factors associated with this indexed field.
> + * This effectively disables indexing boosts and length normalization for this field.
> + */
> + void setOmitNorms(boolean omitNorms);
> +
> + /**
> + * Indicates whether a Field is Lazy or not. The semantics of Lazy loading are such that if a Field is lazily loaded, retrieving
> + * its values via {@link #stringValue()} or {@link #binaryValue()} is only valid as long as the {@link org.apache.lucene.index.IndexReader} that
> + * retrieved the {@link Document} is still open.
> + *
> + * @return true if this field can be loaded lazily
> + */
> + boolean isLazy();
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/Fieldable.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
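The isLazy() contract is the part most likely to surprise callers, so here is a toy model of it. Everything in it is an assumption for illustration: Source stands in for the open reader's stored-field stream, LazyField for a lazily loaded field, and caching the value after the first read is a choice of this sketch, not something the commit is known to do.

```java
public class LazyFieldSketch {
    // Hypothetical stand-in for the stream owned by an open IndexReader.
    static class Source {
        private boolean open = true;

        String read() {
            if (!open) {
                throw new IllegalStateException("source closed");
            }
            return "stored text";
        }

        void close() {
            open = false;
        }
    }

    // Hypothetical lazy field: nothing is read until stringValue() is first called.
    static class LazyField {
        private final Source source;
        private String cached;

        LazyField(Source source) {
            this.source = source;
        }

        boolean isLazy() {
            return true;
        }

        String stringValue() {
            if (cached == null) {
                cached = source.read(); // fails if the source was closed first
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Source source = new Source();
        LazyField field = new LazyField(source);
        source.close();
        try {
            field.stringValue();
        } catch (IllegalStateException e) {
            System.out.println("value unavailable: " + e.getMessage());
        }
    }
}
```

The practical consequence is that a lazily loaded value should be read before the reader that produced the Document is closed.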
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,22 @@
> +package org.apache.lucene.document;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 15, 2006
> + * Time: 10:13:07 AM
> + * $Id:$
> + * Copyright 2005. Center For Natural Language Processing
> + */
> +
> +
> +/**
> + * Load the First field and break.
> + * <p/>
> + * See {@link FieldSelectorResult#LOAD_AND_BREAK}
> + */
> +public class LoadFirstFieldSelector implements FieldSelector {
> +
> + public FieldSelectorResult accept(String fieldName) {
> + return FieldSelectorResult.LOAD_AND_BREAK;
> + }
> +}
> \ No newline at end of file
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/LoadFirstFieldSelector.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/MapFieldSelector.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,57 @@
> +/*
> + * MapFieldSelector.java
> + *
> + * Created on May 2, 2006, 6:49 PM
> + *
> + */
> +
> +package org.apache.lucene.document;
> +
> +import java.util.HashMap;
> +import java.util.List;
> +import java.util.Map;
> +
> +/**
> + * A FieldSelector based on a Map of field names to FieldSelectorResults
> + *
> + * @author Chuck Williams
> + */
> +public class MapFieldSelector implements FieldSelector {
> +
> + Map fieldSelections;
> +
> + /** Create a MapFieldSelector
> + * @param fieldSelections maps from field names to FieldSelectorResults
> + */
> + public MapFieldSelector(Map fieldSelections) {
> + this.fieldSelections = fieldSelections;
> + }
> +
> + /** Create a MapFieldSelector
> + * @param fields fields to LOAD. All other fields are NO_LOAD.
> + */
> + public MapFieldSelector(List fields) {
> + fieldSelections = new HashMap(fields.size()*5/3);
> + for (int i=0; i<fields.size(); i++)
> + fieldSelections.put(fields.get(i), FieldSelectorResult.LOAD);
> + }
> +
> + /** Create a MapFieldSelector
> + * @param fields fields to LOAD. All other fields are NO_LOAD.
> + */
> + public MapFieldSelector(String[] fields) {
> + fieldSelections = new HashMap(fields.length*5/3);
> + for (int i=0; i<fields.length; i++)
> + fieldSelections.put(fields[i], FieldSelectorResult.LOAD);
> + }
> +
> + /** Load field according to its associated value in fieldSelections
> + * @param field a field name
> + * @return the fieldSelections value that field maps to or NO_LOAD if none.
> + */
> + public FieldSelectorResult accept(String field) {
> + FieldSelectorResult selection = (FieldSelectorResult) fieldSelections.get(field);
> + return selection!=null ? selection : FieldSelectorResult.NO_LOAD;
> + }
> +
> +}
>
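The lookup rule in MapFieldSelector.accept (mapped value if present, otherwise NO_LOAD) can be sketched standalone as below. Result is a reduced stand-in for FieldSelectorResult, and MapSelectorSketch and its static accept method are hypothetical names, not the committed class.

```java
import java.util.HashMap;
import java.util.Map;

public class MapSelectorSketch {
    // Reduced stand-in for FieldSelectorResult.
    enum Result { LOAD, LAZY_LOAD, NO_LOAD, LOAD_AND_BREAK }

    // Same rule as the committed accept(): unmapped field names fall back to NO_LOAD.
    static Result accept(Map<String, Result> selections, String field) {
        Result selection = selections.get(field);
        return selection != null ? selection : Result.NO_LOAD;
    }

    public static void main(String[] args) {
        Map<String, Result> selections = new HashMap<>();
        selections.put("title", Result.LOAD);
        selections.put("body", Result.LAZY_LOAD);

        System.out.println(accept(selections, "title"));  // LOAD
        System.out.println(accept(selections, "body"));   // LAZY_LOAD
        System.out.println(accept(selections, "author")); // NO_LOAD
    }
}
```

The List and String[] constructors are a shorthand for the same map with every listed name mapped to LOAD.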
> Added: lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,53 @@
> +package org.apache.lucene.document;
> +
> +import java.util.Set;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Apr 14, 2006
> + * Time: 6:53:07 PM
> + * $Id:$
> + * Copyright 2005. Center For Natural Language Processing
> + */
> +
> +/**
> + * Declare what fields to load normally and what fields to load lazily
> + *
> + **/
> +public class SetBasedFieldSelector implements FieldSelector {
> +
> + private Set fieldsToLoad;
> + private Set lazyFieldsToLoad;
> +
> +
> +
> + /**
> + * Pass in the Set of {@link Field} names to load and the Set of {@link Field} names to load lazily. If both are empty, the
> + * Document will not have any {@link Field} on it.
> + * @param fieldsToLoad A Set of {@link String} field names to load. May be empty, but not null
> + * @param lazyFieldsToLoad A Set of {@link String} field names to load lazily. May be empty, but not null
> + */
> + public SetBasedFieldSelector(Set fieldsToLoad, Set lazyFieldsToLoad) {
> + this.fieldsToLoad = fieldsToLoad;
> + this.lazyFieldsToLoad = lazyFieldsToLoad;
> + }
> +
> + /**
> + * Indicate whether to load the field with the given name or not. If the {@link Field#name()} is not in either of the
> + * initializing Sets, then {@link org.apache.lucene.document.FieldSelectorResult#NO_LOAD} is returned. If a Field name
> + * is in both <code>fieldsToLoad</code> and <code>lazyFieldsToLoad</code>, lazy has precedence.
> + *
> + * @param fieldName The {@link Field} name to check
> + * @return The {@link FieldSelectorResult}
> + */
> + public FieldSelectorResult accept(String fieldName) {
> + FieldSelectorResult result = FieldSelectorResult.NO_LOAD;
> + if (fieldsToLoad.contains(fieldName) == true){
> + result = FieldSelectorResult.LOAD;
> + }
> + if (lazyFieldsToLoad.contains(fieldName) == true){
> + result = FieldSelectorResult.LAZY_LOAD;
> + }
> + return result;
> + }
> +}
> \ No newline at end of file
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/document/SetBasedFieldSelector.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
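The "lazy has precedence" rule in SetBasedFieldSelector follows from the order of the two checks. A standalone sketch, with Result as a reduced stand-in for FieldSelectorResult and SetSelectorSketch as a hypothetical name:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetSelectorSketch {
    // Reduced stand-in for FieldSelectorResult.
    enum Result { LOAD, LAZY_LOAD, NO_LOAD }

    // Same order of checks as the committed accept(): the lazy check runs last, so it wins.
    static Result accept(Set<String> fieldsToLoad, Set<String> lazyFieldsToLoad, String fieldName) {
        Result result = Result.NO_LOAD;
        if (fieldsToLoad.contains(fieldName)) {
            result = Result.LOAD;
        }
        if (lazyFieldsToLoad.contains(fieldName)) {
            result = Result.LAZY_LOAD;
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> eager = new HashSet<>(Arrays.asList("title", "body"));
        Set<String> lazy = new HashSet<>(Arrays.asList("body"));

        System.out.println(accept(eager, lazy, "title"));  // LOAD
        System.out.println(accept(eager, lazy, "body"));   // LAZY_LOAD (in both sets)
        System.out.println(accept(eager, lazy, "author")); // NO_LOAD
    }
}
```

Passing a null Set would throw a NullPointerException on the first contains call, which is why the Javadoc asks for empty Sets rather than null.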
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/DocumentWriter.java Fri Jun 9 18:23:22 2006
> @@ -16,22 +16,22 @@
> * limitations under the License.
> */
>
> +import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.analysis.Token;
> +import org.apache.lucene.analysis.TokenStream;
> +import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.Fieldable;
> +import org.apache.lucene.search.Similarity;
> +import org.apache.lucene.store.Directory;
> +import org.apache.lucene.store.IndexOutput;
> +
> import java.io.IOException;
> import java.io.PrintStream;
> import java.io.Reader;
> import java.io.StringReader;
> -import java.util.Hashtable;
> -import java.util.Enumeration;
> import java.util.Arrays;
> -
> -import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -import org.apache.lucene.analysis.Analyzer;
> -import org.apache.lucene.analysis.TokenStream;
> -import org.apache.lucene.analysis.Token;
> -import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexOutput;
> -import org.apache.lucene.search.Similarity;
> +import java.util.Enumeration;
> +import java.util.Hashtable;
>
> final class DocumentWriter {
> private Analyzer analyzer;
> @@ -129,7 +129,7 @@
> throws IOException {
> Enumeration fields = doc.fields();
> while (fields.hasMoreElements()) {
> - Field field = (Field) fields.nextElement();
> + Fieldable field = (Fieldable) fields.nextElement();
> String fieldName = field.name();
> int fieldNumber = fieldInfos.fieldNumber(fieldName);
>
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldInfos.java Fri Jun 9 18:23:22 2006
> @@ -16,18 +16,17 @@
> * limitations under the License.
> */
>
> -import java.util.*;
> -import java.io.IOException;
> -
> import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -
> +import org.apache.lucene.document.Fieldable;
> import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexOutput;
> import org.apache.lucene.store.IndexInput;
> +import org.apache.lucene.store.IndexOutput;
> +
> +import java.io.IOException;
> +import java.util.*;
>
> -/** Access to the Field Info file that describes document fields and whether or
> - * not they are indexed. Each segment has a separate Field Info file. Objects
> +/** Access to the Fieldable Info file that describes document fields and whether or
> + * not they are indexed. Each segment has a separate Fieldable Info file. Objects
> * of this class are thread-safe for multiple readers, but only one thread can
> * be adding documents at a time, with no other reader or writer threads
> * accessing this object.
> @@ -65,7 +64,7 @@
> public void add(Document doc) {
> Enumeration fields = doc.fields();
> while (fields.hasMoreElements()) {
> - Field field = (Field) fields.nextElement();
> + Fieldable field = (Fieldable) fields.nextElement();
> add(field.name(), field.isIndexed(), field.isTermVectorStored(), field.isStorePositionWithTermVector(),
> field.isStoreOffsetWithTermVector(), field.getOmitNorms());
> }
> @@ -105,7 +104,7 @@
> /**
> * Calls 5 parameter add with false for all TermVector parameters.
> *
> - * @param name The name of the Field
> + * @param name The name of the Fieldable
> * @param isIndexed true if the field is indexed
> * @see #add(String, boolean, boolean, boolean, boolean)
> */
>
> Added: lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java?rev=413201&view=auto
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java (added)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java Fri Jun 9 18:23:22 2006
> @@ -0,0 +1,70 @@
> +package org.apache.lucene.index;
> +/**
> + * Created by IntelliJ IDEA.
> + * User: Grant Ingersoll
> + * Date: Jan 12, 2006
> + * Time: 9:37:43 AM
> + * $Id:$
> + * Copyright 2005. Center For Natural Language Processing
> + */
> +
> +/**
> + *
> + *
> + **/
> +public class FieldReaderException extends RuntimeException{
> + /**
> + * Constructs a new runtime exception with <code>null</code> as its
> + * detail message. The cause is not initialized, and may subsequently be
> + * initialized by a call to {@link #initCause}.
> + */
> + public FieldReaderException() {
> + }
> +
> + /**
> + * Constructs a new runtime exception with the specified cause and a
> + * detail message of <tt>(cause==null ? null : cause.toString())</tt>
> + * (which typically contains the class and detail message of
> + * <tt>cause</tt>). This constructor is useful for runtime exceptions
> + * that are little more than wrappers for other throwables.
> + *
> + * @param cause the cause (which is saved for later retrieval by the
> + * {@link #getCause()} method). (A <tt>null</tt> value is
> + * permitted, and indicates that the cause is nonexistent or
> + * unknown.)
> + * @since 1.4
> + */
> + public FieldReaderException(Throwable cause) {
> + super(cause);
> + }
> +
> + /**
> + * Constructs a new runtime exception with the specified detail message.
> + * The cause is not initialized, and may subsequently be initialized by a
> + * call to {@link #initCause}.
> + *
> + * @param message the detail message. The detail message is saved for
> + * later retrieval by the {@link #getMessage()} method.
> + */
> + public FieldReaderException(String message) {
> + super(message);
> + }
> +
> + /**
> + * Constructs a new runtime exception with the specified detail message and
> + * cause. <p>Note that the detail message associated with
> + * <code>cause</code> is <i>not</i> automatically incorporated in
> + * this runtime exception's detail message.
> + *
> + * @param message the detail message (which is saved for later retrieval
> + * by the {@link #getMessage()} method).
> + * @param cause the cause (which is saved for later retrieval by the
> + * {@link #getCause()} method). (A <tt>null</tt> value is
> + * permitted, and indicates that the cause is nonexistent or
> + * unknown.)
> + * @since 1.4
> + */
> + public FieldReaderException(String message, Throwable cause) {
> + super(message, cause);
> + }
> +}
>
> Propchange: lucene/java/trunk/src/java/org/apache/lucene/index/FieldReaderException.java
> ------------------------------------------------------------------------------
> svn:executable = *
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java Fri Jun 9 18:23:22 2006
> @@ -16,19 +16,19 @@
> * limitations under the License.
> */
>
> +import org.apache.lucene.document.*;
> +import org.apache.lucene.store.Directory;
> +import org.apache.lucene.store.IndexInput;
> +
> import java.io.ByteArrayOutputStream;
> import java.io.IOException;
> +import java.io.Reader;
> import java.util.zip.DataFormatException;
> import java.util.zip.Inflater;
>
> -import org.apache.lucene.document.Document;
> -import org.apache.lucene.document.Field;
> -import org.apache.lucene.store.Directory;
> -import org.apache.lucene.store.IndexInput;
> -
> /**
> * Class responsible for access to stored document fields.
> - *
> + * <p/>
> * It uses <segment>.fdt and <segment>.fdx; files.
> *
> * @version $Id$
> @@ -39,25 +39,37 @@
> private IndexInput indexStream;
> private int size;
>
> + private static ThreadLocal fieldsStreamTL = new ThreadLocal();
> +
> FieldsReader(Directory d, String segment, FieldInfos fn) throws IOException {
> fieldInfos = fn;
>
> fieldsStream = d.openInput(segment + ".fdt");
> indexStream = d.openInput(segment + ".fdx");
> -
> - size = (int)(indexStream.length() / 8);
> + size = (int) (indexStream.length() / 8);
> }
>
> + /**
> + * Closes the underlying {@link org.apache.lucene.store.IndexInput} streams, including any associated with a
> + * lazy implementation of a Field. This means that the Field's values will not be accessible.
> + *
> + * @throws IOException
> + */
> final void close() throws IOException {
> fieldsStream.close();
> indexStream.close();
> + IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> + if (localFieldsStream != null) {
> + localFieldsStream.close();
> + fieldsStreamTL.set(null);
> + }
> }
>
> final int size() {
> return size;
> }
>
> - final Document doc(int n) throws IOException {
> + final Document doc(int n, FieldSelector fieldSelector) throws IOException {
> indexStream.seek(n * 8L);
> long position = indexStream.readLong();
> fieldsStream.seek(position);
> @@ -67,89 +79,277 @@
> for (int i = 0; i < numFields; i++) {
> int fieldNumber = fieldsStream.readVInt();
> FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
> -
> - byte bits = fieldsStream.readByte();
> + FieldSelectorResult acceptField = fieldSelector == null ? FieldSelectorResult.LOAD : fieldSelector.accept(fi.name);
> + boolean lazy = acceptField.equals(FieldSelectorResult.LAZY_LOAD) == true;
>
> + byte bits = fieldsStream.readByte();
> boolean compressed = (bits & FieldsWriter.FIELD_IS_COMPRESSED) != 0;
> boolean tokenize = (bits & FieldsWriter.FIELD_IS_TOKENIZED) != 0;
> -
> - if ((bits & FieldsWriter.FIELD_IS_BINARY) != 0) {
> - final byte[] b = new byte[fieldsStream.readVInt()];
> - fieldsStream.readBytes(b, 0, b.length);
> - if (compressed)
> - doc.add(new Field(fi.name, uncompress(b), Field.Store.COMPRESS));
> - else
> - doc.add(new Field(fi.name, b, Field.Store.YES));
> + boolean binary = (bits & FieldsWriter.FIELD_IS_BINARY) != 0;
> + if (acceptField.equals(FieldSelectorResult.LOAD) == true) {
> + addField(doc, fi, binary, compressed, tokenize);
> }
> + else if (acceptField.equals(FieldSelectorResult.LOAD_AND_BREAK) == true){
> + addField(doc, fi, binary, compressed, tokenize);
> + break;//Get out of this loop
> + }
> + else if (lazy == true){
> + addFieldLazy(doc, fi, binary, compressed, tokenize);
> + }
> else {
> - Field.Index index;
> - Field.Store store = Field.Store.YES;
> -
> - if (fi.isIndexed && tokenize)
> - index = Field.Index.TOKENIZED;
> - else if (fi.isIndexed && !tokenize)
> - index = Field.Index.UN_TOKENIZED;
> - else
> - index = Field.Index.NO;
> -
> - Field.TermVector termVector = null;
> - if (fi.storeTermVector) {
> - if (fi.storeOffsetWithTermVector) {
> - if (fi.storePositionWithTermVector) {
> - termVector = Field.TermVector.WITH_POSITIONS_OFFSETS;
> - }
> - else {
> - termVector = Field.TermVector.WITH_OFFSETS;
> - }
> - }
> - else if (fi.storePositionWithTermVector) {
> - termVector = Field.TermVector.WITH_POSITIONS;
> - }
> - else {
> - termVector = Field.TermVector.YES;
> - }
> - }
> - else {
> - termVector = Field.TermVector.NO;
> - }
> -
> - if (compressed) {
> - store = Field.Store.COMPRESS;
> - final byte[] b = new byte[fieldsStream.readVInt()];
> - fieldsStream.readBytes(b, 0, b.length);
> - Field f = new Field(fi.name, // field name
> - new String(uncompress(b), "UTF-8"), // uncompress the value and add as string
> - store,
> - index,
> - termVector);
> - f.setOmitNorms(fi.omitNorms);
> - doc.add(f);
> - }
> - else {
> - Field f = new Field(fi.name, // name
> + skipField(binary, compressed);
> + }
> + }
> +
> + return doc;
> + }
> +
> + /**
> + * Skip the field. We still have to read some of the information about the field, but can skip past the actual content.
> + * This will have the most payoff on large fields.
> + */
> + private void skipField(boolean binary, boolean compressed) throws IOException {
> +
> + int toRead = fieldsStream.readVInt();
> +
> + if (binary || compressed) {
> + long pointer = fieldsStream.getFilePointer();
> + fieldsStream.seek(pointer + toRead);
> + } else {
> + //We need to skip chars. This is slower than seeking, but still better than reading the content
> + fieldsStream.skipChars(toRead);
> + }
> + }
> +
> + private void addFieldLazy(Document doc, FieldInfo fi, boolean binary, boolean compressed, boolean tokenize) throws IOException {
> + if (binary == true) {
> + int toRead = fieldsStream.readVInt();
> + long pointer = fieldsStream.getFilePointer();
> + if (compressed) {
> + //was: doc.add(new Fieldable(fi.name, uncompress(b), Fieldable.Store.COMPRESS));
> + doc.add(new LazyField(fi.name, Field.Store.COMPRESS, toRead, pointer));
> + } else {
> + //was: doc.add(new Fieldable(fi.name, b, Fieldable.Store.YES));
> + doc.add(new LazyField(fi.name, Field.Store.YES, toRead, pointer));
> + }
> + //Need to move the pointer ahead by toRead positions
> + fieldsStream.seek(pointer + toRead);
> + } else {
> + Field.Store store = Field.Store.YES;
> + Field.Index index = getIndexType(fi, tokenize);
> + Field.TermVector termVector = getTermVectorType(fi);
> +
> + Fieldable f;
> + if (compressed) {
> + store = Field.Store.COMPRESS;
> + int toRead = fieldsStream.readVInt();
> + long pointer = fieldsStream.getFilePointer();
> + f = new LazyField(fi.name, store, toRead, pointer);
> + //skip over the part that we aren't loading
> + fieldsStream.seek(pointer + toRead);
> + f.setOmitNorms(fi.omitNorms);
> + } else {
> + int length = fieldsStream.readVInt();
> + long pointer = fieldsStream.getFilePointer();
> + //Skip ahead of where we are by the length of what is stored
> + fieldsStream.skipChars(length);
> + f = new LazyField(fi.name, store, index, termVector, length, pointer);
> + f.setOmitNorms(fi.omitNorms);
> + }
> + doc.add(f);
> + }
> +
> + }
> +
> + private void addField(Document doc, FieldInfo fi, boolean binary, boolean compressed, boolean tokenize) throws IOException {
> +
> + //we have a binary stored field, and it may be compressed
> + if (binary) {
> + int toRead = fieldsStream.readVInt();
> + final byte[] b = new byte[toRead];
> + fieldsStream.readBytes(b, 0, b.length);
> + if (compressed)
> + doc.add(new Field(fi.name, uncompress(b), Field.Store.COMPRESS));
> + else
> + doc.add(new Field(fi.name, b, Field.Store.YES));
> +
> + } else {
> + Field.Store store = Field.Store.YES;
> + Field.Index index = getIndexType(fi, tokenize);
> + Field.TermVector termVector = getTermVectorType(fi);
> +
> + Fieldable f;
> + if (compressed) {
> + store = Field.Store.COMPRESS;
> + int toRead = fieldsStream.readVInt();
> +
> + final byte[] b = new byte[toRead];
> + fieldsStream.readBytes(b, 0, b.length);
> + f = new Field(fi.name, // field name
> + new String(uncompress(b), "UTF-8"), // uncompress the value and add as string
> + store,
> + index,
> + termVector);
> + f.setOmitNorms(fi.omitNorms);
> + } else {
> + f = new Field(fi.name, // name
> fieldsStream.readString(), // read value
> store,
> index,
> termVector);
> - f.setOmitNorms(fi.omitNorms);
> - doc.add(f);
> + f.setOmitNorms(fi.omitNorms);
> + }
> + doc.add(f);
> + }
> + }
> +
> + private Field.TermVector getTermVectorType(FieldInfo fi) {
> + Field.TermVector termVector = null;
> + if (fi.storeTermVector) {
> + if (fi.storeOffsetWithTermVector) {
> + if (fi.storePositionWithTermVector) {
> + termVector = Field.TermVector.WITH_POSITIONS_OFFSETS;
> + } else {
> + termVector = Field.TermVector.WITH_OFFSETS;
> }
> + } else if (fi.storePositionWithTermVector) {
> + termVector = Field.TermVector.WITH_POSITIONS;
> + } else {
> + termVector = Field.TermVector.YES;
> }
> + } else {
> + termVector = Field.TermVector.NO;
> }
> + return termVector;
> + }
>
> - return doc;
> + private Field.Index getIndexType(FieldInfo fi, boolean tokenize) {
> + Field.Index index;
> + if (fi.isIndexed && tokenize)
> + index = Field.Index.TOKENIZED;
> + else if (fi.isIndexed && !tokenize)
> + index = Field.Index.UN_TOKENIZED;
> + else
> + index = Field.Index.NO;
> + return index;
> }
> -
> +
> + /**
> + * A lazy implementation of Fieldable that defers loading of fields until asked for, instead of when the Document is
> + * loaded.
> + */
> + private class LazyField extends AbstractField implements Fieldable {
> + private int toRead;
> + private long pointer;
> + //internal buffer
> + private char[] chars;
> +
> +
> + public LazyField(String name, Field.Store store, int toRead, long pointer) {
> + super(name, store, Field.Index.NO, Field.TermVector.NO);
> + this.toRead = toRead;
> + this.pointer = pointer;
> + lazy = true;
> + }
> +
> + public LazyField(String name, Field.Store store, Field.Index index, Field.TermVector termVector, int toRead, long pointer) {
> + super(name, store, index, termVector);
> + this.toRead = toRead;
> + this.pointer = pointer;
> + lazy = true;
> + }
> +
> + /**
> + * The value of the field in Binary, or null. If null, the Reader or
> + * String value is used. Exactly one of stringValue(), readerValue() and
> + * binaryValue() must be set.
> + */
> + public byte[] binaryValue() {
> + if (fieldsData == null) {
> + final byte[] b = new byte[toRead];
> + IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> + if (localFieldsStream == null) {
> + localFieldsStream = (IndexInput) fieldsStream.clone();
> + fieldsStreamTL.set(localFieldsStream);
> + }
> + //Throw this IO Exception since IndexReader.document does so anyway, so probably not that big of a change for people
> + //since they are already handling this exception when getting the document
> + try {
> + localFieldsStream.seek(pointer);
> + localFieldsStream.readBytes(b, 0, b.length);
> + if (isCompressed == true) {
> + fieldsData = uncompress(b);
> + } else {
> + fieldsData = b;
> + }
> + } catch (IOException e) {
> + throw new FieldReaderException(e);
> + }
> + }
> + return fieldsData instanceof byte[] ? (byte[]) fieldsData : null;
> + }
> +
> + /**
> + * The value of the field as a Reader, or null. If null, the String value
> + * or binary value is used. Exactly one of stringValue(), readerValue(),
> + * and binaryValue() must be set.
> + */
> + public Reader readerValue() {
> + return fieldsData instanceof Reader ? (Reader) fieldsData : null;
> + }
> +
> + /**
> + * The value of the field as a String, or null. If null, the Reader value
> + * or binary value is used. Exactly one of stringValue(), readerValue(), and
> + * binaryValue() must be set.
> + */
> + public String stringValue() {
> + if (fieldsData == null) {
> + IndexInput localFieldsStream = (IndexInput) fieldsStreamTL.get();
> + if (localFieldsStream == null) {
> + localFieldsStream = (IndexInput) fieldsStream.clone();
> + fieldsStreamTL.set(localFieldsStream);
> + }
> + try {
> + localFieldsStream.seek(pointer);
> + //read in chars b/c we already know the length we need to read
> + if (chars == null || toRead > chars.length)
> + chars = new char[toRead];
> + localFieldsStream.readChars(chars, 0, toRead);
> + fieldsData = new String(chars, 0, toRead);//fieldsStream.readString();
> + } catch (IOException e) {
> + throw new FieldReaderException(e);
> + }
> + }
> + return fieldsData instanceof String ? (String) fieldsData : null;
> + }
> +
> + public long getPointer() {
> + return pointer;
> + }
> +
> + public void setPointer(long pointer) {
> + this.pointer = pointer;
> + }
> +
> + public int getToRead() {
> + return toRead;
> + }
> +
> + public void setToRead(int toRead) {
> + this.toRead = toRead;
> + }
> + }
> +
> private final byte[] uncompress(final byte[] input)
> - throws IOException
> - {
> -
> + throws IOException {
> +
> Inflater decompressor = new Inflater();
> decompressor.setInput(input);
> -
> +
> // Create an expandable byte array to hold the decompressed data
> ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
> -
> +
> // Decompress the data
> byte[] buf = new byte[1024];
> while (!decompressor.finished()) {
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/FilterIndexReader.java Fri Jun 9 18:23:22 2006
> @@ -17,6 +17,8 @@
> */
>
> import org.apache.lucene.document.Document;
> +import org.apache.lucene.document.FieldSelector;
> +
>
> import java.io.IOException;
> import java.util.Collection;
> @@ -100,7 +102,7 @@
> public int numDocs() { return in.numDocs(); }
> public int maxDoc() { return in.maxDoc(); }
>
> - public Document document(int n) throws IOException { return in.document(n); }
> + public Document document(int n, FieldSelector fieldSelector) throws IOException { return in.document(n, fieldSelector); }
>
> public boolean isDeleted(int n) { return in.isDeleted(n); }
> public boolean hasDeletions() { return in.hasDeletions(); }
> @@ -133,7 +135,7 @@
> protected void doCommit() throws IOException { in.commit(); }
> protected void doClose() throws IOException { in.close(); }
>
> -
> +
> public Collection getFieldNames(IndexReader.FieldOption fieldNames) {
> return in.getFieldNames(fieldNames);
> }
>
> Modified: lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java
> URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java?rev=413201&r1=413200&r2=413201&view=diff
> ==============================================================================
> --- lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java (original)
> +++ lucene/java/trunk/src/java/org/apache/lucene/index/IndexModifier.java Fri Jun 9 18:23:22 2006
> @@ -273,7 +273,7 @@
> }
> }
>
> -
> +
> /**
> * Returns the number of documents currently in this index.
> * @see IndexWriter#docCount()
> @@ -407,7 +407,7 @@
> * the number of files open in a FSDirectory.
> *
> * <p>The default value is 10.
> - *
> + *
> * @see IndexWriter#setMaxBufferedDocs(int)
> * @throws IllegalStateException if the index is closed
> * @throws IllegalArgumentException if maxBufferedDocs is smaller than 2
> @@ -500,8 +500,8 @@
> // create an index in /tmp/index, overwriting an existing one:
> IndexModifier indexModifier = new IndexModifier("/tmp/index", analyzer, true);
> Document doc = new Document();
> - doc.add(new Field("id", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));
> - doc.add(new Field("body", "a simple test", Field.Store.YES, Field.Index.TOKENIZED));
> + doc.add(new Fieldable("id", "1", Fieldable.Store.YES, Fieldable.Index.UN_TOKENIZED));
> + doc.add(new Fieldable("body", "a simple test", Fieldable.Store.YES, Fieldable.Index.TOKENIZED));
> indexModifier.addDocument(doc);
> int deleted = indexModifier.delete(new Term("id", "1"));
> System.out.println("Deleted " + deleted + " document");
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
--
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886