lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: API cleanup for Field
Date Mon, 30 Aug 2004 16:54:51 GMT
This looks much nicer.  I'm for deprecating all the Field.* factory
methods and switching to this new API fully, and including that
TV/TVector/Vector/TermVector enum akin to Store/Index inner class.
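
For what it's worth, here is a rough sketch of how such an enum might sit next to Store and Index. SketchField, TermVector and its NO/YES constants are made-up names for illustration only; they are not part of Daniel's patch or of Lucene. The validation simply mirrors the checks in the patch, with the last boolean swapped for the assumed enum.

```java
// Hypothetical sketch: SketchField is NOT Lucene's Field class.
// Store and Index mirror the patch; TermVector and its constants are assumed names.
final class SketchField {

  /** Typesafe enum in the style of the patch's Store class. */
  public static final class Store {
    private final String name;
    private Store(String name) { this.name = name; }
    @Override public String toString() { return name; }
    public static final Store YES = new Store("YES");
    public static final Store NO = new Store("NO");
  }

  /** Typesafe enum in the style of the patch's Index class. */
  public static final class Index {
    private final String name;
    private Index(String name) { this.name = name; }
    @Override public String toString() { return name; }
    public static final Index NO = new Index("NO");
    public static final Index TOKENIZED = new Index("TOKENIZED");
    public static final Index UN_TOKENIZED = new Index("UN_TOKENIZED");
  }

  /** Assumed replacement for the remaining storeTermVector boolean. */
  public static final class TermVector {
    private final String name;
    private TermVector(String name) { this.name = name; }
    @Override public String toString() { return name; }
    public static final TermVector NO = new TermVector("NO");
    public static final TermVector YES = new TermVector("YES");
  }

  private final String name;
  private final String value;
  private final boolean stored;
  private final boolean indexed;
  private final boolean tokenized;
  private final boolean termVectorStored;

  SketchField(String name, String value, Store store, Index index, TermVector termVector) {
    if (name == null)
      throw new IllegalArgumentException("name cannot be null");
    if (value == null)
      throw new IllegalArgumentException("value cannot be null");
    if (store == null || index == null || termVector == null)
      throw new IllegalArgumentException("store, index and termVector cannot be null");
    // Same two consistency checks as in the patch.
    if (index == Index.NO && termVector == TermVector.YES)
      throw new IllegalArgumentException("cannot store a term vector for fields that are not indexed");
    if (index == Index.NO && store == Store.NO)
      throw new IllegalArgumentException("a field must be indexed or stored");

    this.name = name;
    this.value = value;
    this.stored = (store == Store.YES);
    this.indexed = (index != Index.NO);
    this.tokenized = (index == Index.TOKENIZED);
    this.termVectorStored = (termVector == TermVector.YES);
  }

  public String name() { return name; }
  public String stringValue() { return value; }
  public boolean isStored() { return stored; }
  public boolean isIndexed() { return indexed; }
  public boolean isTokenized() { return tokenized; }
  public boolean isTermVectorStored() { return termVectorStored; }
}
```

With something like this, a call reads new SketchField("title", text, Store.YES, Index.TOKENIZED, TermVector.NO), so no bare boolean is left at the call site. If the position patch goes in, further constants could presumably be added to TermVector without changing the constructor signature.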

Nitpick: some exception messages start with upper case, some don't. 
Some end with a full stop, others do not.

+1

Otis

--- Daniel Naber <daniel.naber@t-online.de> wrote:

> Hi,
> 
> here's a patch to clean up the API as described by Doug:
> http://www.mail-archive.com/lucene-user%40jakarta.apache.org/msg08479.html
> 
> The Field constructor with three booleans is deprecated because it's too
> easy to mix up the order of those parameters. Also, one variation of
> Field.Text() is deprecated because it behaves differently depending on
> whether you pass it a String or a StringReader, which is very confusing.
> 
> The static Field.UnStored/Keyword etc methods are not deprecated. I think
> these can be confusing (e.g. what exactly does UnIndexed mean -- I always
> have to look it up), but nobody is forced to use them so there's no reason
> to deprecate them.
> 
> The only boolean left is the one for term vectors. Should we add another
> enumeration like TermVectorIndex.NO/YES/...? I know that there's a patch
> that adds position information to the term vectors. How does that fit in
> here?
> 
> Regards
>  Daniel
> 
> -- 
> http://www.danielnaber.de
> Index: Field.java
> ===================================================================
> RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/document/Field.java,v
> retrieving revision 1.16
> diff -u -r1.16 Field.java
> --- Field.java	17 Aug 2004 20:22:33 -0000	1.16
> +++ Field.java	27 Aug 2004 23:13:18 -0000
> @@ -40,6 +40,50 @@
>    private boolean isTokenized = true;
>  
>    private float boost = 1.0f;
> +  
> +  public static final class Store {
> +    private String name;
> +    private Store() {}
> +    private Store(String name) {
> +      this.name = name;
> +    }
> +    public String toString() {
> +      return name;
> +    }
> +    /** Store the original field value in the index. This is useful for short texts
> +     * like a document's title which would be displayed with the results. The
> +     * value is stored in its original form, i.e. no analyzer is used before it is
> +     * stored.
> +     */
> +    public static final Store YES = new Store("YES");
> +    /** Do not store the field value in the index. */
> +    public static final Store NO = new Store("NO");
> +  }
> +  
> +  public static final class Index {
> +    private String name;
> +    private Index() {}
> +    private Index(String name) {
> +      this.name = name;
> +    }
> +    public String toString() {
> +      return name;
> +    }
> +    /** Do not index the field value. This field can thus not be searched,
> +     * but one can still access its contents provided it is
> +     * {@link Field.Store stored}. */
> +    public static final Index NO = new Index("NO");
> +    /** Index the field's value so it can be searched. An Analyzer will be used
> +     * to tokenize and possibly further normalize the text before its
> +     * terms will be stored in the index. This is useful for common text.
> +     */
> +    public static final Index TOKENIZED = new Index("TOKENIZED");
> +    /** Index the field's value without using an Analyzer, so it can be searched.
> +     * As no analyzer is used the value will be stored as a single term. This is
> +     * useful for unique Ids like product numbers.
> +     */
> +    public static final Index UN_TOKENIZED = new Index("UN_TOKENIZED");
> +  }
>  
>    /** Sets the boost factor hits on this field.  This value will be
>     * multiplied into the score of all hits on this this field of
> this
> @@ -91,7 +135,9 @@
>  
>    /** Constructs a String-valued Field that is tokenized and indexed,
>      and is stored in the index, for return with hits.  Useful for short text
> -    fields, like "title" or "subject". Term vector will not be stored for this field. */
> +    fields, like "title" or "subject". Term vector will not be stored for this field.
> +  @deprecated use {@link #Field(String, String, Field.Store, Field.Index)
> +    Field(name, value, Field.Store.YES, Field.Index.TOKENIZED)} instead */
>    public static final Field Text(String name, String value) {
>      return Text(name, value, false);
>    }
> @@ -104,7 +150,9 @@
>  
>    /** Constructs a String-valued Field that is tokenized and indexed,
>      and is stored in the index, for return with hits.  Useful for short text
> -    fields, like "title" or "subject". */
> +    fields, like "title" or "subject".
> +    @deprecated use {@link #Field(String, String, Field.Store, Field.Index, boolean)
> +      Field(name, value, Field.Store.YES, Field.Index.TOKENIZED, boolean)} instead */
>    public static final Field Text(String name, String value, boolean storeTermVector) {
>      return new Field(name, value, true, true, true, storeTermVector);
>    }
>    }
> @@ -152,6 +200,63 @@
>    /** Create a field by specifying all parameters except for <code>storeTermVector</code>,
>     *  which is set to <code>false</code>.
>     */
> +  public Field(String name, String string, Store store, Index index) {
> +    this(name, string, store, index, false);
> +  }
> +
> +  /**
> +   * Create a field by specifying its name, value and how it will
> +   * be saved in the index.
> +   * 
> +   * @param name The name of the field
> +   * @param string The string to process
> +   * @param store whether <code>string</code> should be stored in the index
> +   * @param index whether the field should be indexed, and if so, if it should
> +   *  be tokenized before indexing
> +   * @param storeTermVector true if we should store the Term Vector info
> +   * @throws IllegalArgumentException if <code>storeTermVector</code> is
> +   *  <code>true</code> but the field is not indexed or if the field
> +   *  is neither stored nor indexed
> +   */
> +  public Field(String name, String string, Store store, Index index, boolean storeTermVector) {
> +      if (name == null)
> +        throw new IllegalArgumentException("name cannot be null");
> +      if (string == null)
> +        throw new IllegalArgumentException("value cannot be null");
> +      if (index == Index.NO && storeTermVector)
> +        throw new IllegalArgumentException("cannot store a term vector for fields that are not indexed.");
> +      if (index == Index.NO && store == Store.NO)
> +        throw new IllegalArgumentException("it doesn't make sense to have a field that "
> +            + "is neither indexed nor stored");
> +
> +      this.name = name.intern();        // field names are interned
> +      this.stringValue = string;
> +      if (store == Store.YES)
> +        this.isStored = true;
> +      else if (store == Store.NO)
> +        this.isStored = false;
> +      else
> +        throw new IllegalArgumentException("Unknown store parameter " + store);
> +      
> +      if (index == Index.NO) {
> +        this.isIndexed = false;
> +        this.isTokenized = false;
> +      } else if (index == Index.TOKENIZED) {
> +        this.isIndexed = true;
> +        this.isTokenized = true;
> +      } else if (index == Index.UN_TOKENIZED) {
> +        this.isIndexed = true;
> +        this.isTokenized = false;
> +      } else {
> +        throw new IllegalArgumentException("Unknown index parameter " + index);
> +      }
> +      this.storeTermVector = storeTermVector;
> +  }
> +
> +  /** Create a field by specifying all parameters except for <code>storeTermVector</code>,
> +   *  which is set to <code>false</code>.
> +   * @deprecated use {@link #Field(String, String, Field.Store, Field.Index)} instead
> +   */
>    public Field(String name, String string,
>  	       boolean store, boolean index, boolean token) {
>      this(name, string, store, index, token, false);
> @@ -165,6 +270,7 @@
>     * @param index true if the field should be indexed
>     * @param token true if the field should be tokenized
>     * @param storeTermVector true if we should store the Term Vector info
> +   * @deprecated use {@link #Field(String, String, Field.Store, Field.Index, boolean)} instead
>     */ 
>    public Field(String name, String string,
>  	       boolean store, boolean index, boolean token, boolean storeTermVector) {
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
