lucene-dev mailing list archives

From Yonik Seeley <ysee...@yahoo.com>
Subject RE: new Document class using collection framework
Date Sun, 28 Nov 2004 18:55:31 GMT
--- Chuck Williams <chuck@manawiz.com> wrote:
> And you can't extend
> ArrayList<Field>
> since this would require Java 5 to work.

Yes, you're right.  I realized after I posted that one
wouldn't be able to get the "for (Field f: doc)" to
work without breaking Java 1.4.
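
To make the constraint concrete: the enhanced for loop only
compiles against arrays or types that implement
java.lang.Iterable, and that interface first appeared in
Java 5.  A minimal sketch, assuming a stand-in Field class and
a hypothetical IterableDocument (neither is Lucene's API):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for org.apache.lucene.document.Field; only name and value are modeled.
class Field {
    private final String name;
    private final String value;
    Field(String name, String value) { this.name = name; this.value = value; }
    String name() { return name; }
    String stringValue() { return value; }
}

// Hypothetical document type. Implementing Iterable<Field> is what enables
// "for (Field f : doc)", and Iterable itself requires Java 5 or later.
class IterableDocument implements Iterable<Field> {
    private final List<Field> fields = new ArrayList<Field>();

    void add(Field f) { fields.add(f); }

    public Iterator<Field> iterator() { return fields.iterator(); }
}

class IterableDemo {
    // Joins all field names with commas using the enhanced for loop.
    static String names(IterableDocument doc) {
        StringBuilder sb = new StringBuilder();
        for (Field f : doc) {
            if (sb.length() > 0) sb.append(',');
            sb.append(f.name());
        }
        return sb.toString();
    }
}
```

On a 1.4 compiler neither the Iterable interface nor the
for-each syntax exists, which is why this cannot be added
without dropping 1.4 support.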

My main motivation was the lack of an efficient way to
check for multivalued fields (a field that appears more
than once).  To do that, I needed to get a Field[]
efficiently, and I couldn't even do that.

Even without being able to use generics, it would
still be nice to have Document implement standard
collection interfaces.  The Document class could
always be transparently upgraded to use generics at a
later date when Java 5 becomes more standard.

Here is the (ugly) code I currently need to handle
multivalued fields:

        // Copy the document's fields into a list so they can be sorted.
        Document doc = hits.doc(recNum);
        Enumeration ee = doc.fields();
        ArrayList lst = new ArrayList();
        while (ee.hasMoreElements()) {
          Field ff = (Field) ee.nextElement();
          lst.add(ff);
        }
        // fieldComparator (not shown) is assumed to order fields by
        // name, so fields with equal names end up adjacent.
        Collections.sort(lst, fieldComparator);

        int sz = lst.size();
        int fidx1 = 0, fidx2 = 0;
        while (fidx1 < sz) {
          Field f1 = (Field)lst.get(fidx1);
          String fname = f1.name();

          // Advance fidx2 past the run of fields named fname.
          fidx2 = fidx1+1;
          while (fidx2 < sz &&
                 fname.equals(((Field)lst.get(fidx2)).name())) {
            fidx2++;
          }
          if (fidx1+1 == fidx2) {
            // handle single field value
            [...]
          } else {
            // multiple fields with same name detected
            for (int i=fidx1; i<fidx2; i++) {
              [...]
            }
          }
          fidx1 = fidx2;
        }
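
For reference, the same sort-and-scan idea as a self-contained
sketch.  It assumes Java 5 generics, a stand-in Field class, and
a hypothetical BY_NAME comparator (the fieldComparator above is
not shown in this message, so ordering by name is an assumption):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Stand-in for org.apache.lucene.document.Field; only name and value are modeled.
class Field {
    private final String name;
    private final String value;
    Field(String name, String value) { this.name = name; this.value = value; }
    String name() { return name; }
    String stringValue() { return value; }
}

class MultiValuedScan {
    // Hypothetical comparator: orders fields by name only. Collections.sort is
    // stable, so fields with equal names keep their original relative order.
    static final Comparator<Field> BY_NAME = new Comparator<Field>() {
        public int compare(Field a, Field b) {
            return a.name().compareTo(b.name());
        }
    };

    // Returns the names that occur more than once, in sorted order.
    static List<String> multiValuedNames(List<Field> fields) {
        List<Field> sorted = new ArrayList<Field>(fields);
        Collections.sort(sorted, BY_NAME);

        List<String> result = new ArrayList<String>();
        int start = 0;
        while (start < sorted.size()) {
            String name = sorted.get(start).name();
            // Advance end past the run of fields sharing this name.
            int end = start + 1;
            while (end < sorted.size() && name.equals(sorted.get(end).name())) {
                end++;
            }
            if (end - start > 1) {
                result.add(name);
            }
            start = end;
        }
        return result;
    }
}
```

The copy and the sort are the overhead that direct access to a
Field[] would reduce: the copy loop and the casts go away,
though the sort itself remains.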


Even if people decide it's not a good idea to
implement collection interfaces such as List, I think
Document needs some methods to enable more efficient
usage.

definitely needed:
  Field[] toArray(Field[] target);
  int size();

maybe needed:
  Document(Field[] fields);
  Document(Field[] fields, int start, int len);

might be useful:
  Field[] getFields(String name, Field[] target);
  // null-terminate the array
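
A rough sketch of how these accessors might behave, using a
stand-in Field class and a hypothetical SketchDocument rather
than the real Document.  What happens when target is too small
is not specified above; allocating a new array, as
List.toArray does, is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.lucene.document.Field; only name and value are modeled.
class Field {
    private final String name;
    private final String value;
    Field(String name, String value) { this.name = name; this.value = value; }
    String name() { return name; }
    String stringValue() { return value; }
}

// Hypothetical sketch of the proposed accessors; not the actual Lucene Document.
class SketchDocument {
    private final List<Field> fields = new ArrayList<Field>();

    void add(Field f) { fields.add(f); }

    int size() { return fields.size(); }

    // Copies all fields into target when it is large enough, otherwise into a
    // new array of exactly size() elements. A larger target gets a null placed
    // right after the last field, as List.toArray(T[]) specifies.
    Field[] toArray(Field[] target) {
        return fields.toArray(target);
    }

    // Copies the fields named `name` into target and null-terminates when
    // there is spare room. Assumption: a too-small target is replaced by a
    // new array sized to the number of matches.
    Field[] getFields(String name, Field[] target) {
        int count = 0;
        for (Field f : fields) {
            if (f.name().equals(name)) count++;
        }
        Field[] out = target.length >= count ? target : new Field[count];
        int i = 0;
        for (Field f : fields) {
            if (f.name().equals(name)) out[i++] = f;
        }
        if (i < out.length) out[i] = null;
        return out;
    }
}
```

Passing in a reusable target array lets a caller process many
documents without allocating a new Field[] for each one.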

Also, why is Vector used internally?  The
synchronization isn't needed, is it?  A Field[] would
be the most efficient, as it would remove the need for
dynamic casts.
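
To illustrate the point about casts: in pre-generics code,
every read from a Vector needs an explicit (Field) cast,
whereas a typed array needs none.  A minimal sketch of
array-backed storage, with a stand-in Field class and a
hypothetical FieldArray (the initial capacity and doubling
growth policy are arbitrary choices, not anything Lucene does):

```java
// Stand-in for org.apache.lucene.document.Field; only name and value are modeled.
class Field {
    private final String name;
    private final String value;
    Field(String name, String value) { this.name = name; this.value = value; }
    String name() { return name; }
    String stringValue() { return value; }
}

// Hypothetical array-backed field store: unsynchronized, no casts on read.
class FieldArray {
    private Field[] items = new Field[4];
    private int count = 0;

    // Appends a field, doubling the backing array when it is full.
    void add(Field f) {
        if (count == items.length) {
            Field[] grown = new Field[items.length * 2];
            System.arraycopy(items, 0, grown, 0, count);
            items = grown;
        }
        items[count++] = f;
    }

    // Returns the field at index i; the array element is already typed.
    Field get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return items[i];
    }

    int size() { return count; }
}
```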

-Yonik

>   > -----Original Message-----
>   > From: Yonik Seeley [mailto:ysee...@yahoo.com]
>   > Sent: Wednesday, November 24, 2004 1:41 PM
>   > To: lucene-dev@jakarta.apache.org
>   > Subject: new Document class using collection framework
>   >
>   > The current document class is not that friendly if you
>   > are trying to do things efficiently...
>   >  - uses Vector, which is synchronized
>   >  - can't use Java 5 "for (Field f : doc)"
>   >  - can't get size()
>   >  - can't get all fields into a Field[]
>   >  - can't use with any generic code that works on
>   >    Collections or Lists
>   >
>   > Proposal: Document should implement the List or
>   > Collection interface
>   >
>   > Example: here is a quick example that inherits from
>   > ArrayList to quickly implement the needed List
>   > functionality.  It should be completely backward
>   > compatible.  If there is interest, I can flesh it out
>   > a bit more...
>   >
>   > -Yonik
>   >
>   > import org.apache.lucene.document.Field;
>   >
>   > import java.util.ArrayList;
>   > import java.util.Collections;
>   > import java.util.Enumeration;
>   > import java.util.Iterator;
>   > import java.util.List;
>   >
>   > public final class Document extends ArrayList implements List {
>   >     private float boost = 1.0f;
>   >
>   >     /** Constructs a new document with no fields. */
>   >     public Document() {}
>   >
>   >
>   >     // non List methods
>   >     public void setBoost(float boost) {
>   >       this.boost = boost;
>   >     }
>   >
>   >     public float getBoost() {
>   >       return boost;
>   >     }
>   >
>   >     //
>   >     // all methods after this are provided for
>   >     // backward compatibility with the old Document
>   >     //
>   >
>   >     // removes only the first field with the given name
>   >     public final void removeField(String name) {
>   >       Iterator it = iterator();
>   >       while (it.hasNext()) {
>   >         Field field = (Field)it.next();
>   >         if (field.name().equals(name)) {
>   >           it.remove();
>   >           return;
>   >         }
>   >       }
>   >     }
>   >
>   >     // removes every field with the given name
>   >     public final void removeFields(String name) {
>   >       Iterator it = iterator();
>   >       while (it.hasNext()) {
>   >         Field field = (Field)it.next();
>   >         if (field.name().equals(name)) {
>   >           it.remove();
>   >         }
>   >       }
>   >     }
>   >
>   >     // returns the first field with the given name, or null
>   >     public final Field getField(String name) {
>   >       for (int i = 0; i < size(); i++) {
>   >         Field field = (Field)get(i);
>   >         if (field.name().equals(name))
>   >           return field;
>   >       }
>   >       return null;
>   >     }
>   >
>   >
>   >     public final String get(String name) {
>   >       Field field = getField(name);
>   >       if (field != null)
>   >         return field.stringValue();
>   >       else
>   >         return null;
>   >     }
>   >
>   >     public final Enumeration fields() {
>   >       return Collections.enumeration(this);
>   >     }
>   >
>   >     // returns all fields with the given name, or null if none
>   >     public final Field[] getFields(String name) {
>   >       List result = new ArrayList();
>   >       for (int i = 0; i < size(); i++) {
>   >         Field field = (Field)get(i);
>   >         if (field.name().equals(name)) {
>   >           result.add(field);
>   >         }
>   >       }
>   >
>   >       if (result.size() == 0)
>   >         return null;
>   >
>   >       return (Field[])result.toArray(new Field[result.size()]);
>   >     }
>   >
>   >     public final String[] getValues(String name) {
>   >       Field[] namedFields = getFields(name);
>   >       if (namedFields == null)
>   >         return null;
>   >       String[] values = new String[namedFields.length];
>   >       for (int i = 0; i < namedFields.length; i++) {
>   >         values[i] = namedFields[i].stringValue();
>   >       }
>   >       return values;
>   >     }
>   >
>   >     public final String toString() {
>   >       StringBuffer buffer = new StringBuffer();
>   >       buffer.append("Document<");
>   >       for (int i = 0; i < size(); i++) {
>   >         Field field = (Field)get(i);
>   >         buffer.append(field.toString());
>   >         if (i != size()-1)
>   >           buffer.append(" ");
>   >       }
>   >       buffer.append(">");
>   >       return buffer.toString();
>   >     }
>   >
>   > }


		

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

