lucene-solr-dev mailing list archives

From "Ryan McKinley" <>
Subject Re: loading many documents by ID
Date Thu, 01 Feb 2007 20:46:29 GMT
> What does "distinct" mean in this context?

I am (was?) using DISTINCT to mean "only add unique values to each
field".  As implemented, it keeps a Collection<String> for each field name.  If
the 'mode' is 'DISTINCT' the collection is Set<String>, otherwise

> There is a lot of processing going on inside Document Builder.
> Once you get to the UpdateCommand, you have already lost some
> information (copyFields have executed, some things have been converted
> to index form, etc).

I noticed that!  It made sense when I was implementing this in a
RequestHandler, but it gets a little wonky inside the UpdateHandler -
as you said, copyFields already executed.

I think the best thing is to make a new command that does not directly
take a lucene document as its input.  perhaps:

Then the UpdateHandler would open the DocumentBuilder and merge the
existing document with the passed-in document using whatever method

> I would think one would also want to specify things per field.
> - append this value to this field
> - increment the value of this field
> - append this value to this field
> - overwrite this field

How would you feel about an interface like this:

public class IndexDocumentsCommand {
  public enum MODE {
    APPEND,    // add the fields to existing fields
    OVERWRITE, // overwrite existing fields
    INCREMENT, // increment existing field
    DISTINCT   // same as APPEND, but make sure there are distinct values
  }

  // optional id in "internal" indexed form... if it is needed and not supplied,
  // it will be obtained from the doc.
  public String indexedId;

  public Collection<SolrDocument> docs;
  public boolean allowDups;
  public boolean overwrite;
  // What to do for each field.  We should support *
  public SimpleOrderedMap<MODE> modifyFieldMode;
  // make sure these documents are committed within this much time
  public int commitMaxTime = -1;
}
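
To make the per-field modes concrete, here is a rough sketch of how a merge
step might apply them. Everything in it is hypothetical: FieldMergeSketch,
Mode and merge() are illustration names, not Solr classes, and plain maps of
strings stand in for documents. It also assumes that "*" is the wildcard
entry, that OVERWRITE is the fallback when no mode is given, and that
INCREMENT treats the field as a single integer counter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names are Solr APIs.
class FieldMergeSketch {
    enum Mode { APPEND, OVERWRITE, INCREMENT, DISTINCT }

    // Merges 'incoming' field values into a copy of 'existing',
    // choosing the behavior per field from 'modes'.
    static Map<String, List<String>> merge(
            Map<String, List<String>> existing,
            Map<String, List<String>> incoming,
            Map<String, Mode> modes) {
        // Copy the existing document so the inputs are left untouched.
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : existing.entrySet()) {
            result.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        // Assumption: "*" is the wildcard entry; OVERWRITE if it is absent.
        Mode fallback = modes.getOrDefault("*", Mode.OVERWRITE);
        for (Map.Entry<String, List<String>> e : incoming.entrySet()) {
            List<String> added = e.getValue();
            List<String> current =
                    result.computeIfAbsent(e.getKey(), k -> new ArrayList<>());
            switch (modes.getOrDefault(e.getKey(), fallback)) {
                case APPEND:
                    current.addAll(added);
                    break;
                case OVERWRITE:
                    current.clear();
                    current.addAll(added);
                    break;
                case DISTINCT: {
                    // Same as APPEND, but drop duplicates, keeping first-seen order.
                    LinkedHashSet<String> unique = new LinkedHashSet<>(current);
                    unique.addAll(added);
                    current.clear();
                    current.addAll(unique);
                    break;
                }
                case INCREMENT: {
                    // Assumption: the field is one integer counter and the
                    // incoming values are deltas to add to it.
                    long sum = 0;
                    for (String v : current) sum += Long.parseLong(v);
                    for (String v : added) sum += Long.parseLong(v);
                    current.clear();
                    current.add(Long.toString(sum));
                    break;
                }
            }
        }
        return result;
    }
}
```

With modes of tag=DISTINCT, views=INCREMENT and *=OVERWRITE, merging
{tag=[b, c], views=[2], title=[new]} into {tag=[a, b], views=[5], title=[old]}
would leave tag with a, b, c, views with 7, and title with new. A real
implementation would have to work on schema field types rather than strings,
which is exactly where the copyField and index-form questions come back in.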

