lucene-solr-dev mailing list archives

From "Ryan McKinley" <ryan...@gmail.com>
Subject Re: loading many documents by ID
Date Thu, 01 Feb 2007 21:26:42 GMT
>
> Not sure... depends on how update handlers will use it...

By update handler, do you mean the UpdateRequestHandler(s) or the
UpdateHandler?

> One thing we might not want to get rid of though is streaming
> (constructing and adding a document, then discarding it).  People are
> starting to add a lot of documents in a single XML request, and this
> will be much larger for CSV/SQL.
>

So you are uncomfortable with the Collection because you would have to
load all the documents before indexing them.  With a large number of
documents, that could be a problem...

If UpdateHandler is going to take care of stuff like autocommit and
modifying documents, it seems best to have that apply to all the
documents you are going to modify as a unit.  For example, say I have
a SQL updater that will modify 100,000 documents, incrementing field
'count_*' and replacing 'fl_*'.  If the DocumentCommand only applies
to a single document, it would have to match the field patterns again
for every document as it went along, rather than once when it starts.

How about: Iterable<SolrDocument>

This way, an UpdateRequestHandler can start the UpdateHandler running
while it streams each document from XML/CSV/SQL.
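
Roughly what I have in mind, as a sketch only.  None of this is
existing Solr code: the SolrDocument and DocumentCommand classes below
are hypothetical stand-ins, and streamFrom / addDocuments are made-up
names.  The point is just that the source builds one document per
next() call, and the command is set up once for the whole batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StreamingUpdateSketch {

    // Hypothetical stand-in for a SolrDocument: just a map of fields.
    static final class SolrDocument {
        final Map<String, Object> fields = new LinkedHashMap<>();
    }

    // Hypothetical command, configured once and applied to every document.
    static final class DocumentCommand {
        final String incrementPrefix;

        DocumentCommand(String incrementPrefix) {
            this.incrementPrefix = incrementPrefix;
        }

        // Increment every integer field whose name starts with the prefix.
        void apply(SolrDocument doc) {
            for (Map.Entry<String, Object> e : doc.fields.entrySet()) {
                if (e.getKey().startsWith(incrementPrefix)
                        && e.getValue() instanceof Integer) {
                    e.setValue((Integer) e.getValue() + 1);
                }
            }
        }
    }

    // Lazily turns "id,count" lines into documents, one per next() call,
    // so the full set of documents is never held in memory.
    // The returned Iterable is single-pass because it wraps one Iterator.
    static Iterable<SolrDocument> streamFrom(final Iterator<String> lines) {
        return () -> new Iterator<SolrDocument>() {
            public boolean hasNext() {
                return lines.hasNext();
            }

            public SolrDocument next() {
                String[] parts = lines.next().split(",", 2);
                SolrDocument doc = new SolrDocument();
                doc.fields.put("id", parts[0].trim());
                doc.fields.put("count_views", Integer.valueOf(parts[1].trim()));
                return doc;
            }
        };
    }

    // Stand-in for the UpdateHandler side: pulls documents one at a time,
    // applies the command, "indexes" the result, then lets the doc go.
    static int addDocuments(Iterable<SolrDocument> docs,
                            DocumentCommand cmd,
                            List<String> indexed) {
        int n = 0;
        for (SolrDocument doc : docs) {
            cmd.apply(doc);
            indexed.add(doc.fields.get("id") + "=" + doc.fields.get("count_views"));
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Iterator<String> lines = Arrays.asList("a,1", "b,41").iterator();
        List<String> indexed = new ArrayList<>();
        int n = addDocuments(streamFrom(lines), new DocumentCommand("count_"), indexed);
        System.out.println(n + " docs: " + indexed);
    }
}
```

A Collection would still work here since it is Iterable, so callers
that already have everything in memory lose nothing.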

ryan
