lucene-solr-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Update Plugins (was Re: Handling disparate data sources in Solr)
Date Mon, 15 Jan 2007 20:42:40 GMT

: >   SolrRequestHandler => SolrUpdateHandler
: >   SolrQueryRequest => SolrUpdateRequest
: >   SolrQueryResponse => SolrUpdateResponse (possibly the same class)
: >   QueryResponseWriter => UpdateResponseWriter (possibly the same class)
: >
:
: Is there any reason the plugin system needs a different RequestObject
: for Query vs Update?

as i said: only to the extent that Updates tend to have streams of data
that queries don't need (as far as i can imagine)

: SolrRequest would be the current SolrQueryRequest augmented with the
: HTTP method type and a way to get the raw post stream.

the raw POST stream may not be where the data is though -- consider the
file upload case, or the reading from a local file case, or the reading
from a list of remote URLs specified in params.

: I'm not sure the nitty gritty, but it should be as close to
: HttpServletRequest as possible.  If possible, I think handlers should
: choose how to handle the stream.
:
: If it is a remote resource, I think it's the handler's job to open the stream.

i disagree ... it should be possible to create "micro-plugins" (I think i
called them "UpdateSource" instances in my original suggestion) that know
about getting streams in various ways, but don't care what format of data
is found on those streams -- that would be left for the
(Update)RequestHandler (which wouldn't need to know where the data came
from)

a JDBC/SQL updater would probably be a very special case -- where the
format and the stream are inherently related -- in which case a No-Op
UpdateSource could be used that didn't provide any stream, and the
JdbcUpdateRequestHandler would manage its JDBC streams directly.
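
A rough sketch of that split, to make the idea concrete. Every name here
(UpdateSource, ParamBodySource, NoOpSource, UpdateHandler, LineHandler, and
the "body" param) is hypothetical and not an existing Solr API; a plain Map
stands in for SolrParams. The source only knows how to get a stream, the
handler only knows the format, and the No-Op source hands back nothing so a
JDBC-style handler could fetch its own data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;

public class UpdateSourceSketch {

    /** Hypothetical micro-plugin: knows where the bytes come from, not what they mean. */
    interface UpdateSource {
        Optional<InputStream> open(Map<String, String> params);
    }

    /** Reads the payload from a "body" param (stand-in for POST body, file upload, remote URL...). */
    static class ParamBodySource implements UpdateSource {
        public Optional<InputStream> open(Map<String, String> params) {
            String body = params.getOrDefault("body", "");
            return Optional.of(new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)));
        }
    }

    /** No-Op source for handlers (e.g. a JDBC one) that manage their own data access. */
    static class NoOpSource implements UpdateSource {
        public Optional<InputStream> open(Map<String, String> params) {
            return Optional.empty();
        }
    }

    /** Hypothetical handler: understands a data format, not where the data came from. */
    interface UpdateHandler {
        int handle(Optional<InputStream> data, Map<String, String> params);
    }

    /** Toy format: one "document" per non-empty line; returns how many it saw. */
    static class LineHandler implements UpdateHandler {
        public int handle(Optional<InputStream> data, Map<String, String> params) {
            if (!data.isPresent()) {
                return 0; // no stream supplied; a JDBC-style handler would query here instead
            }
            try (InputStream in = data.get()) {
                String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                int count = 0;
                for (String line : text.split("\n")) {
                    if (!line.trim().isEmpty()) {
                        count++;
                    }
                }
                return count;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Glue that a servlet or dispatcher might provide: pair any source with any handler. */
    static int run(UpdateSource source, UpdateHandler handler, Map<String, String> params) {
        return handler.handle(source.open(params), params);
    }

    public static void main(String[] args) {
        Map<String, String> params = Map.of("body", "doc1\ndoc2\n");
        System.out.println(run(new ParamBodySource(), new LineHandler(), params));
        System.out.println(run(new NoOpSource(), new LineHandler(), params));
    }
}
```

Swapping ParamBodySource for a file or URL based source would not touch
LineHandler at all, which is the point of keeping the two apart.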

: Likewise I don't see anything in QueryResponseWriter that should tie
: it to 'Query.'  Could it just be ResponseWriter?

probably -- as i said, both it and SolrQueryResponse could probably be
reused, the only hitch is that their names might be confusing (we could
always refactor all of their guts into super classes, and deprecate the
existing classes)

: While we are at it... is there any reason (for or against) exposing
: other parts of the HttpServletRequest to SolrRequestHandlers?

the biggest one is Unit testing -- giving plugins very simple APIs that
don't require a lot of knowledge about external APIs makes it much easier
to test them.  it also helps make it possible for us to "future proof"
plugins.  other messages in this thread have discussed the possibility of
changing the URL structure, supporting more restful URLs and things like
that ... if we currently exposed lots of info from the HttpServletRequest
in the SolrQueryRequest, then making changes like that in a backwards
compatible way would be nearly impossible.  As it stands, we can write a
new Servlet that deals with input *completely* differently from the
current URL structure, and be 99% certain that existing plugins will
continue to work.
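
To illustrate the testing point with a toy example: the Params interface and
EchoHandler below are made up for this sketch (they are not the real
SolrParams or a real handler). Because the plugin only sees a tiny
"get a named param" API, a test can feed it a Map or a lambda, with no
servlet container and no mock HttpServletRequest.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleApiTestSketch {

    /** Hypothetical minimal request view: named parameters only, no servlet types. */
    interface Params {
        String get(String name);
    }

    /** Hypothetical plugin that depends on nothing but Params. */
    static class EchoHandler {
        String handle(Params params) {
            String q = params.get("q");
            return q == null ? "missing q" : "query=" + q;
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("q", "solr");

        EchoHandler handler = new EchoHandler();
        // A Map lookup is enough to drive the plugin in a test.
        System.out.println(handler.handle(values::get));
        // So is a lambda that simulates a request with no params at all.
        System.out.println(handler.handle(name -> null));
    }
}
```

A different front end (new URL structure, RESTful paths, whatever) would only
need to produce a Params, and the handler would not notice the change.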

: While it is not the focus of solr, someone (including me!) may want to
: implement some more complex authentication scheme - Perhaps setting a
: field on each document saying who added it and from what IP.
:
: stuff to consider: cookies, headers, remoteUser, remoteHost...

all of that could conceivably be done by changing the servlet to add
that info into the SolrParams.
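
Something along these lines, as a sketch only. Plain Maps stand in for
SolrParams, and the "_req." key prefix and the withRequestInfo helper are
invented for illustration. In a real servlet the values would presumably
come from HttpServletRequest (getRemoteUser(), getRemoteAddr(), getHeader());
they are passed in as plain arguments here so the example has no servlet
dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestInfoParamsSketch {

    /**
     * Returns a copy of the params with request metadata added under a
     * hypothetical "_req." prefix, so handlers can read it like any other param.
     */
    static Map<String, String> withRequestInfo(Map<String, String> params,
                                               String remoteUser,
                                               String remoteAddr,
                                               Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>(params);
        if (remoteUser != null) {
            out.put("_req.remoteUser", remoteUser);
        }
        if (remoteAddr != null) {
            out.put("_req.remoteAddr", remoteAddr);
        }
        for (Map.Entry<String, String> header : headers.entrySet()) {
            out.put("_req.header." + header.getKey(), header.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> params = Map.of("q", "solr");
        Map<String, String> headers = Map.of("User-Agent", "test-client");
        Map<String, String> merged = withRequestInfo(params, "alice", "10.0.0.5", headers);

        // A handler could now stamp each added document with who sent it and from where.
        System.out.println(merged.get("_req.remoteUser"));
        System.out.println(merged.get("_req.remoteAddr"));
        System.out.println(merged.get("_req.header.User-Agent"));
    }
}
```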



-Hoss

