lucene-solr-user mailing list archives

From Mikhail Khludnev <mkhlud...@griddynamics.com>
Subject Re: update fails if one doc is wrong
Date Wed, 27 Feb 2013 10:32:42 GMT
Colleagues,

Here are my considerations

If the exception occurs somewhere in an update processor, we can add a
special update processor at the head of the update processor chain that
catches exceptions from the delegated processAdd call and logs and/or
swallows them.
If that fits the purpose, we can try to figure out how to return the failed
doc ids to the client. I'm not sure, but I think it's possible, because the
response writer is quite -dumb- flexible, i.e. if an update processor
puts something into the response, it should be streamed back to the
client as-is.
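
To make the idea above concrete, here is a minimal sketch of the delegation pattern. It is not Solr code: real update processors are Java classes, and every class name, the "price" field and the "failed" response key below are hypothetical stand-ins chosen for illustration only.

```python
# Illustrative stand-ins only; real Solr update processors are Java classes.
class UpdateProcessor:
    """Minimal stand-in for one link in an update processor chain."""

    def __init__(self, next_processor=None):
        self.next = next_processor

    def process_add(self, doc):
        # Default behaviour: delegate to the next processor, if any.
        if self.next is not None:
            self.next.process_add(doc)


class NumericFieldValidator(UpdateProcessor):
    """Hypothetical downstream processor that rejects a non-numeric 'price'."""

    def process_add(self, doc):
        float(doc["price"])  # raises ValueError on bad input
        super().process_add(doc)


class IndexingProcessor(UpdateProcessor):
    """Hypothetical end of the chain: remembers the ids it accepted."""

    def __init__(self):
        super().__init__(None)
        self.indexed_ids = []

    def process_add(self, doc):
        self.indexed_ids.append(doc["id"])


class TolerantProcessor(UpdateProcessor):
    """Sits at the head of the chain and swallows per-document failures."""

    def __init__(self, next_processor, response):
        super().__init__(next_processor)
        self.response = response

    def process_add(self, doc):
        try:
            super().process_add(doc)
        except Exception as exc:
            # Record the failed id in the response instead of aborting the batch.
            self.response.setdefault("failed", []).append(
                {"id": doc.get("id"), "error": str(exc)}
            )


response = {}
index = IndexingProcessor()
chain = TolerantProcessor(NumericFieldValidator(index), response)
for doc in [{"id": "1", "price": "9.99"}, {"id": "2", "price": "cheap"}]:
    chain.process_add(doc)
```

After the loop, the good document has reached the end of the chain and the bad one is listed under the "failed" key of the response, which is the part that would have to be streamed back to the client.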

One more consideration.
Anirudha,
When you say "re-try them" do you mean to post a failed doc one more time?
It seems I didn't get your point. Please clarify.
On 27.02.2013 at 1:13, "Anirudha Jadhav" <anirudha@nyu.edu> wrote:

> Ideally you would want to use SolrJ or another interface that can catch
> exceptions/errors and retry the failed requests.
>
>
> On Tue, Feb 26, 2013 at 3:45 PM, Walter Underwood <wunder@wunderwood.org> wrote:
>
> > I've done exactly the same thing. On error, set the batch size to one and
> > try again.
> >
> > wunder
> >
> > On Feb 26, 2013, at 12:27 PM, Timothy Potter wrote:
> >
> > > Here's what I do to work around failures when processing batches of
> > > updates:
> > >
> > > On client side, catch the exception that the batch failed. In the
> > > exception handler, switch to one-by-one mode for the failed batch
> > > only.
> > >
> > > This allows you to isolate the *bad* documents as well as getting the
> > > *good* documents in the batch indexed in Solr.
> > >
> > > This assumes most batches work so you only pay the one-by-one penalty
> > > for the occasional batch with a bad doc.
> > >
> > > Tim
> > >
> > > On Tue, Feb 26, 2013 at 12:08 PM, Isaac Hebsh <isaac.hebsh@gmail.com> wrote:
> > >> Hi.
> > >>
> > >> I add documents to Solr by POSTing them to the UpdateHandler, as bulks of
> > >> <add> commands (DIH is not used).
> > >>
> > >> If one document contains any invalid data (e.g. string data in a numeric
> > >> field), Solr returns HTTP 400 Bad Request, and the whole bulk fails.
> > >>
> > >> I'm searching for a way to tell Solr to accept the rest of the documents...
> > >> (I'll use RealTimeGet to determine which documents were added).
> > >>
> > >> If there is no standard way of doing it, maybe it can be implemented by
> > >> splitting the <add> commands into separate HTTP POSTs. Since I am using
> > >> auto-soft-commit, can I say that it is almost equivalent? What is the
> > >> performance penalty of 100 POST requests (of 1 document each) against 1
> > >> request of 100 docs, if a soft commit is eventually done?
> > >>
> > >> Thanks in advance...
> >
> > --
> > Walter Underwood
> > wunder@wunderwood.org
> >
> >
> >
> >
>
>
> --
> Anirudha P. Jadhav
>
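
The client-side workaround described in the quoted messages (when a batch fails, resend only that batch one document at a time) can be sketched as follows. This is a minimal sketch under stated assumptions: send_batch is a hypothetical caller-supplied function that posts a list of documents and raises an exception when the server rejects the request, and fake_send stands in for a real HTTP or SolrJ call.

```python
def index_with_fallback(docs, send_batch):
    """Send docs as one batch; on failure, resend them one at a time.

    send_batch is a caller-supplied (hypothetical) function that posts a list
    of documents and raises an exception if the request is rejected.
    Returns (indexed, failed); failed holds (doc, error message) pairs.
    """
    try:
        send_batch(docs)
        return list(docs), []
    except Exception:
        pass  # fall through to one-by-one mode for this batch only

    indexed, failed = [], []
    for doc in docs:
        try:
            send_batch([doc])
            indexed.append(doc)
        except Exception as exc:
            failed.append((doc, str(exc)))
    return indexed, failed


def fake_send(batch):
    """Demo sender: rejects any request containing a non-numeric 'price'."""
    for doc in batch:
        float(doc["price"])


docs = [
    {"id": "1", "price": "1.5"},
    {"id": "2", "price": "oops"},
    {"id": "3", "price": "7"},
]
indexed, failed = index_with_fallback(docs, fake_send)
```

Only the batch that failed pays the one-request-per-document cost, and the failed list isolates the bad documents for logging or later inspection.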
