lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr indexing plugin: skip single faulty document?
Date Sun, 23 Oct 2011 18:01:38 GMT
Some work has been done in this general area, see SOLR-445. That
might give you some pointers....

Best
Erick

On Mon, Oct 17, 2011 at 11:00 AM, samuele.mattiuzzo <samumatt@gmail.com> wrote:
> Hi all, as far as I know, when Solr finds a faulty document (inside an XML
> file containing, let's say, 1000 docs) it skips the whole file and the
> indexing process exits with an exception (am I correct?)
>
> I'm using a custom indexing plugin, and I can trap the exception. Instead of
> falling back to "default" values when that exception is raised, I would like
> to skip the document raising the error (example: sometimes I try to insert a
> string into a "string" field, but Solr exits saying it's expecting a
> multiValued field... I guess it's because of some ASCII chars within the
> text, something like \n or similar), maybe log it somewhere, and move on to
> the next one. We're indexing millions of them, and we don't care much if we
> lose 10-20% of them, so the best solution is to skip the single faulty doc
> and continue with the rest.
>
> I guess I have to work on the super.processAdd() call, but I don't know
> where I can find info about it. Can anybody help me? Is there a book covering
> advanced Solr plugin development I could read?
>
> Thanks!
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-plugin-skip-single-faulty-document-tp3427646p3427646.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
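
The skip-and-continue behaviour asked about above comes down to wrapping the delegating call in a try/catch, recording the failure, and returning normally so the next document is still processed. Here is a minimal sketch of that pattern, not a drop-in Solr plugin: `AddHandler`, `SkippingHandler` and `skippedIds` are hypothetical stand-ins so the example runs without Solr on the classpath. In a real update processor the same try/catch would presumably go around `super.processAdd(cmd)` in a subclass of `UpdateRequestProcessor`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SkipFaultyDocs {

    /** Hypothetical stand-in for "the rest of the update chain"; not Solr's API. */
    interface AddHandler {
        void processAdd(Map<String, Object> doc) throws IOException;
    }

    /** Delegates each document to the next handler; logs and skips any that fail. */
    static class SkippingHandler implements AddHandler {
        private final AddHandler next;
        /** Ids of the documents that were dropped, kept for later inspection. */
        final List<Object> skippedIds = new ArrayList<>();

        SkippingHandler(AddHandler next) {
            this.next = next;
        }

        @Override
        public void processAdd(Map<String, Object> doc) {
            try {
                // In a Solr plugin this would be the super.processAdd(cmd) call.
                next.processAdd(doc);
            } catch (IOException | RuntimeException e) {
                // Swallow the failure for this one document only, then carry on.
                skippedIds.add(doc.get("id"));
                System.err.println("Skipping document " + doc.get("id")
                        + ": " + e.getMessage());
            }
        }
    }
}
```

Catching `RuntimeException` as well as `IOException` is deliberate here, on the assumption that schema and field errors surface as unchecked exceptions. Whether a document that failed partway down the chain leaves any partial state behind depends on the processors involved, so it is worth testing against the actual setup.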
