lucene-solr-user mailing list archives

From Patrick Oliver Glauner <patrick.oliver.glau...@cern.ch>
Subject RuntimeException during indexing: how to write document id to log?
Date Sun, 09 Sep 2012 12:04:18 GMT
Hello

We use Solr 3.1 and Jetty. I enabled logging in Jetty as described here: http://wiki.apache.org/solr/LoggingInDefaultJettySetup

We are indexing millions of documents, and for some full texts Solr throws an exception and
writes a log record like this one:

<record>

  <date>2012-09-04T15:55:16</date>

  <millis>1346766916578</millis>

  <sequence>0</sequence>

  <logger>org.apache.solr.core.SolrCore</logger>

  <level>SEVERE</level>

  <class>org.apache.solr.common.SolrException</class>

  <method>log</method>

  <thread>10</thread>

  <message>java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 character 0xd835(a surrogate character)  at char #1144, byte #127)

        at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)

        at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)

        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)

        at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)

        at org.apache.solr.handler.XMLLoader.readDoc(XMLLoader.java:287)

        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:146)

        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)

        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)

        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)

        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)

        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)

        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)

        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)

        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)

        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)

        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)

        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)

        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)

        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)

        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)

        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

        at org.mortbay.jetty.Server.handle(Server.java:326)

[...]

</message>

</record>


It would be great to also log the id of the document that was sent, so that we can reindex it.

How can we write the id to the log?
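
Until there is a way to get the id into the Solr log, one possible workaround is to check each payload on the client side before sending it and to log the id there. The sketch below only illustrates that idea and is not a Solr feature: the function name `find_invalid_utf8`, the logger name `indexer`, and the ids are hypothetical, and it assumes the client has the encoded bytes of each document at hand. It relies on Python 3's strict UTF-8 decoder, which rejects encoded surrogates such as 0xd835.

```python
import logging

# Hypothetical client-side logger; this is not Solr's own log.
logger = logging.getLogger("indexer")


def find_invalid_utf8(doc_id, payload):
    """Check that payload (bytes) is strictly valid UTF-8.

    Returns None if the payload decodes cleanly. Otherwise logs and
    returns a message that contains the document id and the offset of
    the first offending byte, so the document can be fixed and reindexed.
    """
    try:
        payload.decode("utf-8")  # strict mode rejects encoded surrogates
    except UnicodeDecodeError as exc:
        msg = "document id=%s: invalid UTF-8 at byte #%d (%s)" % (
            doc_id, exc.start, exc.reason)
        logger.error(msg)
        return msg
    return None
```

Calling `find_invalid_utf8(doc_id, payload)` before each update request would let the client skip or repair a bad document and record its id, at the cost of decoding every payload once more.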


Thank you!


Best regards

Patrick Glauner

--
Patrick GLAUNER [patrick.oliver.glauner@cern.ch]

CERN
Information Technology Department
CH-1211 Geneva 23
