lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3775) Unexpected RuntimeException
Date Fri, 31 Aug 2012 05:45:08 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445694#comment-13445694 ]

Uwe Schindler commented on SOLR-3775:
-------------------------------------

Hi,
as the exception suggests, this issue has nothing to do with Apache Solr. It is caused by the
library Apache TIKA, which is bundled with the extraction module to do the file parsing.
We cannot fix this issue here; it would be better to report it to the [TIKA|https://issues.apache.org/jira/browse/TIKA]
project. It would also be good to attach the .doc file causing this to their issue.
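
For readers who want to check which library an exception like this originates from, here is a minimal sketch that walks the cause chain and reports the innermost exception and the class of its top stack frame. The class `RootCauseFinder` and its methods are hypothetical helpers written for illustration; they are not part of Solr, TIKA, or POI. The exceptions built in `main` are stand-ins shaped like the chain in the report below.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Solr, TIKA, or POI.
final class RootCauseFinder {

    /** Follows getCause() links until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Returns the exception class names in the chain, outermost first. */
    static List<String> chain(Throwable t) {
        List<String> names = new ArrayList<>();
        for (Throwable c = t; c != null; c = c.getCause()) {
            names.add(c.getClass().getName());
        }
        return names;
    }

    /** Class name of the top stack frame of the root cause, or null if the trace is empty. */
    static String originClass(Throwable t) {
        StackTraceElement[] frames = rootCause(t).getStackTrace();
        return frames.length == 0 ? null : frames[0].getClassName();
    }

    public static void main(String[] args) {
        // Stand-in for the innermost exception; the frame is copied from the reported trace.
        Throwable inner = new ArrayIndexOutOfBoundsException("7");
        inner.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("org.apache.poi.util.LittleEndian", "getInt", "LittleEndian.java", 163)
        });
        // Stand-ins for the two wrapping exceptions seen in the report.
        Throwable middle = new Exception("Unexpected RuntimeException from OfficeParser", inner);
        Throwable outer = new RuntimeException(middle);

        System.out.println(chain(outer));
        System.out.println(originClass(outer));
    }
}
```

If the class of the top frame belongs to a bundled library rather than to `org.apache.solr`, that library's issue tracker is usually the right place for the report.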
                
> Unexpected RuntimeException
> ---------------------------
>
>                 Key: SOLR-3775
>                 URL: https://issues.apache.org/jira/browse/SOLR-3775
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0-BETA
>            Reporter: Alex C
>
> Hi. I'm using Solr 4.0 Beta (no modifications to default installation) to index, and it's blowing up on Word *.DOC files:
> {code}curl "http://localhost:8983/solr/update/extract?literal.id=doc15&commit=true" -F "myfile=@15.doc"{code}
> Here's the exception. And the same files go through Solr 3.6.1 just fine.
> {noformat}    <?xml version="1.0" encoding="UTF-8"?>
>     <response>
>     <lst name="responseHeader"><int name="status">500</int><int name="QTime">18</int></lst>
>     <lst name="error"><str name="msg">org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce</str>
>     <str name="trace">org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>             at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:230)
>             at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>             at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>             at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
>             at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
>             at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
>             at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
>             at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>             at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>             at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>             at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>             at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>             at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>             at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>             at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>             at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>             at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>             at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>             at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>             at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>             at org.eclipse.jetty.server.Server.handle(Server.java:351)
>             at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>             at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>             at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
>             at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
>             at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:642)
>             at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>             at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>             at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>             at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>             at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>             at java.lang.Thread.run(Unknown Source)
>     Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>             at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>             at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>             at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>             at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:224)
>             ... 31 more
>     Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>             at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:163)
>             at org.apache.poi.hwpf.model.Colorref.&lt;init&gt;(Colorref.java:81)
>             at org.apache.poi.hwpf.model.types.SHDAbstractType.fillFields(SHDAbstractType.java:56)
>             at org.apache.poi.hwpf.usermodel.ShadingDescriptor.&lt;init&gt;(ShadingDescriptor.java:38)
>             at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.unCompressCHPOperation(CharacterSprmUncompressor.java:582)
>             at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:65)
>             at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:288)
>             at org.apache.poi.hwpf.model.StyleSheet.&lt;init&gt;(StyleSheet.java:121)
>             at org.apache.poi.hwpf.HWPFDocument.&lt;init&gt;(HWPFDocument.java:346)
>             at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:77)
>             at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:185)
>             at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:160)
>             at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>             ... 34 more
>     </str><int name="code">500</int></lst>
>     </response>{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

