poi-user mailing list archives

From Yegor Kozlov <yegor.koz...@dinom.ru>
Subject Re: Bug 53380
Date Mon, 10 Sep 2012 07:16:10 GMT
We have all pre-requisites for fixing this bug, just need to find a
person to do it :)

POI is a volunteer project and if this problem is important for you,
please do work on it and submit a patch. Otherwise please wait.
Unfortunately we don't have an active developer working on the DOC/DOCX
modules, so a fix may take some time.

Yegor

On Mon, Sep 10, 2012 at 9:48 AM, Alex Cougarman <acougarm@bwc.org> wrote:
> Hi. I'm having the same issue from this bug with hundreds of our DOC files being fed through Solr/Tika: https://issues.apache.org/bugzilla/show_bug.cgi?id=53380
>
> I downloaded the DOC file attached to the ticket and was able to generate the same error we've been getting (please see below for the exception).
>
> Anyone know of a solution/workaround? Is there a timeline for a fix? I commented and voted on the ticket but not sure if it's a priority. Thanks.
>
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
> org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:230)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>     at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>     at org.eclipse.jetty.server.Server.handle(Server.java:351)
>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:642)
>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:224)
>     ... 31 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>     at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:163)
>     at org.apache.poi.hwpf.model.Colorref.<init>(Colorref.java:81)
>     at org.apache.poi.hwpf.model.types.SHDAbstractType.fillFields(SHDAbstractType.java:56)
>     at org.apache.poi.hwpf.usermodel.ShadingDescriptor.<init>(ShadingDescriptor.java:38)
>     at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.unCompressCHPOperation(CharacterSprmUncompressor.java:582)
>     at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:65)
>     at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:288)
>     at org.apache.poi.hwpf.model.StyleSheet.<init>(StyleSheet.java:121)
>     at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:346)
>     at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:77)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:185)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:160)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     ... 34 more
>
>
> Warm regards,
> Alex
>
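
For readers following the trace: the root cause is an unchecked 4-byte little-endian read (LittleEndian.getInt, called from the Colorref constructor) that runs past the end of its buffer. The sketch below is not POI's code and not the fix for bug 53380. It is a minimal, self-contained illustration of that failure mode plus one possible defensive variant. The class ShortOperandDemo, the method getIntOrDefault, and the 7-byte buffer are all assumptions made up for the example.

```java
public class ShortOperandDemo {

    // Unchecked 32-bit little-endian read. If fewer than 4 bytes remain at
    // 'offset', the array access throws ArrayIndexOutOfBoundsException.
    public static int getInt(byte[] data, int offset) {
        int b0 = data[offset] & 0xFF;
        int b1 = data[offset + 1] & 0xFF;
        int b2 = data[offset + 2] & 0xFF;
        int b3 = data[offset + 3] & 0xFF;
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
    }

    // Hypothetical defensive variant: checks the remaining length first and
    // returns 'fallback' instead of throwing when the buffer is too short.
    public static int getIntOrDefault(byte[] data, int offset, int fallback) {
        if (offset < 0 || data.length - offset < 4) {
            return fallback;
        }
        return getInt(data, offset);
    }

    public static void main(String[] args) {
        // Assumed input: a buffer one byte too short to hold two 4-byte fields.
        byte[] truncated = new byte[7];

        try {
            getInt(truncated, 4); // needs bytes 4..7, but only 4..6 exist
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unchecked read failed: " + e.getMessage());
        }

        // The checked variant degrades to the fallback value instead.
        System.out.println(getIntOrDefault(truncated, 4, -1));
    }
}
```

Whether a fallback value is acceptable for a truncated shading operand is a design decision for the HWPF maintainers. Until a patch lands, the practical workaround on the Solr/Tika side is to catch the TikaException per file so that one bad document does not abort a batch.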

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org

