poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 60973] New: can't parse some vsdx files
Date Wed, 12 Apr 2017 07:03:16 GMT
https://bz.apache.org/bugzilla/show_bug.cgi?id=60973

            Bug ID: 60973
           Summary: can't parse some vsdx files
           Product: POI
           Version: unspecified
          Hardware: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: XDGF
          Assignee: dev@poi.apache.org
          Reporter: gytmkc@gmail.com
  Target Milestone: ---

Hi,

1. We're using a single-core Solr 6.4 instance on Windows Server 2012 R2
Standard.
2. Java 8 (build 1.8.0_121-b13)
3. ooxml-schemas-1.3.jar, poi-3.15.jar, poi-ooxml-3.15.jar,
poi-scratchpad-3.15.jar

But we still get Solr exceptions/errors for ~2000 vsdx files.
It is critical for us to have them indexed.

Any suggestions or solutions are welcome.

For most of them I see this exception:


org.apache.poi.POIXMLException: Invalid 'Row_Type' name 'PolylineTo'


For example:


{
    "responseHeader": {
        "status": 500, 
        "QTime": 65
    }, 
    "error": {
        "msg": "org.apache.tika.exception.TikaException: Unexpected
RuntimeException from
org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3c9f695c", 
        "code": 500, 
        "trace": "org.apache.solr.common.SolrException:
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3c9f695c\r\n\tat
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:234)\r\n\tat
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)\r\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)\r\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)\r\n\tat
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)\r\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\r\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)\r\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\r\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\r\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\r\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\r\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:534)\r\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\r\n\tat
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\r\n\tat
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)\r\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)\r\n\tat
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\r\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\r\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\r\n\tat
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\r\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\r\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\r\n\tat
java.lang.Thread.run(Unknown Source)\r\nCaused by:
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.microsoft.ooxml.OOXMLParser@3c9f695c\r\n\tat
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)\r\n\tat
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)\r\n\tat
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)\r\n\tat
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)\r\n\t...
32 more\r\nCaused by: org.apache.poi.POIXMLException:
/visio/masters/masters.xml: /visio/masters/master11.xml: <Shape ID=\"11\">:
Invalid 'Row_Type' name 'PolylineTo'\r\n\tat
org.apache.poi.xdgf.exceptions.XDGFException.wrap(XDGFException.java:43)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFMasters.onDocumentRead(XDGFMasters.java:107)\r\n\tat
org.apache.poi.xdgf.usermodel.XmlVisioDocument.onDocumentRead(XmlVisioDocument.java:106)\r\n\tat
org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:190)\r\n\tat
org.apache.poi.xdgf.usermodel.XmlVisioDocument.<init>(XmlVisioDocument.java:79)\r\n\tat
org.apache.poi.xdgf.extractor.XDGFVisioExtractor.<init>(XDGFVisioExtractor.java:41)\r\n\tat
org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:207)\r\n\tat
org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:86)\r\n\tat
org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)\r\n\tat
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)\r\n\t...
35 more\r\nCaused by: org.apache.poi.POIXMLException: Invalid 'Row_Type' name
'PolylineTo'\r\n\tat
org.apache.poi.xdgf.util.ObjectFactory.load(ObjectFactory.java:45)\r\n\tat
org.apache.poi.xdgf.usermodel.section.geometry.GeometryRowFactory.load(GeometryRowFactory.java:58)\r\n\tat
org.apache.poi.xdgf.usermodel.section.GeometrySection.<init>(GeometrySection.java:55)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFSheet.<init>(XDGFSheet.java:77)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFShape.<init>(XDGFShape.java:113)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFShape.<init>(XDGFShape.java:125)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFShape.<init>(XDGFShape.java:125)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFShape.<init>(XDGFShape.java:125)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFShape.<init>(XDGFShape.java:107)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFBaseContents.onDocumentRead(XDGFBaseContents.java:82)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFMasterContents.onDocumentRead(XDGFMasterContents.java:66)\r\n\tat
org.apache.poi.xdgf.usermodel.XDGFMasters.onDocumentRead(XDGFMasters.java:101)\r\n\t...
43 more\r\n", 
        "metadata": [
            "error-class", 
            "org.apache.solr.common.SolrException", 
            "root-error-class", 
            "org.apache.poi.POIXMLException"
        ]
    }
}
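
To narrow down which of the ~2000 files hit this particular failure, a scan
along the following lines may help. It is only a sketch under stated
assumptions: it assumes geometry rows appear in the package XML as elements
carrying a T="PolylineTo" attribute (inferred from the 'Row_Type' wording of
the exception, not confirmed against the failing files), it uses a plain
substring match rather than an XML parser, and the names VsdxRowTypeScan and
partsWithRowType are hypothetical. It relies only on java.util.zip, so it
should run on Java 8 without POI or Tika on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of POI or Tika.
final class VsdxRowTypeScan {

    /**
     * Returns the names of XML parts under visio/ whose text contains
     * T="rowType". Assumption: row types are written with double quotes;
     * single-quoted attributes would be missed by this substring match.
     */
    static List<String> partsWithRowType(Path vsdx, String rowType) throws IOException {
        String needle = "T=\"" + rowType + "\"";
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(vsdx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                // Only look at XML parts below visio/ (masters, pages, ...).
                if (entry.isDirectory() || !name.startsWith("visio/") || !name.endsWith(".xml")) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    String xml = new String(readAll(in), StandardCharsets.UTF_8);
                    if (xml.contains(needle)) {
                        hits.add(name);
                    }
                }
            }
        }
        return hits;
    }

    // Java 8 compatible replacement for InputStream.readAllBytes().
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Usage: java VsdxRowTypeScan file1.vsdx file2.vsdx ...
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            List<String> hits = partsWithRowType(Paths.get(arg), "PolylineTo");
            if (!hits.isEmpty()) {
                System.out.println(arg + " -> " + hits);
            }
        }
    }
}
```

Files reported by such a scan would be candidates for the PolylineTo failure;
files that fail without being reported probably have a different cause and
would need their own trace.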

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

