poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 59021] New: XSSFSheetXMLHandler is using qName instead of localName and missing cells/rows
Date Wed, 17 Feb 2016 20:46:16 GMT
https://bz.apache.org/bugzilla/show_bug.cgi?id=59021

            Bug ID: 59021
           Summary: XSSFSheetXMLHandler is using qName instead of
                    localName and missing cells/rows
           Product: POI
           Version: unspecified
          Hardware: PC
            Status: NEW
          Severity: critical
          Priority: P2
         Component: XSSF
          Assignee: dev@poi.apache.org
          Reporter: tallison@mitre.org

On TIKA-1859, Movses raised an issue that he can extract content from a
specific xlsx file with POI but not with Tika.

I confirmed that the content is extractable with XSSFWorkbook.

However, Tika does a streaming read with XSSFSheetXMLHandler.
XSSFSheetXMLHandler relies on qName to find "row" and "c".  In the submitted
problematic file, the qName includes the namespace prefix (i.e. "x:row",
"x:c"), so the sheet handler skips that content entirely.

When I switched the string processing in startElement and endElement in
XSSFSheetXMLHandler to rely on localName, instead of qName, content was
correctly extracted.
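
To illustrate the difference, here is a minimal standalone sketch. It is not
the XSSFSheetXMLHandler code; the class name LocalNameDemo, the countRows
helper, and the inline sample sheet are hypothetical and use only the JDK SAX
parser. It counts "row" elements in a prefixed sheet, matching once on qName
and once on localName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LocalNameDemo {
    // Hypothetical sample: a sheet whose elements carry an "x:" prefix.
    static final String PREFIXED_SHEET =
        "<x:worksheet xmlns:x=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
      + "<x:sheetData>"
      + "<x:row r=\"1\"><x:c r=\"A1\"><x:v>1</x:v></x:c></x:row>"
      + "<x:row r=\"2\"><x:c r=\"A2\"><x:v>2</x:v></x:c></x:row>"
      + "</x:sheetData></x:worksheet>";

    // Counts elements named "row", comparing against localName or qName.
    static int countRows(String xml, boolean useLocalName) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // localName is only populated when the parser is namespace-aware.
        factory.setNamespaceAware(true);
        final int[] rows = {0};
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    String name = useLocalName ? localName : qName;
                    if ("row".equals(name)) {
                        rows[0]++;
                    }
                }
            });
        return rows[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("qName match:     " + countRows(PREFIXED_SHEET, false)); // 0
        System.out.println("localName match: " + countRows(PREFIXED_SHEET, true));  // 2
    }
}
```

Note that matching on localName assumes a namespace-aware parser; without
setNamespaceAware(true), localName may be empty and the comparison would fail
for every file.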

Movses ranked this a blocker on Tika.  It would be great if we could get the
fix in before we cut 3.14...  I should have time tonight to make the fix in
trunk.

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

