poi-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 52863] java.lang.ArrayIndexOutOfBoundsException in org.apache.poi.hwpf.sprm.SprmOperation.initSize
Date Sat, 10 Mar 2012 01:33:39 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=52863

--- Comment #4 from HarrySimons <simonsharry@gmail.com> 2012-03-10 01:33:39 UTC ---
> Do you know the origin of these failing
> docs? Were they created by MS Word or
> by OpenOffice or by what ? 

They were created by a post-2003 and pre-2007 version of MS Word. 


> Without a sample file we can't do much.

The document's very name is 'Business Intelligence', so you can
imagine my difficulty. The other failing documents are similarly sensitive.
I thought I would be able to remove the sensitive parts of this document and
then upload it for the Tika/POI developers. But merely re-saving the
document in Word 2007 (i.e., without any edits whatsoever) makes the
problem mostly go away. I say 'mostly' because, while Tika/POI are then able to
extract the text, they also append text like this to the output:

_-1388201556/ole-[42, 4D, 0E, 0A, 00, 00, 00, 00]

_-1388203796/ole-[42, 4D, 2E, 0A, 00, 00, 00, 00]

_-1388843352/ole-[42, 4D, 2E, 0A, 00, 00, 00, 00]

_-1388845272/ole-[42, 4D, BA, 09, 00, 00, 00, 00]

_-1388297360/ole-[42, 4D, BA, 09, 00, 00, 00, 00]

_-1388297680/ole-[42, 4D, D6, 09, 00, 00, 00, 00]

_-1388296720/ole-[42, 4D, BA, 09, 00, 00, 00, 00]

_-1388203476/ole-[42, 4D, 66, 09, 00, 00, 00, 00]

_-1382869532/ole-[42, 4D, 36, 0C, 00, 00, 00, 00]

_-1388200596/ole-[42, 4D, 2E, 0A, 00, 00, 00, 00]

_-1388200916/ole-[42, 4D, BA, 09, 00, 00, 00, 00]

_-1383036196/ole-[42, 4D, 12, 09, 00, 00, 00, 00]

_-1382867932/ole-[42, 4D, 86, 0A, 00, 00, 00, 00]

_-1382868252/ole-[42, 4D, 2E, 0A, 00, 00, 00, 00]

_-1380808936/ole-[42, 4D, 2E, 0A, 00, 00, 00, 00]
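
A note on reading these lines: every dump starts with 42 4D, which is ASCII
'BM', the signature of a BMP file, so these entries may be the leading bytes of
embedded bitmap OLE objects leaking into the text output. That is only a guess
from the bytes shown. The sketch below decodes one such line under that
assumption; the class and method names are hypothetical and are not part of
the POI or Tika API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for inspecting the "[42, 4D, ...]" dumps above.
// Assumption: the bytes are the start of a BMP file header.
final class OleHeaderPeek {

    /** Parses a dump such as "[42, 4D, 0E, 0A, 00, 00, 00, 00]" into raw bytes. */
    public static byte[] parseHexDump(String dump) {
        String[] parts = dump.replace("[", "").replace("]", "").trim().split("\\s*,\\s*");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    /** True if the data starts with 'B', 'M' (0x42, 0x4D), the BMP signature. */
    public static boolean looksLikeBmp(byte[] data) {
        return data.length >= 2 && data[0] == 0x42 && data[1] == 0x4D;
    }

    /** Reads bytes 2..5 as a little-endian 32-bit value (the BMP file-size field). */
    public static int declaredBmpSize(byte[] data) {
        return ByteBuffer.wrap(data, 2, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] head = parseHexDump("[42, 4D, 0E, 0A, 00, 00, 00, 00]");
        System.out.println(looksLikeBmp(head) + " " + declaredBmpSize(head));
    }
}
```

If the BMP reading is right, the four bytes after the signature give the size
of each embedded image in bytes, which would explain why they differ slightly
from line to line.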


Being a developer myself, I am fully aware how hard it can be to fix (certain)
bugs without appropriate test input. I will watch out for newer releases.

