poi-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject Invalid header for xls: 0x0010000000060409?
Date Mon, 24 Nov 2014 20:23:47 GMT
  I recently ran Tika against the ~1 million files in govdocs1.  About 91% (2,579/2,828)
of the XLS exceptions from Tika 1.7 are the following.  Tika detects these files as XLS,
and then the header exception below is thrown.
  Does this header ring any bells?  Old version of XLS, perhaps?  The triggering files open
in Excel, and I think they are "Excel 4" files.
  I can't get the link to work, but one triggering file is 004444.xls.



Caused by: java.io.IOException: Invalid header signature; read 0x0010000000060409, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:140)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:115)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:198)
    at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:184)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:162)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    ... 13 more
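
For reference, here is a minimal sketch of how the reported value could be decoded by hand. It assumes the signature in the message is the first 8 bytes of the file read as a little-endian long (the expected value 0xE11AB1A1E011CFD0 is consistent with that, since the OLE2 magic on disk is D0 CF 11 E0 A1 B1 1A E1). It also assumes the usual BIFF BOF record ids (0x0009, 0x0209, 0x0409, 0x0809), which come from the Excel binary format documentation and not from this trace. The class and method names are hypothetical and are not part of POI or Tika.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not a POI or Tika class.
final class HeaderSignatureSketch {
    // OLE2 magic exactly as printed in the exception message.
    static final long OLE2_SIGNATURE = 0xE11AB1A1E011CFD0L;

    // Recover the 8 on-disk bytes, assuming the long was read little-endian.
    static byte[] onDiskBytes(long signature) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(signature)
                .array();
    }

    // Bytes 0-1 as a little-endian 16-bit value: the record id of a raw BIFF stream.
    static int recordId(long signature) {
        return (int) (signature & 0xFFFF);
    }

    // Bytes 2-3 as a little-endian 16-bit value: the record body length.
    static int recordLength(long signature) {
        return (int) ((signature >>> 16) & 0xFFFF);
    }

    // Render the on-disk bytes as space-separated hex pairs.
    static String toHex(long signature) {
        StringBuilder sb = new StringBuilder();
        for (byte b : onDiskBytes(signature)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Assumption: BOF record ids taken from the Excel binary format documentation.
    static String describe(long signature) {
        if (signature == OLE2_SIGNATURE) {
            return "OLE2 compound document";
        }
        switch (recordId(signature)) {
            case 0x0009: return "raw BIFF2 stream (BOF 0x0009)";
            case 0x0209: return "raw BIFF3 stream (BOF 0x0209)";
            case 0x0409: return "raw BIFF4 stream (BOF 0x0409)";
            case 0x0809: return "raw BIFF5/BIFF8 stream (BOF 0x0809)";
            default:     return "unrecognized signature";
        }
    }

    public static void main(String[] args) {
        long read = 0x0010000000060409L; // value from the exception message
        System.out.println(toHex(read));
        System.out.println(describe(read));
    }
}
```

Under those assumptions, the value in the trace decodes to a record id of 0x0409 with a length of 6, which would match a BIFF4 BOF record at the start of a file that has no OLE2 container. That is consistent with Excel showing these files as "Excel 4".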
