pdfbox-users mailing list archives

From Stanea Paul <psb77bl...@yahoo.com>
Subject The supplied password does not match either the owner or user password in the document.
Date Wed, 04 Dec 2013 14:42:44 GMT
Hello, 

I am trying to index documents in Solr. For documents that have "Page Extraction: Not
Allowed" set, a CryptographyException is thrown.

Is there some way to index the document's content, or is skipping it the only option?

Thank you!

Stack trace from Tomcat logs:
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)
at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:129)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:472)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:498)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:411)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:326)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:234)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:382)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:448)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:429)
Caused by: org.apache.tika.exception.TikaException: Unable to extract PDF content
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:83)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:153)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:127)
... 9 more
Caused by: org.apache.pdfbox.exceptions.WrappedIOException: Error decrypting document, details: 
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:327)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:66)
... 14 more
Caused by: org.apache.pdfbox.exceptions.CryptographyException: Error: The supplied password does not match either the owner or user password in the document.
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.prepareForDecryption(StandardSecurityHandler.java:262)
at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:154)
at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1555)
at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:919)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:323)
... 15 more
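
If skipping turns out to be the only option, one possible approach is to recognize this particular failure by walking the exception's cause chain, since the CryptographyException sits three levels below the TikaException in the trace above. The following is only a minimal sketch under stated assumptions: the class EncryptedPdfSkip, the method isEncryptionFailure, and the nested stand-in CryptographyException are hypothetical names, not part of Solr, Tika, or PDFBox. The check matches on the simple class name so that the sketch runs without PDFBox on the classpath.

```java
import java.io.IOException;

public class EncryptedPdfSkip {

    /**
     * Stand-in for org.apache.pdfbox.exceptions.CryptographyException.
     * Assumption: used here only so the sketch runs without PDFBox.
     */
    public static class CryptographyException extends Exception {
        public CryptographyException(String message) {
            super(message);
        }
    }

    /**
     * Returns true if any exception in the cause chain is named
     * CryptographyException. The depth limit guards against cyclic chains.
     */
    public static boolean isEncryptionFailure(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if ("CryptographyException".equals(t.getClass().getSimpleName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rebuild a chain shaped like the one in the trace:
        // extraction failure -> decryption I/O error -> password mismatch.
        Exception failure = new RuntimeException("Unable to extract PDF content",
                new IOException("Error decrypting document",
                        new CryptographyException("The supplied password does not match")));

        if (isEncryptionFailure(failure)) {
            System.out.println("Skipping encrypted document");
        } else {
            System.out.println("Unrelated failure, rethrow or log it");
        }
    }
}
```

A caller wrapping the parse step could use a check like this to skip only the encrypted files and still surface every other parse error.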
