Many thanks to Maruan Sahyoun for his great work on PDFBox + XFA.
This code only works when you remove all security on the PDDocument.
It also assumes the COS object in PDXFA is a COSStream. The simplistic example below reads
the xml stream and writes it back into the PDF.
PDDocument doc = PDDocument.load("filename");
doc.setAllSecurityToBeRemoved(true);
PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
PDAcroForm form = docCatalog.getAcroForm();
PDXFA xfa = form.getXFA();
COSBase cos = xfa.getCOSObject();
COSStream coss = (COSStream) cos;
InputStream cosin = coss.getUnfilteredStream();
Document document = documentBuilder.parse(cosin);
COSStream cosout = new COSStream(new RandomAccessBuffer());
OutputStream out = cosout.createUnfilteredStream();
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource source = new DOMSource(xmlDoc);
StreamResult result = new StreamResult(out);
transformer.transform(source, result);
PDXFA xfaout = new PDXFA(cosout);
form.setXFA(xfaout);
|