pdfbox-users mailing list archives

From Davide Zoni <Davide.Z...@Cedacri.it>
Subject RE: Check for scripts in a PDF
Date Mon, 29 Aug 2016 09:09:20 GMT
Hi everybody again,

I'm trying to find out whether your method suits my needs, but every time I try
to access the AcroForm (even in a PDF file with scripts and forms) it is null.
Am I loading the file the wrong way? Am I missing something?

Best regards.
        
________________________________________
From: Tilman Hausherr [THausherr@t-online.de]
Sent: Wednesday, 24 August 2016 18:24
To: users@pdfbox.apache.org
Subject: Re: Check for scripts in a PDF

On 24.08.2016 at 15:41, Davide Zoni wrote:
> Thank you. This might be helpful, but I'm afraid I would not be able to check every
> possibility. Is there a way to check whether a PDF is static (or dynamic)? For our
> purpose that should be enough.

No, there is no such method.
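
PDFBox has no single call for this. As a very rough first pass outside PDFBox, one could scan the raw bytes of the file for the name tokens that introduce JavaScript actions (/JavaScript and /JS). The sketch below is only an assumption of how such a pre-filter might look; the class and method names are made up for illustration. It cannot see inside compressed object streams or encrypted files, and it misses names written with #xx escapes, so "no token found" does not prove that a file is script-free.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Coarse pre-filter (hypothetical helper): looks for script-related name tokens in raw PDF bytes. */
final class RawScriptTokenScan
{
    // Name tokens that introduce JavaScript actions in PDF dictionaries.
    private static final String[] TOKENS = { "/JavaScript", "/JS" };

    /** Returns true if one of the tokens occurs as a whole name in the uncompressed bytes. */
    static boolean containsScriptToken(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so no bytes are lost.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        for (String token : TOKENS)
        {
            int pos = raw.indexOf(token);
            while (pos >= 0)
            {
                int end = pos + token.length();
                // Accept only whole names: "/JS" must not match "/JSomething".
                if (end == raw.length() || isDelimiter(raw.charAt(end)))
                {
                    return true;
                }
                pos = raw.indexOf(token, pos + 1);
            }
        }
        return false;
    }

    // Characters that end a PDF name: whitespace (including NUL) and the delimiter characters.
    private static boolean isDelimiter(char c)
    {
        return c == 0 || Character.isWhitespace(c) || "()<>[]{}/%".indexOf(c) >= 0;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(containsScriptToken(bytes) ? "script tokens found" : "no script tokens found");
    }
}
```

A hit only says that the file deserves a closer look with a real parser; a miss says nothing certain.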

Tilman


> Best regards.
>
>          Davide Zoni
>
>          Cedacri S.p.A.
>
>          Tel.: 0521807433
>
>          e-mail: davide.zoni@cedacri.it
>
>          www.cedacri.it
>
>
> ________________________________________
> From: Tilman Hausherr [THausherr@t-online.de]
> Sent: Tuesday, 23 August 2016 18:23
> To: users@pdfbox.apache.org
> Subject: Re: Check for scripts in a PDF
>
> On 23.08.2016 at 09:35, Davide Zoni wrote:
>> Yes, I'm seeking to detect files with scripts, not static ones. I don't understand what
>> you mean by "Maybe compare with the preflight source code to check that you didn't
>> miss something". Can you elaborate on that?
> I meant to search for "Javascript" in the preflight source code, and then see
> where it is used. This is just so that you can be more sure that you caught
> everything when you read the PDF specification.
>
> Btw I once wrote some code to show (some) javascript fields, see below
> or search for "Roberto Nibali Javascript". He also improved that code
> and posted the improved version. It may not find all javascript stuff,
> but it could help show you how to write code.
>
> Tilman
>
>
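> import java.io.File;
> import java.io.IOException;
> import java.util.List;
>
> import org.apache.pdfbox.cos.COSDictionary;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
> import org.apache.pdfbox.pdmodel.interactive.action.PDAction;
> import org.apache.pdfbox.pdmodel.interactive.action.PDActionJavaScript;
> import org.apache.pdfbox.pdmodel.interactive.action.PDAnnotationAdditionalActions;
> import org.apache.pdfbox.pdmodel.interactive.action.PDFormFieldAdditionalActions;
> import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
> import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
> import org.apache.pdfbox.pdmodel.interactive.form.PDField;
> import org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField;
> import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;
>
> public class PrintJavaScriptFields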
> {
>
>       /**
>        * This will print all the fields from the document.
>        *
>        * @param pdfDocument The PDF to get the fields from.
>        *
>        * @throws IOException If there is an error getting the fields.
>        */
>       public void printFields(PDDocument pdfDocument) throws IOException
>       {
>           PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
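>           PDAcroForm acroForm = docCatalog.getAcroForm();
>           if (acroForm == null)
>           {
>               // getAcroForm() returns null when the catalog has no /AcroForm entry
>               return;
>           }
>           List<PDField> fields = acroForm.getFields();
>
>           //System.out.println(fields.size() + " top-level fields were found on the form");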
>
>           for (PDField field : fields)
>           {
>               processField(field, "|--", field.getPartialName());
>           }
>       }
>
>       private void processField(PDField field, String sLevel, String sParent) throws IOException
>       {
>           String partialName = field.getPartialName();
>
>           if (field instanceof PDTerminalField)
>           {
>               PDTerminalField termField = (PDTerminalField) field;
>               for (PDAnnotationWidget widget : termField.getWidgets())
>               {
>                   PDAction action = widget.getAction();
>                   if (action instanceof PDActionJavaScript)
>                   {
>                       System.out.println(field.getFullyQualifiedName() + ": "
>                               + action.getClass().getSimpleName() + " js widget action:\n"
>                               + action.getCOSObject());
>                       printPossibleJS(action);
>                   }
>                   PDAnnotationAdditionalActions actions = widget.getActions();
>                   if (actions != null)
>                   {
>                       System.out.println(field.getFullyQualifiedName() + ": "
>                               + actions.getClass().getSimpleName() + " js widget actionS:\n"
>                               + actions.getCOSObject());
>
>                       // Strange: why do I get a PDAnnotationAdditionalActions, which holds
>                       // a K entry but has no getK(), instead of a PDFormFieldAdditionalActions?
>                       PDFormFieldAdditionalActions ffActions =
>                               new PDFormFieldAdditionalActions((COSDictionary) actions.getCOSObject());
>                       printPossibleJS(ffActions.getK());
>                       printPossibleJS(ffActions.getC());
>                       printPossibleJS(ffActions.getF());
>                       printPossibleJS(ffActions.getV());
>                   }
>               }
>           }
>
>           if (field instanceof PDNonTerminalField)
>           {
>               if (!sParent.equals(field.getPartialName()))
>               {
>                   if (partialName != null)
>                   {
>                       sParent = sParent + "." + partialName;
>                   }
>               }
>               //System.out.println(sLevel + sParent);
>
>               for (PDField child : ((PDNonTerminalField) field).getChildren())
>               {
>                   processField(child, "|  " + sLevel, sParent);
>               }
>           }
>           else
>           {
>               String fieldValue = field.getValueAsString();
>               StringBuilder outputString = new StringBuilder(sLevel);
>               outputString.append(sParent);
>               if (partialName != null)
>               {
>                   outputString.append(".").append(partialName);
>               }
>               outputString.append(" = ").append(fieldValue);
>               outputString.append(", type=").append(field.getClass().getName());
>               //System.out.println(outputString);
>           }
>       }
>
>       private void printPossibleJS(PDAction kAction)
>       {
>           if (kAction instanceof PDActionJavaScript)
>           {
>               PDActionJavaScript jsAction = (PDActionJavaScript) kAction;
>               String jsString = jsAction.getAction();
>               if (jsString == null)
>               {
>                   // no script text stored in this action
>                   return;
>               }
>               if (!jsString.contains("\n"))
>               {
>                   // Otherwise nothing appears in NetBeans?!
>                   jsString = jsString.replaceAll("\r", "\n").replaceAll("\n\n", "\n");
>               }
>               System.out.println(jsString);
>               System.out.println();
>           }
>       }
>
>       /**
>        * This will read a PDF file and print the JavaScript found on its form fields.
>        *
>        * @param args command line arguments (not used; the file name is set in the code)
>        *
>        * @throws IOException If there is an error reading the PDF document.
>        */
>       public static void main(String[] args) throws IOException
>       {
>           // try-with-resources closes the document even if printFields throws
>           try (PDDocument pdf = PDDocument.load(new File(XXXXXX)))
>           {
>               PrintJavaScriptFields exporter = new PrintJavaScriptFields();
>               exporter.printFields(pdf);
>           }
>       }
>
> }
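
The line-ending handling in printPossibleJS above can be tried in isolation. The following is a slightly different variant, not the code from the message: it converts CR LF and lone CR to LF and leaves blank lines alone. JsLineEndings and normalize are hypothetical names chosen for this sketch.

```java
/** Hypothetical helper: makes script text readable on consoles that ignore a lone CR. */
final class JsLineEndings
{
    /** Converts CR LF and lone CR line endings to LF; a null script becomes an empty string. */
    static String normalize(String js)
    {
        if (js == null)
        {
            return "";
        }
        // Handle CR LF first so that it does not turn into two line feeds.
        return js.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args)
    {
        System.out.println(normalize("var a = 1;\rvar b = 2;\r\napp.alert(a + b);"));
    }
}
```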
>
>
>
>> Thank you.
>>
>>           Davide
>>
>> ________________________________________
>> From: Tilman Hausherr [THausherr@t-online.de]
>> Sent: Tuesday, 23 August 2016 8:34
>> To: users@pdfbox.apache.org
>> Subject: Re: Check for scripts in a PDF
>>
>> On 22.08.2016 at 15:14, Davide Zoni wrote:
>>> Hello everybody,
>>>
>>> I'm using PDFBox to check whether a PDF file contains malicious scripts. I'm using
>>> the PDF/A-1a validation to check the file. Since I'm only looking for potentially damaging
>>> code and not for true PDF/A-1a compliance, is it enough to treat 1.x.x, 6.x.x
>>> and 7.x.x errors as "true" errors? The categories are described below:
>>>
>>> Category        Description
>>> 1[.y[.z]]       Syntax Error
>>> 2[.y[.z]]       Graphic Error
>>> 3[.y[.z]]       Font Error
>>> 4[.y[.z]]       Transparency Error
>>> 5[.y[.z]]       Annotation Error
>>> 6[.y[.z]]       Action Error
>>> 7[.y[.z]]       Metadata Error
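
If the preflight result were filtered this way, the category is simply the part of the error code before the first dot. A minimal sketch, assuming the error codes are available as strings such as "6.2.5"; the sample codes and the class name ErrorCategoryFilter are invented for illustration, and whether categories 1, 6 and 7 are sufficient is exactly the open question in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical filter: keeps only errors from the categories proposed in the question. */
final class ErrorCategoryFilter
{
    // 1 = Syntax Error, 6 = Action Error, 7 = Metadata Error (selection taken from the question).
    private static final Set<String> RELEVANT = new HashSet<>(Arrays.asList("1", "6", "7"));

    /** Returns the category, i.e. everything before the first dot ("6.2.5" gives "6"). */
    static String category(String errorCode)
    {
        int dot = errorCode.indexOf('.');
        return dot < 0 ? errorCode : errorCode.substring(0, dot);
    }

    /** Returns true if the error code belongs to one of the selected categories. */
    static boolean isRelevant(String errorCode)
    {
        return RELEVANT.contains(category(errorCode));
    }

    public static void main(String[] args)
    {
        // Made-up sample codes, only to exercise the filter.
        List<String> codes = Arrays.asList("1.2.1", "3.1.4", "6.2.5", "7");
        for (String code : codes)
        {
            System.out.println(code + " -> " + isRelevant(code));
        }
    }
}
```

Comparing the whole first component rather than the first character keeps a code such as "16.1" from being mistaken for category 1.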
>> Unclear what you're asking. Are you seeking to detect files with
>> javascript? If so, I'd rather build something from scratch,
>> i.e. read the PDF specification and see where JS is used. Maybe compare
>> with the preflight source code to check that you didn't miss something.
>>
>> Tilman
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>
>

