pdfbox-users mailing list archives

From rahul bhalla <urcoolfrien...@gmail.com>
Subject Re: Extract text using pdfbox
Date Tue, 16 Apr 2013 13:53:53 GMT
Hello Vladimir,

Sir, I read the code, but unfortunately I am unable to understand it:

PDFStreamParser parser = new PDFStreamParser( contents.getStream() );
parser.parse();
List tokens = parser.getTokens();
for( int j=0; j<tokens.size(); j++ )
{
    Object next = tokens.get( j );
    if( next instanceof PDFOperator )
    {
        PDFOperator op = (PDFOperator)next;
        //Tj and TJ are the two operators that display
        //strings in a PDF
        if( op.getOperation().equals( "Tj" ) )
        {
            //Tj takes one operand and that is the string
            //to display so lets update that operand
            COSString previous = (COSString)tokens.get( j-1 );
            String string = previous.getString();
            string = string.replaceFirst( strToFind, message );
            previous.reset();
            previous.append( string.getBytes("ISO-8859-1") );
        }
        else if( op.getOperation().equals( "TJ" ) )
        {
            //TJ takes an array operand; only its string
            //elements are updated
            COSArray previous = (COSArray)tokens.get( j-1 );
            for( int k=0; k<previous.size(); k++ )
            {
                Object arrElement = previous.getObject( k );
                if( arrElement instanceof COSString )
                {
                    COSString cosString = (COSString)arrElement;
                    String string = cosString.getString();
                    string = string.replaceFirst( strToFind, message );
                    cosString.reset();
                    cosString.append( string.getBytes("ISO-8859-1") );
                }
            }
        }
    }
}
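
My best guess is that the loop walks the token list and, whenever it meets a Tj or TJ operator, rewrites the string operand(s) sitting just before it. To check that reading I wrote the same idea as a stand-alone sketch with no PDFBox classes. Here Op and the StringBuilder tokens are hypothetical stand-ins I made up for PDFOperator and COSString, so this only reflects my assumption about the logic, not the real API:

```java
import java.util.ArrayList;
import java.util.List;

public class TokenReplaceSketch {
    // Hypothetical stand-in for PDFOperator: it only carries the operator name.
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
    }

    // Rewrites one mutable string token in place (stand-in for reset() + append()).
    static void replaceIn(StringBuilder text, String strToFind, String message) {
        String replaced = text.toString().replaceFirst(strToFind, message);
        text.setLength(0);
        text.append(replaced);
    }

    // For every Tj/TJ operator, updates the operand found just before it.
    static void replaceInTokens(List<Object> tokens, String strToFind, String message) {
        for (int j = 1; j < tokens.size(); j++) {
            Object next = tokens.get(j);
            if (!(next instanceof Op)) {
                continue;
            }
            String name = ((Op) next).name;
            Object operand = tokens.get(j - 1);
            if (name.equals("Tj") && operand instanceof StringBuilder) {
                // Tj: the operand is a single string.
                replaceIn((StringBuilder) operand, strToFind, message);
            } else if (name.equals("TJ") && operand instanceof List) {
                // TJ: the operand is an array mixing strings and spacing numbers.
                for (Object element : (List<?>) operand) {
                    if (element instanceof StringBuilder) {
                        replaceIn((StringBuilder) element, strToFind, message);
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Object> tokens = new ArrayList<>();
        tokens.add(new StringBuilder("Hello World"));
        tokens.add(new Op("Tj"));
        replaceInTokens(tokens, "World", "PDFBox");
        System.out.println(tokens.get(0)); // prints: Hello PDFBox
    }
}
```

Is this roughly what the example does? I also notice that replaceFirst treats strToFind as a regular expression, so I assume special characters in it would need escaping.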

On Tue, Apr 16, 2013 at 3:11 PM, Vladimir Starostenkov <vladimir.starostenkov@gmail.com> wrote:

> Have you tried to look through
>
> http://svn.apache.org/repos/asf/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/ReplaceString.java
>



-- 
Regards
Rahul Bhalla
