pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: deleting a square from pdf file
Date Fri, 11 Jan 2019 18:03:16 GMT

You could get the content stream, and then search for something like this:

   1 0 0 1 564.094 785.197 cm
   0 0 m
   31.182 0 l
   31.182 -31.181 l
   0 -31.181 l

This is the same on all pages except for the 6th number of the cm line
(the y translation, 785.197 here), which gives the vertical position of
the square.
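
As a rough illustration of that search, here is a minimal sketch in plain Java. It assumes the content stream has already been split into whitespace-separated string tokens; with PDFBox the tokens would instead be the objects returned by the stream parser. The class name SquareFinder, the method findSquareY and the tolerance parameter are hypothetical names made up for this example, not part of PDFBox.

```java
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical helper: scans a tokenized content stream for the pattern
//   1 0 0 1 x y cm   0 0 m   s 0 l   s -s l   0 -s l
// and reports y, the 6th operand of the cm operator.
final class SquareFinder {
    // 7 tokens for "a b c d e f cm" plus 4 path segments of 3 tokens each.
    private static final int PATTERN_LENGTH = 7 + 4 * 3;

    /** Returns the y translation of the first matching square, if any. */
    static OptionalDouble findSquareY(List<String> tokens, double side, double tol) {
        for (int i = 0; i + PATTERN_LENGTH <= tokens.size(); i++) {
            if (matchesAt(tokens, i, side, tol)) {
                return OptionalDouble.of(Double.parseDouble(tokens.get(i + 5)));
            }
        }
        return OptionalDouble.empty();
    }

    private static boolean matchesAt(List<String> t, int i, double side, double tol) {
        return "cm".equals(t.get(i + 6))
            // Pure translation matrix: 1 0 0 1 x y
            && near(t.get(i), 1, tol) && near(t.get(i + 1), 0, tol)
            && near(t.get(i + 2), 0, tol) && near(t.get(i + 3), 1, tol)
            && isNumber(t.get(i + 4)) && isNumber(t.get(i + 5))
            // 0 0 m
            && near(t.get(i + 7), 0, tol) && near(t.get(i + 8), 0, tol)
            && "m".equals(t.get(i + 9))
            // side 0 l
            && near(t.get(i + 10), side, tol) && near(t.get(i + 11), 0, tol)
            && "l".equals(t.get(i + 12))
            // side -side l
            && near(t.get(i + 13), side, tol) && near(t.get(i + 14), -side, tol)
            && "l".equals(t.get(i + 15))
            // 0 -side l
            && near(t.get(i + 16), 0, tol) && near(t.get(i + 17), -side, tol)
            && "l".equals(t.get(i + 18));
    }

    // True if the token parses as a number within tol of the expected value.
    private static boolean near(String token, double expected, double tol) {
        try {
            return Math.abs(Double.parseDouble(token) - expected) <= tol;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    private static boolean isNumber(String token) {
        try {
            Double.parseDouble(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Calling SquareFinder.findSquareY(tokens, 31.18, 0.01) on the tokens of a page would then give the y of the square, if the page has one. The tolerance is there because the two side lengths in the stream differ slightly (31.182 and 31.181).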

Then you rewrite the content stream. See the RemoveAllTexts example for 
some inspiration on how to get and rewrite the tokens.
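
The rewriting step could look roughly like the sketch below. It is plain Java working on string tokens, so it only shows the filtering logic; in PDFBox the RemoveAllTexts example does the same kind of token filtering with PDFStreamParser and ContentStreamWriter. The class SquareRemover and its method are hypothetical names. The sketch assumes the square's path is followed directly by an optional h and a path-painting operator (the message above does not show that part of the stream), and it keeps the cm line so that anything drawn afterwards is not shifted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: copies all tokens except the square's path and the
// operator that paints it.
final class SquareRemover {
    // Path-painting operators defined by the PDF specification.
    private static final Set<String> PAINT_OPS = new HashSet<>(
        Arrays.asList("S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"));
    // 0 0 m | s 0 l | s -s l | 0 -s l
    private static final int PATH_LENGTH = 12;

    static List<String> removeSquare(List<String> tokens, double side, double tol) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int paint = isSquarePath(tokens, i, side, tol)
                ? paintIndex(tokens, i + PATH_LENGTH) : -1;
            if (paint >= 0) {
                i = paint + 1;          // skip the path and its paint operator
            } else {
                out.add(tokens.get(i)); // keep everything else unchanged
                i++;
            }
        }
        return out;
    }

    // Index of the paint operator right after the path (an optional "h" may
    // come first), or -1 if the path is not painted immediately (assumption).
    private static int paintIndex(List<String> t, int from) {
        int j = from;
        if (j < t.size() && "h".equals(t.get(j))) {
            j++;
        }
        return (j < t.size() && PAINT_OPS.contains(t.get(j))) ? j : -1;
    }

    private static boolean isSquarePath(List<String> t, int i, double side, double tol) {
        return i + PATH_LENGTH <= t.size()
            && near(t.get(i), 0, tol) && near(t.get(i + 1), 0, tol)
            && "m".equals(t.get(i + 2))
            && near(t.get(i + 3), side, tol) && near(t.get(i + 4), 0, tol)
            && "l".equals(t.get(i + 5))
            && near(t.get(i + 6), side, tol) && near(t.get(i + 7), -side, tol)
            && "l".equals(t.get(i + 8))
            && near(t.get(i + 9), 0, tol) && near(t.get(i + 10), -side, tol)
            && "l".equals(t.get(i + 11));
    }

    // True if the token parses as a number within tol of the expected value.
    private static boolean near(String token, double expected, double tol) {
        try {
            return Math.abs(Double.parseDouble(token) - expected) <= tol;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

If the path is not painted right after the last l, the sketch leaves the tokens alone rather than guessing where the square ends.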

I used PDFDebugger to look at the content stream.


On 11.01.2019 at 18:17, Michel Cozzolino wrote:
> Hello,
> I’ve been using PDFBox with full satisfaction for a couple of
> months. However, I can’t find a viable solution to the problem I’m
> facing now, so I’m asking for some help.
> I have to delete from the PDF file a small black square that is on
> the odd pages along the right border of the page (here is an
> example: https://www.dropbox.com/s/tlny4qkiek5efb0/example.pdf?dl=0). I
> know the square’s dimensions, so I can easily retrieve the x coordinate;
> the problem is that I don’t have the y, as the square could be placed at
> any height on the page. I thought of covering it by placing a white
> rectangle as a strip along the right border of the page, but this is not
> feasible because some lines of text could end inside that zone.
> The rest of the page contains text and pictures, and that square is the
> only “shape” on the page. Is there a way to get it or its coordinates?
> Many thanks
> Michele
