pdfbox-users mailing list archives

From "Cem Dayanik (Ibtech-Software Infrastructure)" <cem.daya...@ibtech.com.tr>
Subject RE: fake lines
Date Tue, 17 Jan 2017 09:19:40 GMT
Original file (one page, text removed):
 4AOnePageWithoutText.pdf

Lines.pdf is generated from the parsed line info.

https://we.tl/M0M5eM4weR

This is the set of lines I am trying to get rid of:

VERTICALLINE:Point2D.Double[142.0, 116.0] Point2D.Double[142.0, 276.0]





-----Original Message-----
From: Cem Dayanik (Ibtech-Software Infrastructure)
Sent: Tuesday, January 17, 2017 11:06 AM
To: users@pdfbox.apache.org
Subject: RE: fake lines

The stroking color is the same for every line.
Comparing Graphics.color with the background color didn't work.

I tried something like discardLines4Rectangele(p0, p1, p2, p3) at the start of appendRectangle,
in case some later rectangle overrides the lines, but no luck.

I am not using linePath directly; this is what my code looks like.


        @Override
        public void appendRectangle(Point2D p0, Point2D p1, Point2D p2, Point2D p3) {
                //discardLines4Rectangele(p0, p1, p2, p3);
                this.lineList.add(new LineInfo(new Point2D.Double(p0.getX(), p0.getY()),
                                new Point2D.Double(p1.getX(), p1.getY()), this.getGraphicsState().getLineWidth()));
                this.lineList.add(new LineInfo(new Point2D.Double(p1.getX(), p1.getY()),
                                new Point2D.Double(p2.getX(), p2.getY()), this.getGraphicsState().getLineWidth()));
                this.lineList.add(new LineInfo(new Point2D.Double(p2.getX(), p2.getY()),
                                new Point2D.Double(p3.getX(), p3.getY()), this.getGraphicsState().getLineWidth()));
                this.lineList.add(new LineInfo(new Point2D.Double(p3.getX(), p3.getY()),
                                new Point2D.Double(p0.getX(), p0.getY()), this.getGraphicsState().getLineWidth()));
                super.appendRectangle(p0, p1, p2, p3);
        }


        @Override
        public void lineTo(float x, float y) {
                Point2D currentPoint = this.getCurrentPoint();
                //Graphics2D graphics = this.getGraphics();
                //PDGraphicsState state = this.getGraphicsState();
                this.lineList.add(new LineInfo(new Point2D.Double(currentPoint.getX(), currentPoint.getY()),
                                new Point2D.Double((double)x, (double)y), this.getGraphicsState().getLineWidth()));
                super.lineTo(x, y);
        }



Any other ideas?
Is it possible to debug where those lines are "ignored" during rendering?
They are buffered, right? Does the actual rendering happen only after all of this is done?
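
On the buffering question: in a PDF content stream, the path construction operators (m, l, re) only build the current path, and a later painting operator decides what happens to it (S strokes it, f fills it, n ends it without painting, typically after a clip). A minimal sketch of that idea, assuming the recording is split into a pending list and a visible list. The class PendingLineBuffer and its method names are hypothetical and not part of PDFBox; it only uses java.awt.geom.

```java
import java.awt.geom.Line2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects segments while a path is being built and
// keeps them only if the path is actually stroked afterwards.
final class PendingLineBuffer {
    private final List<Line2D> pending = new ArrayList<>();
    private final List<Line2D> visible = new ArrayList<>();

    // Call from lineTo()/appendRectangle() instead of adding to the final list.
    void add(double x1, double y1, double x2, double y2) {
        pending.add(new Line2D.Double(x1, y1, x2, y2));
    }

    // Call when the current path is stroked: the pending segments become visible.
    void onStroke() {
        visible.addAll(pending);
        pending.clear();
    }

    // Call when the current path ends without being stroked
    // (for example a clip-only path): the pending segments are dropped.
    void onDiscard() {
        pending.clear();
    }

    List<Line2D> visibleLines() {
        return visible;
    }
}
```

In a PageDrawer subclass, onStroke() would presumably be called from an override of strokePath() and onDiscard() from endPath(), each before delegating to super. Whether fill-only paths should count as visible lines is a separate decision that this sketch leaves open.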

Note: I can't find any working upload server at the moment (annoying company policy).




-----Original Message-----
From: Tilman Hausherr [mailto:THausherr@t-online.de]
Sent: Tuesday, January 17, 2017 10:03 AM
To: users@pdfbox.apache.org
Subject: Re: fake lines

Attachments usually don't go through, you'd have to upload them somewhere.

In PageDrawer.java, the lines are in "linePath". For stroke it's the stroking color.
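
As a rough illustration of reading the segments out of such a path at stroke time: linePath is a java.awt.geom.GeneralPath, so its straight segments can be listed with a PathIterator. This is only a sketch; the class PathSegments and the method straightSegments are hypothetical names, and curve segments are simply skipped.

```java
import java.awt.geom.GeneralPath;
import java.awt.geom.Line2D;
import java.awt.geom.PathIterator;
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class PathSegments {
    // Returns the straight segments of the path, including the closing
    // segment of a closed subpath. Curves only move the current point.
    static List<Line2D> straightSegments(GeneralPath path) {
        List<Line2D> out = new ArrayList<>();
        double[] c = new double[6];
        Point2D.Double start = null;   // first point of the current subpath
        Point2D.Double current = null; // last point reached so far
        for (PathIterator it = path.getPathIterator(null); !it.isDone(); it.next()) {
            switch (it.currentSegment(c)) {
                case PathIterator.SEG_MOVETO:
                    start = current = new Point2D.Double(c[0], c[1]);
                    break;
                case PathIterator.SEG_LINETO:
                    Point2D.Double next = new Point2D.Double(c[0], c[1]);
                    out.add(new Line2D.Double(current, next));
                    current = next;
                    break;
                case PathIterator.SEG_QUADTO:
                    current = new Point2D.Double(c[2], c[3]);
                    break;
                case PathIterator.SEG_CUBICTO:
                    current = new Point2D.Double(c[4], c[5]);
                    break;
                case PathIterator.SEG_CLOSE:
                    if (current != null && !current.equals(start)) {
                        out.add(new Line2D.Double(current, start));
                    }
                    current = start;
                    break;
            }
        }
        return out;
    }
}
```

Assuming your PDFBox version exposes the path to subclasses (2.0.x has a protected getLinePath(), as far as I know), an override of strokePath() could call straightSegments(getLinePath()) before delegating to super.strokePath(), so that only paths that are really stroked get recorded.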

Tilman

On 17.01.2017 at 07:45, Cem Dayanik (Ibtech-Software Infrastructure) wrote:
>
> Hello everyone,
>
> I need to extract table data from a PDF.
>
> I know there are different approaches for that, but the table has
> “gridlines”, so I need an exact solution.
>
> My problem is that when I parse the PDF with a page drawer, there are
> some lines that are not actually visible in the rendered PDF.
>
> I need to discard them, but I couldn't find out how.
>
> Obviously there is some hidden information in “graphics/graphicsstate”.
>
> (not width or background/foreground color)
>
> Please see attachments for clarification.
>
> Any help would be appreciated.
>
> Thanks.
>
> These are not “raw” lines; these are “combined” line info. The bold one is
> what I need to get rid of (actually a set of lines, not a single one).
>
> HORIZONTALLINE:Point2D.Double[31.0, 276.0] Point2D.Double[565.0, 276.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 311.0] Point2D.Double[565.0, 311.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 256.0] Point2D.Double[565.0, 256.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 236.0] Point2D.Double[565.0, 236.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 216.0] Point2D.Double[565.0, 216.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 196.0] Point2D.Double[565.0, 196.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 176.0] Point2D.Double[565.0, 176.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 156.0] Point2D.Double[565.0, 156.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 136.0] Point2D.Double[565.0, 136.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 116.0] Point2D.Double[565.0, 116.0]
>
> HORIZONTALLINE:Point2D.Double[31.0, 108.5] Point2D.Double[564.0, 108.5]
>
> VERTICALLINE:Point2D.Double[565.0, 116.0] Point2D.Double[565.0, 311.0]
>
> VERTICALLINE:Point2D.Double[31.0, 116.0] Point2D.Double[31.0, 311.0]
>
> VERTICALLINE:Point2D.Double[51.0, 116.0] Point2D.Double[51.0, 311.0]
>
> VERTICALLINE:Point2D.Double[95.0, 116.0] Point2D.Double[95.0, 311.0]
>
> VERTICALLINE:Point2D.Double[222.0, 116.0] Point2D.Double[222.0, 311.0]
>
> *VERTICALLINE:Point2D.Double[142.0, 116.0] Point2D.Double[142.0, 276.0]*
>
> VERTICALLINE:Point2D.Double[247.0, 116.0] Point2D.Double[247.0, 311.0]
>
> VERTICALLINE:Point2D.Double[287.0, 116.0] Point2D.Double[287.0, 311.0]
>
> VERTICALLINE:Point2D.Double[310.0, 116.0] Point2D.Double[310.0, 311.0]
>
> VERTICALLINE:Point2D.Double[339.0, 116.0] Point2D.Double[339.0, 311.0]
>
> VERTICALLINE:Point2D.Double[369.0, 116.0] Point2D.Double[369.0, 311.0]
>
> VERTICALLINE:Point2D.Double[402.0, 116.0] Point2D.Double[402.0, 311.0]
>
> VERTICALLINE:Point2D.Double[452.0, 116.0] Point2D.Double[452.0, 311.0]
>
> VERTICALLINE:Point2D.Double[432.0, 116.0] Point2D.Double[432.0, 311.0]
>
> VERTICALLINE:Point2D.Double[507.0, 116.0] Point2D.Double[507.0, 311.0]
>
> VERTICALLINE:Point2D.Double[537.0, 116.0] Point2D.Double[537.0, 311.0]
>
> VERTICALLINE:Point2D.Double[147.0, 116.0] Point2D.Double[147.0, 311.0]
>
>
>
>
>
> The information contained in this e-mail (including any attachments)is
> confidential. It must not be disclosed to any person without our
> authority. If you are not the intended recipient, please delete it
> from your system immediately. IBTech A.S. makes no warranty as to the
> accuracy or completeness of any information contained in this message
> and hereby excludes any liability of any kind for the information
> contained therein or for the information transmission, reception,
> storage or use of such in any way whatsoever. Any opinions expressed
> in this message are those of the author and may not necessarily
> reflect the opinions of IBTech A.S.
>



