From: Tilman Hausherr
To: users@pdfbox.apache.org
Date: Fri, 20 Mar 2015 16:04:42 +0100
Subject: Re: Looking for some guidance on using PDFBox to analyze page content
Message-ID: <550C370A.1000407@t-online.de>

Yes, by analysing the content stream operators (e.g. "l", "c"). The problem
is that underlined text, for example, consists of a drawn font (which
technically is also vector graphics) plus a line, and you won't be able to
tell easily that this line belongs to the text.

Tilman

On 20.03.2015 at 14:43, Warren Gallagher wrote:
> Greetings,
>
> Is there a means to determine if a page contains:
>
> * vector graphics
> * raster graphics (and what format)
>
> Regards,
>
> WARREN GALLAGHER - CTO
> Advance Property eXposure Canada Inc.
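
The operator-based check described in the reply could be sketched roughly as
follows. This is only an illustration, not PDFBox code: the class
PageContentClassifier, its Kind enum and the classify method are hypothetical
names, and the sketch assumes the operator names of a page's content stream
have already been extracted as plain strings (for instance with PDFBox's
content stream parser). The operator names themselves are the ones defined in
the PDF specification.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of PDFBox. */
final class PageContentClassifier {

    /** Coarse kinds of content that content stream operators can reveal. */
    enum Kind { VECTOR_PATH, TEXT, INLINE_IMAGE, XOBJECT }

    // Path-painting operators. "n" is left out on purpose: it ends a path
    // without painting it (typically after setting a clipping path).
    private static final Set<String> PATH_PAINTING = new HashSet<>(
            Arrays.asList("S", "s", "f", "F", "f*", "B", "B*", "b", "b*"));

    // Text-showing operators.
    private static final Set<String> TEXT_SHOWING = new HashSet<>(
            Arrays.asList("Tj", "TJ", "'", "\""));

    /**
     * Scans operator names (assumed to be already parsed out of the content
     * stream) and reports which kinds of content they indicate.
     */
    static EnumSet<Kind> classify(List<String> operatorNames) {
        EnumSet<Kind> found = EnumSet.noneOf(Kind.class);
        for (String op : operatorNames) {
            if (PATH_PAINTING.contains(op)) {
                found.add(Kind.VECTOR_PATH);
            } else if (TEXT_SHOWING.contains(op)) {
                found.add(Kind.TEXT);
            } else if ("BI".equals(op)) {
                // BI starts an inline image, i.e. raster data in the stream.
                found.add(Kind.INLINE_IMAGE);
            } else if ("Do".equals(op)) {
                // Do paints an XObject: either an image or a form. Telling
                // them apart requires looking the name up in the resources.
                found.add(Kind.XOBJECT);
            }
        }
        return found;
    }
}
```

For an underlined word, the operator sequence would typically look like
"BT", "Tf", "Td", "Tj", "ET", "m", "l", "S", and classify reports both TEXT
and VECTOR_PATH. That matches the caveat in the reply: the operators show that
a line was stroked, but not that it belongs to the text. Note also that a
"Do" hit alone does not prove a raster image is present, since form XObjects
use the same operator and may themselves contain further paths, text or
images.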