From: Maruan Sahyoun
Subject: Re: Extract text using pdfbox
Date: Tue, 16 Apr 2013 11:39:43 +0200
To: users@pdfbox.apache.org

Hi Rahul,

PDF is a binary format, and text that appears as a single line on the page may be stored as several separate pieces within the file. I think the easiest option for you is to start with the ExtractText command line tool and review the result: http://pdfbox.apache.org/commandlineutilities/ExtractText.html. Use the sort option to arrange the text by its position.

BR
Maruan Sahyoun

On 16.04.2013 at 11:35, rahul bhalla wrote:

> hi
> Actually I searched various sites and read different forums but was not able to
> find a way to read a single line from a specific page number and also
> extract the properties of that line.
> Is there any way to read a PDF using the readLine() method of bufferReader
> or some other way?
> Please suggest.
> --
> Regards
> Rahul Bhalla
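
To illustrate why sorting by position matters, here is a minimal sketch in plain Java. It is not PDFBox code and not how ExtractText is implemented: the `Fragment` class, the `assemble` and `nthLine` methods, and the sample coordinates are all hypothetical, and it assumes that pieces with exactly the same y value belong to the same visual line. It only shows the idea that pieces stored out of order can be sorted into reading order, after which the resulting plain text can be read line by line with `BufferedReader.readLine()`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortedLinesSketch {

    /** Hypothetical text piece: some text plus the position where it is drawn. */
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /** Sorts pieces top to bottom, then left to right, and joins them into lines. */
    static String assemble(List<Fragment> fragments) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        // Assumption: y grows downward, so a smaller y is higher on the page.
        sorted.sort(Comparator.comparingDouble((Fragment f) -> f.y)
                .thenComparingDouble(f -> f.x));

        StringBuilder out = new StringBuilder();
        Fragment previous = null;
        for (Fragment f : sorted) {
            if (previous != null) {
                // Same y is treated as the same line; a real extractor needs a tolerance.
                out.append(f.y == previous.y ? " " : "\n");
            }
            out.append(f.text);
            previous = f;
        }
        return out.toString();
    }

    /** Returns the 1-based line of already extracted text, or null if it does not exist. */
    static String nthLine(String text, int lineNumber) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line = null;
            for (int i = 0; i < lineNumber; i++) {
                line = reader.readLine();
                if (line == null) {
                    return null;
                }
            }
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pieces listed in an arbitrary storage order, not in reading order.
        List<Fragment> stored = Arrays.asList(
                new Fragment(60, 10, "world"),
                new Fragment(10, 20, "Second line"),
                new Fragment(10, 10, "Hello"));

        String text = assemble(stored);
        System.out.println(nthLine(text, 1)); // prints "Hello world"
    }
}
```

The same `readLine()` loop would work on the text file written by ExtractText, since that output is ordinary text; `readLine()` cannot be applied to the PDF file itself because the PDF is binary.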