pdfbox-users mailing list archives

From Muhammad Ismail <it.is.ism...@gmail.com>
Subject Re: Look for text in pdf file and then extract it
Date Fri, 11 Mar 2016 06:32:07 GMT
Try extracting the text from the PDF, building a search index from that
text, and then running your search against the index.
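
As a rough sketch of the "develop this on top of the text extraction"
approach, the snippet below takes text that has already been extracted
(for example with PDFBox's PDFTextStripper.getText(document)) and returns
every sentence containing a given term. The class name SentenceFinder and
the method findSentences are made up for illustration and are not part of
PDFBox. Sentence splitting here relies on java.text.BreakIterator, which
is only a heuristic and may split incorrectly on abbreviations or on
unusual PDF layouts.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: finds sentences containing a term.
final class SentenceFinder {

    // Returns all sentences in 'text' that contain 'term' (case-insensitive).
    static List<String> findSentences(String text, String term) {
        // Extracted PDF text usually has line breaks inside sentences,
        // so collapse every run of whitespace into a single space first.
        String normalized = text.replaceAll("\\s+", " ").trim();
        String needle = term.toLowerCase(Locale.ROOT);

        List<String> hits = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(normalized);

        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = normalized.substring(start, end).trim();
            if (sentence.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(sentence);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // In real use, 'text' would come from the PDF text extraction step.
        String text = "The board\nof directors met in March. Profits rose. "
                + "The Board of Directors approved the budget.";
        for (String s : findSentences(text, "board of directors")) {
            System.out.println(s);
        }
    }
}
```

For a large set of reports, a proper search index (as suggested above) is
likely a better fit than scanning each document's text this way.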

On Fri, Mar 11, 2016 at 10:37 AM, Tilman Hausherr <THausherr@t-online.de>
wrote:

> On 11.03.2016 at 04:50, Najib Sahyoun wrote:
>
> >> Hello,
>>
>>
>> I am Najib Sahyoun, PhD student in accounting.
>>
>>
> >> I am looking for an application that searches for a specific term (e.g.
> >> board of directors) and then extracts all the sentences that include
> >> the term (board of directors).
>>
>>
>> Does your application perform this?
>>
>
> No. You'll have to develop this on top of the text extraction or hire
> someone to do it. PDFBox only extracts the text.
>
> Tilman


-- 
Thanks
Muhammad Ismail
cell (PAK) : +92.322.5100362
cell (Sweden): +46 700-321-521
e-mail: it.is.ismail@gmail.com

