pdfbox-users mailing list archives

From Gilad Denneboom <gilad.denneb...@gmail.com>
Subject Re: How to Split PDF based on contents inside
Date Thu, 19 Feb 2015 11:28:53 GMT
AFAIK, there's no easy, out-of-the-box way of doing that with PDFBox.
You would need to develop your own code to identify the text you're after
and then extract the pages that are associated with it as new files. The
way to do that would depend a lot on how the files are set up.
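
To give a rough idea of the "identify the text" part, here is a minimal
sketch in plain Java. It assumes you already have the text of each page as a
String (with PDFBox that would typically come from PDFTextStripper, run one
page at a time), that every defect starts on a new page, that its ID matches a
pattern such as "Defect ID: 123", and that a defect's pages are contiguous.
The class name, method name and ID pattern are hypothetical, not anything
PDFBox provides, and none of this is tested against your actual file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: works on plain page text and has no PDFBox dependency.
class DefectPageRanges {

    /**
     * Maps each defect ID to the 1-based {firstPage, lastPage} range it covers.
     * Assumption: a defect starts on the page where its ID first appears and
     * runs until the page before the next different ID (or the last page).
     * Only the first ID found on a page is considered; pages before the first
     * ID are skipped.
     */
    static Map<String, int[]> findRanges(List<String> pageTexts, Pattern idPattern) {
        Map<String, int[]> ranges = new LinkedHashMap<>();
        String currentId = null;
        for (int i = 0; i < pageTexts.size(); i++) {
            int pageNumber = i + 1;
            Matcher m = idPattern.matcher(pageTexts.get(i));
            if (m.find()) {
                String id = m.group(1);
                if (!id.equals(currentId)) {
                    // A different ID on this page: a new defect begins here.
                    currentId = id;
                    ranges.putIfAbsent(id, new int[] {pageNumber, pageNumber});
                }
            }
            if (currentId != null) {
                // Extend the current defect's range to include this page.
                ranges.get(currentId)[1] = pageNumber;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Stand-in page texts; in practice these come from the PDF.
        List<String> pages = Arrays.asList(
                "Defect ID: 101 Login fails",
                "steps to reproduce ...",
                "Defect ID: 102 Report is blank");
        Pattern idPattern = Pattern.compile("Defect ID: (\\d+)"); // assumed format
        for (Map.Entry<String, int[]> e : findRanges(pages, idPattern).entrySet()) {
            System.out.println(e.getKey() + ": pages "
                    + e.getValue()[0] + "-" + e.getValue()[1]);
        }
    }
}
```

Each resulting page range would then be copied into its own new document and
saved under the defect ID. With a file that large, you would probably want to
extract the text page by page (PDFTextStripper's setStartPage and setEndPage)
rather than for the whole document at once. If several defects can share a
page, a page-level split like this is not enough.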

I've developed various such tools in the past for my customers, so if
you're interested in having someone develop it for you, feel free to
contact me privately.

On Wed, Feb 18, 2015 at 7:05 PM, <Ganesh.Yadav@sungard.com> wrote:

> I have a PDF document that is 1 GB in size. It contains a lot of defect /
> issue data, and every defect has a number. I want to break this large PDF
> down into multiple smaller PDFs so that I have one PDF for each separate
> defect ID.
>
> I am looking for functionality that will allow me to pass a search string
> to look for inside the large PDF document, and to break the document down
> based on the start and end of each defect ID.
>
> Can someone please suggest how this can be achieved using PDFBox?
>
> Thanks
> Ganesh
>
