pdfbox-users mailing list archives

From <j.tosov...@email.cz>
Subject Re: Extract embedded SVG image from PDF file
Date Thu, 07 Mar 2019 09:11:25 GMT

> We have access to the sources (Website), but this is time 
> consuming. Partly, there are web services, which we can use, but not for 
> all tasks. The PDF files are generated automatically by schedule, so this 
> way can be fully automated. 

If your SVG data is available on a website and you would rather extract it 
in bulk from PDF snapshots of those pages than download the files one by 
one, I'd recommend avoiding the PDF route and automating the SVG download 
step instead.

First, I'd ask the app developers to provide an API for getting the data 
via a web service. Only if there is no other option would I try guessing 
the SVG image URL for each page/article. If the image URL can be derived 
from the page URL, automation is easy. If not, you could automate your 
manual steps with a browser testing tool, see e.g. 
https://www.seleniumhq.org/.
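
For the "guessable URL" case, a bulk download could look roughly like the 
Python sketch below. Everything site-specific in it is an assumption: the 
example.com host, the /articles/<id>.html to /images/<id>.svg mapping, and 
the function names are made up for illustration, so the rule would have to 
be adapted to whatever pattern your site really uses.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def guess_svg_url(page_url: str) -> str:
    """Derive the SVG URL from a page URL.

    Hypothetical rule (an assumption, not taken from any real site):
    /articles/<id>.html  ->  /images/<id>.svg on the same host.
    """
    parts = urlsplit(page_url)
    article_id = PurePosixPath(parts.path).stem  # "42" for /articles/42.html
    svg_path = f"/images/{article_id}.svg"
    return urlunsplit((parts.scheme, parts.netloc, svg_path, "", ""))


def download_svgs(page_urls, out_dir="svg"):
    """Fetch the guessed SVG for every page URL and save it under out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for page_url in page_urls:
        svg_url = guess_svg_url(page_url)
        with urlopen(svg_url, timeout=30) as response:
            data = response.read()
        file_name = PurePosixPath(urlsplit(svg_url).path).name
        (target / file_name).write_bytes(data)


# Example call (hypothetical URLs, not executed here):
# download_svgs(["https://example.com/articles/42.html"])
```

Run from a scheduler, this would replace the PDF step entirely. If no such 
pattern exists, the same loop could be driven by a Selenium script that 
repeats the manual clicks instead.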
