pdfbox-users mailing list archives

From Gilad Denneboom <gilad.denneb...@gmail.com>
Subject Re: Issue with PDF - Image conversion
Date Tue, 23 Jul 2013 12:02:34 GMT
I'm now encountering the same issue myself, ironically... Any ideas on
possible ways to solve this issue when the fonts are not fully embedded?
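
One quick way to tell whether a font is only a subset is its name in the PDF: subset fonts carry a tag of six uppercase letters followed by "+" in front of the PostScript name (for example "ABCDEF+Helvetica-Bold"). Below is a minimal sketch of that check in plain Java. The class name, method name, and sample font names are made up for illustration; the names to test would come from the PDF's font list (for instance, as shown in a viewer's document properties). It only inspects the name and says nothing about how PDFBox renders the font.

```java
import java.util.regex.Pattern;

public class SubsetFontNames {
    // Subset tag: exactly six uppercase letters, then '+', then the font name.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Returns true if the font name carries a subset tag.
    static boolean isSubset(String baseFontName) {
        return baseFontName != null && SUBSET_TAG.matcher(baseFontName).matches();
    }

    public static void main(String[] args) {
        // Hypothetical font names, for illustration only.
        String[] names = { "ABCDEF+Helvetica-Bold", "TimesNewRomanPSMT" };
        for (String name : names) {
            String verdict = isSubset(name) ? "subset" : "no subset tag";
            System.out.println(name + " -> " + verdict);
        }
    }
}
```

A name without the tag does not prove the font is fully embedded; it may not be embedded at all, in which case the renderer has to fall back on whatever fonts are installed on the machine.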


On Tue, Jun 18, 2013 at 5:03 PM, Gilad Denneboom
<gilad.denneboom@gmail.com>wrote:

> This is not related to PDFBox... It's about how you're generating the
> files (in InDesign, from the document properties).
>
>
> On Tue, Jun 18, 2013 at 4:50 PM, Robin Thomas Panicker <robin@qburst.com>wrote:
>
>> Thanks Gilad, can you please provide me some more insight on that... maybe
>> a code snippet or some reference or pointer or something?
>>
>> Regards,
>> Robin
>>
>>
>>
>> On Tue, Jun 18, 2013 at 6:10 PM, Gilad Denneboom
>> <gilad.denneboom@gmail.com>wrote:
>>
>> > Seems like it might be a font issue... Try embedding the full font instead
>> > of just the subset when generating the file.
>> >
>> >
>> > On Tue, Jun 18, 2013 at 2:30 PM, Robin Thomas Panicker <robin@qburst.com> wrote:
>> >
>> > > Sorry about that Gilad.
>> > > I have uploaded the same
>> > > here<https://www.dropbox.com/sh/ujrgmh47zku0zm9/h8z_4SR3Aw>
>> > >
>> > > Hope this helps,
>> > >
>> > > Thanks,
>> > > Robin
>> > >
>> > >
>> > >
>> > > On Tue, Jun 18, 2013 at 5:41 PM, Gilad Denneboom
>> > > <gilad.denneboom@gmail.com>wrote:
>> > >
>> > > > I'm not seeing any attachments... It's possible the mailing list doesn't
>> > > > allow them. You can upload them to some file-sharing site and post the
>> > > > links here.
>> > > >
>> > > >
>> > > > On Tue, Jun 18, 2013 at 7:38 AM, Robin Thomas Panicker <robin@qburst.com> wrote:
>> > > >
>> > > > > Thanks a lot Gilad and Andreas,
>> > > > > I was out of town last week and hence could not reply.
>> > > > >
>> > > > > I have attached the sample PDF and the image generated (only for the
>> > > > > first page).
>> > > > >
>> > > > > If you compare the original PDF and the converted image, the words "The
>> > > > > pressures" and "The solution" are not rendered correctly in the converted
>> > > > > image. The rest of the image looks fine.
>> > > > >
>> > > > > I have also attached some very crude Java code that does the standalone
>> > > > > task of converting this PDF into an image.
>> > > > >
>> > > > > Can you please let me know what could be causing the image issue?
>> > > > >
>> > > > > Thanks,
>> > > > > Robin
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Tue, Jun 11, 2013 at 5:37 PM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
>> > > > >
>> > > > >> Hi,
>> > > > >>
>> > > > >> On 10.06.2013 11:15, Robin Thomas Panicker wrote:
>> > > > >>
>> > > > >>> Thanks a lot, Gilad, for responding. I was not sure what more information
>> > > > >>> to provide. Now that you have asked for specific details, let me provide
>> > > > >>> you with more information.
>> > > > >>>
>> > > > >>> I am using the code below for the PDF-to-image conversion (trying to
>> > > > >>> save the first page of the PDF as an image file).
>> > > > >>>
>> > > > >>>   String pdfFile = "d:/hs/4.pdf";
>> > > > >>>   document = PDDocument.load( pdfFile );
>> > > > >>>
>> > > > >>>   List pages = document.getDocumentCatalog().getAllPages();
>> > > > >>>   PDPage page = ( PDPage ) pages.get( 0 );
>> > > > >>>   int width = ( int ) page.getArtBox().getWidth();
>> > > > >>>   int height = ( int ) page.getArtBox().getHeight();
>> > > > >>>   BufferedImage image = page.convertToImage( imageType, resolution );
>> > > > >>>
>> > > > >>>
>> > > > >>> On the machine (prod server) where the conversion does NOT work, I have
>> > > > >>> Ubuntu 12.04 and OpenOffice 3.0, while on the machine (development
>> > > > >>> machine) where the conversion works, I have Ubuntu 10.10 and
>> > > > >>> OpenOffice 3.0.
>> > > > >>>
>> > > > >>> Both machines run the same code, and the PDFBox version on both is 1.8.1.
>> > > > >>>
>> > > > >>> The issue is that the image conversion simply doesn't work correctly (I
>> > > > >>> can see parts of the image / text garbled or missing). There are no
>> > > > >>> errors or warnings in the log output.
>> > > > >>>
>> > > > >>> Please let me know if I can provide any more information to help in
>> > > > >>> understanding the problem.
>> > > > >>>
>> > > > >> Without a sample pdf this is just a guess:
>> > > > >>
>> > > > >> The fact that you are using OpenOffice 3.0 leads to the assumption that
>> > > > >> the PDF in question contains fonts as embedded subsets. Those are not
>> > > > >> fully supported by PDFBox. There are various issues with those kinds of
>> > > > >> fonts. As you are using different platforms (Ubuntu 10.10 vs 12.04) you
>> > > > >> are most likely using different versions of the JDK (1.6 vs 1.7). There
>> > > > >> are some 1.7-specific issues with embedded font subsets.
>> > > > >>
>> > > > >>
>> > > > >>  Thanks,
>> > > > >>> Robin
>> > > > >>>
>> > > > >>>
>> > > > >>>
>> > > > >>> On Mon, Jun 10, 2013 at 2:25 PM, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:
>> > > > >>>
>> > > > >>>> A lot of information is missing there... How are you converting the PDF
>> > > > >>>> files, exactly? What type of problems do you encounter? Which version of
>> > > > >>>> PDFBox do you use? And what does it have to do with your Office suite?
>> > > > >>>>
>> > > > >>>> Without more information it's impossible to help you with your problem.
>> > > > >>>>
>> > > > >>>>
>> > > > >>>> On Mon, Jun 10, 2013 at 8:22 AM, Robin Thomas Panicker <robin@qburst.com> wrote:
>> > > > >>>>>
>> > > > >>>>
>> > > > >>>>> Hi,
>> > > > >>>>> I am using PDFBox to convert PDF documents into images. However, on
>> > > > >>>>> some machines I am facing an issue. The conversion does not happen
>> > > > >>>>> correctly. I can see missing text / images etc.
>> > > > >>>>>
>> > > > >>>>> Please note that this happens only on a few machines. I use Ubuntu and
>> > > > >>>>> OpenOffice. I have tried a variety of combinations of different
>> > > > >>>>> versions of Ubuntu and OpenOffice (and even LibreOffice).
>> > > > >>>>>
>> > > > >>>>> However, I am unable to find out why it does not work on some machines.
>> > > > >>>>>
>> > > > >>>>> Can anyone please help?
>> > > > >>>>>
>> > > > >>>>> Thanks,
>> > > > >>>>> Robin
>> > > > >>>>>
>> > > > >>>>
>> > > > >> BR
>> > > > >> Andreas Lehmkühler
>> > > > >>
>> > > > >>
>> > > > >
>> > > > >
>> > > > > --
>> > > > >
>> > > > > Robin Panicker,
>> > > > > Q*Burst*
>> > > > > www.qburst.com
>> > > > > Skype: Robin.at.qburst
>> > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > >
>> >
>>
>>
>>
>>
>
>
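
Andreas's guess in the thread above, that the two machines run different JDK versions, can be checked by printing the runtime details on each machine and comparing the output. A minimal sketch using only standard Java system properties follows; the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EnvReport {
    // Standard system properties worth comparing between the two machines.
    private static final String[] KEYS = {
        "java.version", "java.vendor", "os.name", "os.version", "file.encoding"
    };

    // Collects the properties in a fixed order so the two reports are easy to diff.
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            info.put(key, System.getProperty(key));
        }
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Run it with the same `java` binary that runs the conversion on each machine; a difference in `java.version` between the working and the failing machine would support the JDK explanation.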
