pdfbox-users mailing list archives

From Gilad Denneboom <gilad.denneb...@gmail.com>
Subject Re: Issue with PDF - Image conversion
Date Tue, 18 Jun 2013 12:40:49 GMT
This looks like it might be a font issue. Try embedding the full fonts instead
of just subsets when generating the file.
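
Since Andreas suspects (quoted below) that the two machines run different JDK versions, it may also help to print the runtime details on both and compare them. This is only a minimal sketch using standard JVM system properties; the class name EnvReport and the choice of properties are my own assumptions, not anything from PDFBox.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of PDFBox): collects runtime details
// that may differ between the working and the failing machine.
class EnvReport {

    // Standard JVM system property keys; this selection is an assumption.
    static final String[] KEYS = {
        "java.version", "java.vendor", "java.home", "os.name", "os.version"
    };

    // Returns the properties in a fixed order so two reports can be diffed.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            report.put(key, System.getProperty(key, "(unset)"));
        }
        return report;
    }

    // Formats the report as "key = value" lines.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : collect().entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Run it on both the prod server and the development machine and compare the output line by line; a difference in java.version would support the JDK theory.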


On Tue, Jun 18, 2013 at 2:30 PM, Robin Thomas Panicker <robin@qburst.com> wrote:

> Sorry about that, Gilad.
> I have uploaded the files here:
> <https://www.dropbox.com/sh/ujrgmh47zku0zm9/h8z_4SR3Aw>
>
> Hope this helps,
>
> Thanks,
> Robin
>
>
>
> On Tue, Jun 18, 2013 at 5:41 PM, Gilad Denneboom
> <gilad.denneboom@gmail.com> wrote:
>
> > I'm not seeing any attachments... It's possible the mailing list doesn't
> > allow them. You can upload them to some file-sharing site and post the
> > links here.
> >
> >
> > On Tue, Jun 18, 2013 at 7:38 AM, Robin Thomas Panicker <robin@qburst.com>
> > wrote:
> >
> > > Thanks a lot Gilad and Andreas,
> > > I was out of town last week and hence could not reply.
> > >
> > > I have attached the sample PDF and the image generated (only for the
> > > first page).
> > >
> > > If you compare the original PDF with the converted image, the words "The
> > > pressures" and "The solution" are not rendered correctly in the converted
> > > image. The rest of the image looks fine.
> > >
> > > I have also attached some very crude Java code that performs the
> > > standalone task of converting this PDF into an image.
> > >
> > > Can you please let me know what could be causing the image issue?
> > >
> > > Thanks,
> > > Robin
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Jun 11, 2013 at 5:37 PM, Andreas Lehmkuehler <andreas@lehmi.de>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> On 10.06.2013 11:15, Robin Thomas Panicker wrote:
> > >>
> > >>> Thanks a lot, Gilad, for responding. I was not sure what more
> > >>> information to provide. Now that you have asked for specific details,
> > >>> let me provide more information.
> > >>>
> > >>> I am using the code below to convert the PDF to an image (trying to
> > >>> save the first page of the PDF as an image file):
> > >>>
> > >>>   String pdfFile = "d:/hs/4.pdf";
> > >>>   document = PDDocument.load( pdfFile );
> > >>>
> > >>>   // First page of the document (PDFBox 1.8.x API)
> > >>>   List pages = document.getDocumentCatalog().getAllPages();
> > >>>   PDPage page = ( PDPage ) pages.get( 0 );
> > >>>   int width = ( int ) page.getArtBox().getWidth();
> > >>>   int height = ( int ) page.getArtBox().getHeight();
> > >>>   // imageType and resolution are defined elsewhere in the code
> > >>>   BufferedImage image = page.convertToImage( imageType, resolution );
> > >>>
> > >>>
> > >>> On the machine (prod server) where the conversion DOES NOT work, I have
> > >>> Ubuntu 12.04 and OpenOffice 3.0, while on the machine (development
> > >>> machine) where the conversion works, I have Ubuntu 10.10 and
> > >>> OpenOffice 3.0.
> > >>>
> > >>> I am using the same code on both machines, and the PDFBox version on
> > >>> both is 1.8.1.
> > >>>
> > >>> The issue is that the image conversion simply doesn't work correctly
> > >>> (I can see parts of the image / text garbled or missing). There is no
> > >>> error or warning in the log output.
> > >>>
> > >>> Please let me know if I can provide you with any more information in
> > >>> understanding the problem
> > >>>
> > >> Without a sample pdf this is just a guess:
> > >>
> > >> The fact that you are using OpenOffice 3.0 leads to the assumption that
> > >> the PDF in question contains fonts as embedded subsets. Those are not
> > >> fully supported by PDFBox. There are different issues with those kinds
> > >> of fonts.
> > >> As you are using different platforms (Ubuntu 10.10 vs 12.04), you are
> > >> most likely using different versions of the JDK (1.6 vs 1.7). There are
> > >> some 1.7-specific issues with embedded font subsets.
> > >>
> > >>
> > >>  Thanks,
> > >>> Robin
> > >>>
> > >>>
> > >>>
> > >>> On Mon, Jun 10, 2013 at 2:25 PM, Gilad Denneboom
> > >>> <gilad.denneboom@gmail.com> wrote:
> > >>>
> > >>>> A lot of information is missing there... How are you converting the
> > >>>> PDF files, exactly? What type of problems do you encounter? Which
> > >>>> version of PDFBox do you use? And what does it have to do with your
> > >>>> Office suite?
> > >>>>
> > >>>> Without more information it's impossible to help you with your
> > >>>> problem.
> > >>>>
> > >>>>
> > >>>> On Mon, Jun 10, 2013 at 8:22 AM, Robin Thomas Panicker
> > >>>> <robin@qburst.com> wrote:
> > >>>>
> > >>>>> Hi,
> > >>>>> I am using PDFBox to convert PDF documents into images. However, on
> > >>>>> some machines I am facing an issue: the conversion is not done
> > >>>>> correctly. I can see missing text / images etc.
> > >>>>>
> > >>>>> Please note that this happens only on a few machines. I use Ubuntu
> > >>>>> and OpenOffice. I have tried a variety of combinations of different
> > >>>>> versions of Ubuntu and OpenOffice (and even LibreOffice).
> > >>>>>
> > >>>>> However, I am unable to find out why it does not work on some
> > >>>>> machines.
> > >>>>>
> > >>>>> Can anyone please help?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Robin
> > >>>>>
> > >>>>
> > >> BR
> > >> Andreas Lehmkühler
> > >>
> > >>
> > >
> > >
> > > --
> > >
> > > Robin Panicker,
> > > Q*Burst*
> > > www.qburst.com
> > > Skype: Robin.at.qburst
> > >
> > >
> >
>
>
>
> --
>
> Robin Panicker,
> Q*Burst*
> www.qburst.com
> Skype: Robin.at.qburst
>
