pdfbox-users mailing list archives

From Robin Thomas Panicker <ro...@qburst.com>
Subject Re: Issue with PDF - Image conversion
Date Tue, 18 Jun 2013 12:30:21 GMT
Sorry about that, Gilad.
I have uploaded the files here:
<https://www.dropbox.com/sh/ujrgmh47zku0zm9/h8z_4SR3Aw>

Hope this helps,

Thanks,
Robin



On Tue, Jun 18, 2013 at 5:41 PM, Gilad Denneboom
<gilad.denneboom@gmail.com> wrote:

> I'm not seeing any attachments... It's possible the mailing list doesn't
> allow them. You can upload them to some file-sharing site and post the
> links here.
>
>
> On Tue, Jun 18, 2013 at 7:38 AM, Robin Thomas Panicker
> <robin@qburst.com> wrote:
>
> > Thanks a lot Gilad and Andreas,
> > I was out of town last week and hence could not reply.
> >
> > I have attached the sample PDF and the image generated (only for the
> > first page).
> >
> > If you compare the original PDF and the converted image, the words "The
> > pressures" and "The solution" are not rendered correctly in the converted
> > image. The rest of the image looks fine.
> >
> > I have also attached some very crude Java code that does the standalone
> > task of converting this PDF into an image.
> >
> > Can you please let me know what could possibly be causing the image
> > issue?
> >
> > Thanks,
> > Robin
> >
> >
> >
> >
> >
> > On Tue, Jun 11, 2013 at 5:37 PM, Andreas Lehmkuehler
> > <andreas@lehmi.de> wrote:
> >
> >> Hi,
> >>
> >> On 10.06.2013 11:15, Robin Thomas Panicker wrote:
> >>
> >>> Thanks a lot, Gilad, for responding. I was not sure what more
> >>> information to provide. Now that you have asked me for specific
> >>> details, let me provide you with more information.
> >>>
> >>> I am using the code below to do the PDF-to-image conversion (trying to
> >>> save the first page of the PDF as an image file):
> >>>
> >>>   String pdfFile = "d:/hs/4.pdf";
> >>>   document = PDDocument.load( pdfFile );
> >>>
> >>>   List pages = document.getDocumentCatalog().getAllPages();
> >>>   PDPage page = ( PDPage ) pages.get( 0 );
> >>>   int width = ( int ) page.getArtBox().getWidth();
> >>>   int height = ( int ) page.getArtBox().getHeight();
> >>>   BufferedImage image = page.convertToImage( imageType, resolution );
> >>>
> >>>
> >>> On the machine (prod server) where the conversion DOES NOT work, I have
> >>> Ubuntu 12.04 and OpenOffice 3.0,
> >>> while on the machine (development machine) where the conversion works, I
> >>> have Ubuntu 10.10 and OpenOffice 3.0.
> >>>
> >>> On both machines I am using the same code, and the version of PDFBox on
> >>> both is 1.8.1.
> >>>
> >>> The issue I face is that the image conversion simply doesn't work
> >>> correctly (I can see parts of the image / text garbled or missing). There
> >>> is no error or warning in the log output.
> >>>
> >>> Please let me know if I can provide you with any more information to
> >>> help in understanding the problem.
> >>>
> >> Without a sample PDF this is just a guess:
> >>
> >> The fact that you are using OpenOffice 3.0 suggests that the PDF in
> >> question contains fonts embedded as subsets. Those are not fully supported
> >> by PDFBox, and there are several known issues with that kind of font.
> >> As you are using different platforms (Ubuntu 10.10 vs 12.04), you are most
> >> likely using different versions of the JDK (1.6 vs 1.7). There are some
> >> 1.7-specific issues with embedded font subsets.
> >>
> >>
> >>> Thanks,
> >>> Robin
> >>>
> >>>
> >>>
> >>> On Mon, Jun 10, 2013 at 2:25 PM, Gilad Denneboom
> >>> <gilad.denneboom@gmail.com> wrote:
> >>>
> >>>> A lot of information is missing there... How are you converting the PDF
> >>>> files, exactly? What type of problems do you encounter? Which version of
> >>>> PDFBox do you use? And what does it have to do with your Office suite?
> >>>>
> >>>> Without more information it's impossible to help you with your problem.
> >>>>
> >>>>
> >>>> On Mon, Jun 10, 2013 at 8:22 AM, Robin Thomas Panicker
> >>>> <robin@qburst.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>> I am using PDFBox to convert PDF documents into images. However, on
> >>>>> some machines I am facing an issue. The conversion does not happen
> >>>>> correctly. I can see missing text / images etc.
> >>>>>
> >>>>> Please note that this happens only on a few machines. I use Ubuntu and
> >>>>> OpenOffice. I have tried a variety of combinations of different
> >>>>> versions of Ubuntu and OpenOffice (and even LibreOffice).
> >>>>>
> >>>>> However, I am unable to find out why it does not work on some machines.
> >>>>>
> >>>>> Can anyone please help?
> >>>>>
> >>>>> Thanks,
> >>>>> Robin
> >>>>>
> >>>>
> >> BR
> >> Andreas Lehmkühler
> >>
> >>
> >
> >
> > --
> >
> > Robin Panicker,
> > Q*Burst*
> > www.qburst.com
> > Skype: Robin.at.qburst
> >
> >
>
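
For anyone comparing the two machines along the lines Andreas suggests (JDK version and embedded font subsets), here is a minimal sketch of a check that can be run on each box. It prints the JVM details and flags font names that carry a subset tag; per the PDF specification, a subset font's name starts with six uppercase letters followed by "+". The class name, the helper isSubsetFontName, and the sample font names are made up for illustration. In practice the names would come from the /BaseFont entries of the PDF in question, which this sketch does not read.

```java
import java.util.regex.Pattern;

class FontSubsetCheck {
    // Subset fonts are named like "ABCDEF+FontName":
    // a tag of exactly six uppercase letters, then '+', then the font name.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    // Hypothetical helper: true if the base font name carries a subset tag.
    static boolean isSubsetFontName(String baseFontName) {
        return baseFontName != null && SUBSET_TAG.matcher(baseFontName).matches();
    }

    public static void main(String[] args) {
        // Run this on both machines and compare the output.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.vendor  = " + System.getProperty("java.vendor"));
        System.out.println("os.name      = " + System.getProperty("os.name")
                + " " + System.getProperty("os.version"));

        // Sample names only (assumption); replace with the names found in the PDF.
        String[] sampleNames = { "ABCDEF+DejaVuSans", "Helvetica" };
        for (String name : sampleNames) {
            System.out.println(name + " -> subset: " + isSubsetFontName(name));
        }
    }
}
```

If the two machines report different java.version values and the PDF's fonts are subsets, that would be consistent with Andreas's guess, though it does not prove it.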



-- 

Robin Panicker,
Q*Burst*
www.qburst.com
Skype: Robin.at.qburst
