pdfbox-users mailing list archives

From Claudius Teodorescu <claudius.teodore...@gmail.com>
Subject Re: Rendering of a Devanagari text
Date Fri, 20 Jan 2017 14:03:54 GMT
Well, I hope I am not doing what you describe, as I am editing in Eclipse on
Ubuntu, and I am compiling with Maven as UTF-8.

On Fri, Jan 20, 2017 at 1:56 PM, Lachezar Dobrev <l.dobrev@gmail.com> wrote:

>   Apologies for being blunt, but seeing that you're mixing string
> literals and UNICODE escape sequences, I have to ask: are you *sure*
> you're using the same character set when editing the .java file and
> when compiling it? I've had discrepancies when editing the java file
> in one encoding (say UTF-8), but the automated build system uses ISO
> 8859-1, and literal non-Latin characters get mangled, sans those
> written as UNICODE escape sequences, since those are in the ASCII
> range.
>
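The mismatch described above can be reproduced without a build system. The sketch below is only an illustration: the class and method names are made up for this example, and it simulates the compiler's misreading by decoding UTF-8 bytes as ISO 8859-1. A literal Devanagari letter turns into several unrelated characters, while the text of a Unicode escape sequence is pure ASCII and survives unchanged.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    /** Simulates a tool that reads UTF-8 encoded source text as ISO 8859-1. */
    public static String misread(String sourceText) {
        byte[] utf8Bytes = sourceText.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // DEVANAGARI LETTER TA, standing in for a literal typed into the .java file.
        String literal = "\u0924";
        String mangled = misread(literal);
        // One character becomes three, because its UTF-8 form is three bytes long.
        System.out.println(literal.length() + " char -> " + mangled.length() + " chars");

        // The six ASCII characters of an escape sequence, as they appear in the source file.
        String escapeAsTyped = "\\ue10d";
        // ASCII bytes mean the same thing in both encodings, so nothing changes.
        System.out.println(misread(escapeAsTyped).equals(escapeAsTyped));
    }
}
```

If the encodings do differ, passing `-encoding UTF-8` to javac, or setting the `project.build.sourceEncoding` property in a Maven build, is the usual way to make the compiler agree with the editor.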
> 2017-01-19 14:17 GMT+02:00 Claudius Teodorescu <claudius.teodorescu@gmail.com>:
> > So, I found the private-use Unicode code point for a ligature, and displayed
> > it in a PDF document by using the code:
> >
> > pageContentStream.showText("त्त्व is correctly displayed with glyph substitution as " + "\ue10d");
> >
> > The result is in the attached file.
> >
> > So, it looks like all that is needed is to pass in the string with all
> > the glyph substitution already done. With this approach, PDFBox is left
> > untouched.
> >
> > Cheers from Heidelberg,
> > Claudius
> >
> > On Tue, Jan 17, 2017 at 8:55 AM, Tilman Hausherr <THausherr@t-online.de>
> > wrote:
> >>
> >> Am 17.01.2017 um 07:32 schrieb Claudius Teodorescu:
> >>>
> >>> Well, I was just about to congratulate myself for fixing this with
> >>> PDFBox, as FOP is returning good output, but with a character that is
> >>> represented in half.
> >>>
> >>> So, I guess I will need a text layout engine. What output of such an
> >>> engine would be fit for PDFBox?
> >>
> >>
> >> In PDPageContentStream.showText there is this line:
> >>
> >> COSWriter.writeString(font.encode(text), getOutput());
> >>
> >> So you need to get that sequence... might be tricky as above that line
> >> there's the subsetting that also needs the correct codes. This is not a
> >> change that will be done within a few hours.
> >>
> >> Tilman
> >>
> >>
> >>
> >>>
> >>>
> >>> Thanks,
> >>> Claudius
> >>>
> >>> On Tue, Jan 17, 2017 at 7:18 AM, Tilman Hausherr <THausherr@t-online.de>
> >>> wrote:
> >>>
> >>>> Am 15.01.2017 um 20:04 schrieb Claudius Teodorescu:
> >>>>
> >>>>> It is not a big deal, but it works for an AWT component, though it is
> >>>>> not related to that:
> >>>>>
> >>>>>           String s = "कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्";
> >>>>>           Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
> >>>>>           FontRenderContext frc = new FontRenderContext(new AffineTransform(), true, true);
> >>>>>
> >>>>>           char[] chars = s.toCharArray();
> >>>>>           GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0, chars.length, 0); // createGlyphVector(frc, s);
> >>>>>
> >>>>>           int length = glyphVector.getNumGlyphs();
> >>>>>
> >>>>>           for (int i = 0; i < length; i++) {
> >>>>>             Shape glyph = glyphVector.getGlyphOutline(i);
> >>>>>             System.out.println(glyphVector.getGlyphCode(i));
> >>>>>           }
> >>>>>
> >>>>> Any pointers about where I can hook this in PDFBox?
> >>>>>
> >>>> Problem is we don't use the awt fonts anymore.
> >>>>
> >>>> Tilman
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>> Thanks,
> >>>>> Claudius
> >>>>>
> >>>>> On Sun, Jan 15, 2017 at 4:56 PM, Andreas Lehmkuehler <andreas@lehmi.de>
> >>>>> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>>
> >>>>>> Am 15.01.2017 um 15:51 schrieb Claudius Teodorescu:
> >>>>>>
> >>>>>> Hi,
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks for the answer, Tilman.
> >>>>>>>
> >>>>>>> I managed to get the Devanagari text exactly as it should be, by using
> >>>>>>> java.awt.Font.layoutGlyphVector().
> >>>>>>>
> >>>>>>> Is there any chance to write a GlyphVector to a PDFBox page?
> >>>>>>>
> >>>>>> There was a discussion at [1] about using GlyphVector, but we didn't make
> >>>>>> any decision nor did we implement anything.
> >>>>>>
> >>>>>> Do you mind sharing some of your code as a possible starting point?
> >>>>>>
> >>>>>> BR
> >>>>>> Andreas
> >>>>>>
> >>>>>> [1] https://issues.apache.org/jira/browse/PDFBOX-3550
> >>>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>>>
> >>>>>>> Claudius
> >>>>>>>
> >>>>>>> On Sat, Jan 14, 2017 at 9:45 AM, Tilman Hausherr <THausherr@t-online.de>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>>> This is not supported, sorry. PDFBox just outputs the glyphs for the
> >>>>>>>> single characters and does not substitute ligatures.
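To make "single characters" concrete, the short sketch below lists the code points behind one conjunct from the text, त्त्व. The class and method names are illustrative only. The conjunct is stored as five separate code points, and a renderer that maps each one to its own glyph without a substitution step will never produce the combined form.

```java
public class ConjunctCodePoints {

    /** Formats each code point of the text as U+XXXX, separated by spaces. */
    public static String describe(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // TA, VIRAMA, TA, VIRAMA, VA written as escapes so the file encoding does not matter.
        String conjunct = "\u0924\u094D\u0924\u094D\u0935";
        System.out.println(describe(conjunct));
        // prints "U+0924 U+094D U+0924 U+094D U+0935"
    }
}
```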
> >>>>>>>>
> >>>>>>>> Tilman
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Am 14.01.2017 um 08:44 schrieb Claudius Teodorescu:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>>> I am using PDFBox 2.0.4, and I am trying to output a PDF document
> >>>>>>>>> with the following Devanagari text: कारणत्त्वङ्गवाश्वादीनमपीति चेत्
> >>>>>>>>> युक्तम्.
> >>>>>>>>>
> >>>>>>>>> The code is very simple:
> >>>>>>>>>       @Test
> >>>>>>>>>       public void testPdfBox() throws IOException {
> >>>>>>>>>           PDDocument document = new PDDocument();
> >>>>>>>>>           PDPage page = new PDPage();
> >>>>>>>>>           document.addPage(page);
> >>>>>>>>>
> >>>>>>>>>           PDFont font = PDType0Font.load(document,
> >>>>>>>>>                   new File("/home/claudius/workspace
> >>>>>>>>> s/repositories/backup/fonts/Sanskrit2003.ttf"));
> >>>>>>>>>
> >>>>>>>>>           PDPageContentStream contentStream = new PDPageContentStream(document, page);
> >>>>>>>>>
> >>>>>>>>>           contentStream.beginText();
> >>>>>>>>>           contentStream.setFont(font, 12);
> >>>>>>>>>           contentStream.moveTextPositionByAmount(100, 700);
> >>>>>>>>>           contentStream.showText("कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्");
> >>>>>>>>>           contentStream.endText();
> >>>>>>>>>
> >>>>>>>>>           // Make sure that the content stream is closed:
> >>>>>>>>>           contentStream.close();
> >>>>>>>>>
> >>>>>>>>>           // Save the results and ensure that the document is properly closed:
> >>>>>>>>>           document.save("target/" + name.getMethodName() + ".pdf");
> >>>>>>>>>           document.close();
> >>>>>>>>>       }
> >>>>>>>>>
> >>>>>>>>> The output PDF file (attached) does not render the string correctly,
> >>>>>>>>> as it appears above. Namely, the ligatures are not displayed, as if
> >>>>>>>>> they do not exist. On the other hand, if I copy the text from the PDF
> >>>>>>>>> file and paste it into Eclipse, it shows perfectly.
> >>>>>>>>>
> >>>>>>>>> I checked the PDF output with Evince, Firefox, and Adobe Reader 9 on
> >>>>>>>>> Ubuntu.
> >>>>>>>>>
> >>>>>>>>> Any idea on how to fix this display issue?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Claudius
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> http://kuberam.ro
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ---------------------------------------------------------------------
> >>>>>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> >>>>>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> > --
> > http://kuberam.ro
> >
> >
>
>
>


-- 
http://kuberam.ro
