pdfbox-users mailing list archives

From "Hesham Gneady" <heshamgne...@gmail.com>
Subject RE: Extract bold text from a PDF file
Date Mon, 18 Mar 2019 11:12:34 GMT
I have hundreds of PDF files in use!

There must be some property in my attached PDF file that causes the bold font, not just
the font used. I see properties like ForceBold(), but that is set to false too. Is there
something like that I could check?
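
As a rough illustration of checking several such properties together, here is a minimal sketch in plain Java. The class name BoldSignals, the method looksBold, the 700 weight threshold and the fill-plus-stroke rule are all assumptions made for this example, not PDFBox API. The arguments stand for the values printed by the checks in the quoted message, plus the ForceBold flag. For the sample file every one of those signals is negative, so this sketch would not flag it either; it only shows how the checks could be combined.

```java
// Hypothetical helper, not part of PDFBox: combines descriptor and
// graphics-state values into one "probably bold" decision.
final class BoldSignals {

    // Assumption: a font weight of 700 or more is treated as bold.
    static final float BOLD_WEIGHT = 700f;

    static boolean looksBold(float fontWeight, boolean forceBold,
                             String renderingMode, float lineWidth) {
        // Signal 1: the font descriptor declares a heavy weight.
        if (fontWeight >= BOLD_WEIGHT) {
            return true;
        }
        // Signal 2: the ForceBold flag is set.
        if (forceBold) {
            return true;
        }
        // Signal 3 (assumption): glyphs are filled and also stroked with a
        // non-zero line width, which thickens regular glyphs.
        return "FILL_STROKE".equals(renderingMode) && lineWidth > 0f;
    }
}
```

With the values reported for the sample (weight 0.0, ForceBold false, rendering mode FILL, line width 1.0), looksBold returns false.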

 

 

Best regards,

Hesham 

 

--------------------------------------------------------------------------------------------------

Included Message:

 

Instead of a partial match for the name you could compile a list of all the names of the bold
variants of your fonts, and then compare the font name to that list.
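
A minimal sketch of that list-based comparison, in plain Java. The class name BoldFontNames and its methods are hypothetical, and the only list entry taken from this thread is "frutigernextlt-heavycn"; the rest of the list would have to be filled in by hand. The sketch assumes font names may carry a subset prefix (six uppercase letters and a plus sign) and strips it before the lookup.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: exact-match lookup of known bold font names.
final class BoldFontNames {

    // Extend this set with the bold variants used in your own files.
    private static final Set<String> BOLD_FONT_NAMES = Set.of(
            "frutigernextlt-heavycn"
    );

    // Drops a subset prefix such as "ABCDEF+" and lower-cases the name.
    static String normalize(String fontName) {
        if (fontName == null) {
            return "";
        }
        return fontName.replaceFirst("^[A-Z]{6}\\+", "")
                       .toLowerCase(Locale.ROOT);
    }

    // True only when the whole normalized name is on the list.
    static boolean isBold(String fontName) {
        return BOLD_FONT_NAMES.contains(normalize(fontName));
    }
}
```

Because the match is on the whole name rather than a fragment such as "heavy", a non-bold font whose name merely contains that word is not flagged unless it is explicitly listed.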

 

On Mon, Mar 18, 2019 at 11:13 AM Hesham Gneady <heshamgneady@gmail.com> wrote:

 

> Hello,
>
> I am trying to extract the bold text from some PDF files, but some fail,
> like this one:
>
> https://www.dropbox.com/s/gh2zwdh3sl3isck/Bold%20Font%20Sample.pdf?dl=0
>
> I am overriding the processTextPosition(...) method to do this, and I
> have tried all these options, but none has worked for me:
>
> 1. if( text.getFont().getFontDescriptor().getFontName().toLowerCase().contains( "bold" ) ) {...}  // returns false.
> 2. if( text.getFont().getName().toLowerCase().contains( "bold" ) ) {...}  // returns false.
> 3. System.out.println( text.getFont().getFontDescriptor().getFontWeight() );  // returns 0.0.
> 4. System.out.println( getGraphicsState().getLineWidth() );  // returns 1.0.
> 5. System.out.println( getGraphicsState().getTextState().getRenderingMode() );  // returns FILL.
>
> Note: The font name for the bold text in the PDF file is
> "frutigernextlt-heavycn". It contains the word "heavy", so I could detect
> it that way, but I don't think this is the right approach, as I have other
> PDF files whose font names contain "heavy" although the text is not bold.
>
> Best regards,
>
> Hesham
