pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Regarding: How To extract Hindi Language Text from PDF File Or Other Indian Language Text from PDF File
Date Wed, 12 Apr 2017 16:57:35 GMT
Am 12.04.2017 um 06:12 schrieb ajit.more@atishay.com:
> Dear Sir,
>
> I am having a problem extracting Hindi text from a PDF file.
>
> Actually, I want to extract text in any Indian language from a PDF.
>
> I am using the pdfbox library for .NET. I am doing this in .NET.
>
> So please let me know of a solution for extracting text in any Unicode
> language from a PDF.
>

Be aware that the .NET ports are unofficial and probably outdated. Here
are some examples for text extraction:
https://pdfbox.apache.org/1.8/cookbook/textextraction.html
https://stackoverflow.com/questions/23813727/how-to-extract-text-from-a-pdf-file-with-apache-pdfbox
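
Once you have the extracted text as a Java String (for example from
PDFTextStripper.getText(...) in PDFBox), a quick sanity check can help
show whether Hindi actually came through. The following is only a rough
sketch using the standard Java library: the class and method names
(HindiTextCheck, containsDevanagari, saveAsUtf8) are made up for
illustration and are not part of PDFBox, and the sample string stands in
for real extraction output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of PDFBox.
final class HindiTextCheck {

    // True if at least one code point in the text belongs to the Devanagari script.
    static boolean containsDevanagari(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp)
                        == Character.UnicodeScript.DEVANAGARI);
    }

    // Writes the text as UTF-8 so Devanagari characters are not lost
    // to a platform default encoding that cannot represent them.
    static void saveAsUtf8(String text, Path target) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in real use this string would be the text extracted from the PDF.
        String extracted = "नमस्ते PDFBox";
        System.out.println("Contains Devanagari: " + containsDevanagari(extracted));

        Path out = Files.createTempFile("extracted", ".txt");
        saveAsUtf8(extracted, out);
        System.out.println("Saved to " + out);
    }
}
```

If the check fails for a PDF that visibly contains Hindi, the problem is
likely in the PDF itself (for example, fonts without usable Unicode
mappings) rather than in how the output is written.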

Tilman

PS: please do not subscribe to the dev list. Please do not post to the 
commits list. (Where did you get that idea?!) And please subscribe to 
the user list, or you won't see this answer.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org

