pdfbox-users mailing list archives

From Brzrk One <brz...@gmail.com>
Subject Re: TXT2PDF
Date Tue, 29 Jul 2014 16:49:25 GMT
You might consider putting the loop inside a Java main().
Not only are you paying for the command substitution inside the 'read line'
loop (the backticks spawn a subshell and a cut process for every file),
you are parsing 'line' twice, and
you are paying the Java startup/teardown cost for each line.
You could have your main() do all of this for you.
Then, of course, you could run this on multiple machines, or in multiple
processes, etc, etc.
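
A minimal sketch of that idea, assuming Java 8 or later. The class name
BatchTxtToPdf, the Converter hook and the helper names are made up for
illustration and are not part of PDFBox. The actual PDFBox call is left as a
commented placeholder, since the exact TextToPDF API should be checked against
the 1.8.6 Javadoc. Note that the sketch strips only the last extension, which
differs from `cut -d '.' -f 1` for names containing more than one dot.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BatchTxtToPdf {

    /** Hypothetical hook: converts one text file into one PDF file. */
    interface Converter {
        void convert(Path txt, Path pdf) throws IOException;
    }

    /** Replaces the last extension of the file name with ".PDF". */
    static String pdfNameFor(String txtName) {
        int dot = txtName.lastIndexOf('.');
        String base = dot > 0 ? txtName.substring(0, dot) : txtName;
        return base + ".PDF";
    }

    /** Converts every *.TXT file in txtDir inside this one JVM; returns the count. */
    static int convertAll(Path txtDir, Path pdfDir, Converter converter) throws IOException {
        Files.createDirectories(pdfDir);
        int converted = 0;
        // The glob replaces the "ls | grep .TXT" listing step of the script.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(txtDir, "*.TXT")) {
            for (Path txt : files) {
                Path pdf = pdfDir.resolve(pdfNameFor(txt.getFileName().toString()));
                converter.convert(txt, pdf);
                converted++;
            }
        }
        return converted;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BatchTxtToPdf <txt-dir> <pdf-dir>");
            System.exit(1);
        }
        // Placeholder (assumption): plug the PDFBox conversion in here. In
        // PDFBox 1.8.x this would probably be along the lines of
        //   new TextToPDF().createPDFFromText(reader)
        // followed by save() and close() on the returned PDDocument.
        Converter converter = (txt, pdf) -> {
            throw new UnsupportedOperationException("plug the PDFBox call in here");
        };
        int n = convertAll(Paths.get(args[0]), Paths.get(args[1]), converter);
        System.out.println("Converted " + n + " files");
    }
}
```

Once the placeholder is filled in, something like
`java -cp pdfbox-app-1.8.6.jar:. BatchTxtToPdf $TxtFilesDir $PdfFilesDir`
would start the JVM once for the whole directory instead of once per file.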


On Tue, Jul 29, 2014 at 12:13 PM, Daniel Gibby <dgibby@edirectpublishing.com> wrote:

> It sounds to me that converting 2000 files in an hour is pretty good...
> 1.8 seconds per file.
>
> My suggestion is put the files on more than one computer and run them
> simultaneously. If you have a million files, you know it is going to take a
> long time to create PDFs out of them.
> You'll save much more time by splitting up the load into multiple
> computers than you will with fiddling with anything below.
>
> Thanks,
> Daniel Gibby
>
>
> On 7/29/2014 9:15 AM, Basharat Ali wrote:
>
>> Hi,
>> I am using the PDFBox utility to convert TXT files to PDF. I have
>> developed the script below:
>>
>> echo " Remove Old TXT File List " >> $LogFileDir/ConvertTxtToPdf.log
>> rm $ConversionScriptDir/TxtFileList.out
>> echo " Remove Old PDF File List " >> $LogFileDir/ConvertTxtToPdf.log
>> rm $ConversionScriptDir/PDFFileslist.out
>> echo " Make List of TXT Files we are going to convert to PDF " >> $LogFileDir/ConvertTxtToPdf.log
>> ls -a $TxtFilesDir|grep .TXT > $ConversionScriptDir/TxtFileList.out
>> echo " TXT File Listing is Complete " >> $LogFileDir/ConvertTxtToPdf.log
>> echo " Reading TXT File Listing " >> $LogFileDir/ConvertTxtToPdf.log
>> touch $ConversionScriptDir/PDFFileslist.out
>> while read line;
>> do
>>       PDFOutFile=`echo $line|cut -d '.' -f 1`
>>       java -jar $PdfConvertorDir/pdfbox-app-1.8.6.jar TextToPDF $PdfFilesDir/$PDFOutFile.PDF $TxtFilesDir/$line
>>       echo " TXT File Converted to PDF = $line " >> $ConversionScriptDir/PDFFileslist.out
>> done < $ConversionScriptDir/TxtFileList.out
>> echo " All TXT to PDF Conversion is completed successfully. Please verify the PDF Files at:: $PdfFilesDir "
>>
>>
>> This is taking about 1 hour to convert 2000 files. I have about 1 million
>> such files, so it will take about 500 hours. Is there a quicker way to
>> convert the TXT files to PDF?
>> Thanks
>> Bash
>>
>>
>>
>
