lucene-java-user mailing list archives

From Miroslaw Milewski <>
Subject Re: pdfbox performance.
Date Thu, 29 Jul 2004 14:19:15 GMT
Ben Litchfield wrote:

  > Different PDFs will exhibit different extraction speeds because of
  > the way that PDF documents are structured.

  Yes, I am aware of that; this is the reason I picked PDFs containing
only text, arranged in one column. Anyway, there are probably lots of
different factors to consider, so the whole benchmark thing was greatly
  All I wanted to find out is whether the extraction speed I
encountered is 'standard' given the system, the API version and my
code. But then, considering the PDF structure and other factors, there
may be no definitive answer.

  > I assume you are using the latest version 0.6.6, could you give 0.6.5 a
  > try and see if you notice faster speeds.

  Oh, yes, I forgot to specify the version. It is 0.6.6. I'll give the 
previous one a try.

	Miroslaw Milewski
