From: petite_abeille
To: "Lucene Users List"
Subject: Re: indexing PDF files
Date: Fri, 3 May 2002 16:57:46 +0200
Message-Id: <2144BA73-5EA6-11D6-8199-000393760B7E@mac.com>

On Friday, May 3, 2002, at 03:16 PM, Moturu,Praveen wrote:

> Can I assume none of the people on the lucene user group had
> implemented indexing a pdf document using lucene.

Who knows...?!? In any case, it's not public knowledge...

> If some one has.. Please help me by providing the solution.

I used to believe in Santa Claus also... ;-)

All that said, there seems to be a real demand to do something about pdf to text conversion (in java preferably). I'm willing to invest some time and brain cells to nail it down, but I'm not sure where to start...
I'm aware of the PJ library, but it's really a pig as far as resources go. Anything else? Any (concrete) pointer appreciated.

Thanks.

PA.