From: Zilverline info
To: Lucene Users List
Date: Thu, 19 Aug 2004 12:22:09 +0200
Subject: Re: searchhelp
Message-ID: <41247F51.9020806@zilverline.org>

The PDF and Word stuff has been done too: have a look at http://www.zilverline.org.

Michael Franken

Chandan Tamrakar wrote:

> For PDF you need to extract the text from the PDF files using the PDFBox library, and
> for Word documents you can use the Apache POI APIs. There are messages
> posted on the Lucene list related to your queries. About databases, I guess
> someone must have done it. :)
>
> ----- Original Message -----
> From: "Santosh"
> To:
> Sent: Thursday, August 19, 2004 3:58 PM
> Subject: searchhelp
>
> Hi,
>
> I am using the Lucene search engine for my application.
>
> I am able to search through text files and HTML files as specified by Lucene.
>
> Can you please clarify my doubts?
>
> 1. Can Lucene search through PDFs and Word documents? If yes, then how?
>
> 2. Can Lucene search through a database? If yes, then how?
>
> Thank you,
>
> Santosh
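
The approach in the quoted reply (first extract plain text from each PDF, Word document, or database row, then index that text) can be sketched roughly as below. This is only an illustration under stated assumptions: `TextExtractor`, `ToyIndex`, and `ExtractThenIndexSketch` are hypothetical names, the extractor is a placeholder for whatever PDFBox or POI would provide, and the in-memory index stands in for a real Lucene index. It does not use the actual PDFBox, POI, or Lucene APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.*;

/** Hypothetical stand-in for a format-specific extractor (PDF, Word, ...). */
interface TextExtractor {
    String extract(byte[] rawDocument);
}

/** Toy in-memory inverted index; a placeholder for a real Lucene index. */
class ToyIndex {
    // term -> ids of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenizes the already-extracted text and records each term for docId. */
    void add(String docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of all documents containing the term (case-insensitive). */
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}

class ExtractThenIndexSketch {
    public static void main(String[] args) {
        // Placeholder extractor: simply decodes the bytes as UTF-8 text.
        // A real one would delegate to a PDF or Word parsing library.
        TextExtractor extractor = raw -> new String(raw, StandardCharsets.UTF_8);

        ToyIndex index = new ToyIndex();

        // Step 1: extract text from the source document. Step 2: index the text.
        byte[] fakePdf = "Quarterly sales report".getBytes(StandardCharsets.UTF_8);
        index.add("report.pdf", extractor.extract(fakePdf));

        // A database row can be handled the same way: turn it into text, then index it.
        index.add("customers/42", "Acme Corp Amsterdam sales contact");

        System.out.println(index.search("sales"));
    }
}
```

The point of the sketch is that the index only ever sees plain text plus an identifier, so the source format (PDF, Word, database) matters only at the extraction step.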

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org