lucene-java-user mailing list archives

From "Prasad KVSH" <Prasad.Kokep...@ness.com>
Subject RE: lucene-3.0.3
Date Wed, 01 Feb 2012 13:51:17 GMT
Hi Karthik,

I appreciate your quick response.

I guess the next question is how to strip the text from
PDF/HTML/XML/MS Word/PPT/XLS files, and where the extracted text is
stored for indexing.

In what other scenarios (such as adding or deleting files) do
we need to run IndexFiles.class again?

Thanks
Prasad

-----Original Message-----
From: KARTHIK SHIVAKUMAR [mailto:nskarthik.k@gmail.com] 
Sent: Wednesday, February 01, 2012 7:04 PM
To: java-user@lucene.apache.org
Subject: Re: lucene-3.0.3

Hi

>>lucene-3.0.3 can be used for searching a text from

Lucene's primary job is text search.

The source may be PDF/HTML/XML/MS Word/PPT/XLS.

You need plugin code that does two things:

1) Strip the text from each document (PDF/HTML/XML/MS Word/PPT/XLS)
2) Index the extracted text using Lucene

The resulting index can later be used to search through the required
content.
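
A minimal sketch of those two steps, in plain Java with no Lucene dependency, just to show the shape of the pipeline. Everything here is an assumption for illustration: `ExtractThenIndex`, `stripTags`, `addDocument` and `search` are hypothetical names, the tag stripper only copes with simple HTML, and the map is a toy stand-in for a real Lucene index. In practice a text-extraction library (Apache Tika is one option) would handle PDF and Office formats, and the extracted string would be added to a Lucene document instead.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration of "extract text, then index it". This is NOT Lucene's API.
class ExtractThenIndex {

    // Step 1 (simplified assumption): remove markup so only plain text remains.
    // PDF, Word, Excel etc. need a dedicated parser library instead of a regex.
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    // Step 2 (simplified assumption): record which documents contain each word.
    static void addDocument(Map<String, Set<String>> index, String docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Look up the documents that contain a single word (case-insensitive).
    static Set<String> search(Map<String, Set<String>> index, String word) {
        Set<String> hits = index.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<>() : hits;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> index = new HashMap<>();
        addDocument(index, "page.html",
                stripTags("<html><body><p>Lucene text search</p></body></html>"));
        addDocument(index, "notes.txt", "Plain text needs no stripping");
        System.out.println(search(index, "text"));
    }
}
```

Searching for "text" here finds both documents, because the word survives the stripping step for the HTML page and is already plain in the text file. The point is only that extraction and indexing are separate steps: the indexer never sees the original file format.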

;)
with regards
karthik


On Wed, Feb 1, 2012 at 6:37 PM, Prasad KVSH
<Prasad.Kokepudi@ness.com> wrote:

> Hi,
>
>
>
> lucene-3.0.3 can be used to search text in PDF, xlsx, docx,
> doc, xls, msg, and TXT files. Is there a common function to
> accomplish this? Please help me with this.
>
>
>
> Thanks
>
> Prasad
>
>
>
>


--
*N.S.KARTHIK
R.M.S.COLONY
BEHIND BANK OF INDIA
R.M.V 2ND STAGE
BANGALORE
560094*


