lucene-java-user mailing list archives

From Peter Carlson <carl...@bookandhammer.com>
Subject Re: Indexing options
Date Wed, 16 Oct 2002 13:50:21 GMT
Since you want the rendered JSP pages, I would recommend using a 
crawler to fetch the pages and then indexing the resulting files with 
Lucene. The .jsp vs. .html file extension doesn't matter. You can index 
database content, XML files, HTML files, or any text data you want. 
However, you have to use or create a way to get that text data into a 
Lucene Document.
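
As a rough illustration of that last step, here is a minimal sketch in plain Java of turning one rendered page into named text values that could then be copied into a Lucene Document. Everything in it is an assumption made for illustration: the class name RenderedPageFields, the field names "url", "title" and "contents", and the crude regex-based tag stripping (a real setup would more likely use a proper HTML parser or a crawler such as LARM). It deliberately stops short of calling Lucene, since the exact Document and Field calls depend on the Lucene version in use.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: converts a rendered page into field-name/value pairs.
class RenderedPageFields {

    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
        "<(script|style)[^>]*>.*?</\\1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Fetch the page over HTTP so the servlet container renders the JSP first.
    // Assumes the response is UTF-8; a real crawler would honor the declared charset.
    static String fetch(String pageUrl) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                URI.create(pageUrl).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    // Return the trimmed contents of the first <title> element, or "" if none.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    // Drop script/style blocks, strip the remaining tags, collapse whitespace.
    static String extractText(String html) {
        String noScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String noTags = TAG.matcher(noScripts).replaceAll(" ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    // Illustrative field names; each entry would become one field of a Document.
    static Map<String, String> toFields(String pageUrl, String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", pageUrl);
        fields.put("title", extractTitle(html));
        fields.put("contents", extractText(html));
        return fields;
    }
}
```

Calling RenderedPageFields.toFields(url, RenderedPageFields.fetch(url)) for each crawled URL gives one map per page. Each entry would then be added to a Lucene Document as a Field, with the identifier-like "url" value typically stored untokenized and "title" and "contents" analyzed as text.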

LARM, in the Lucene Sandbox area, is a crawler that integrates with 
Lucene. I haven't used it yet, but others have and it sounds great.

I hope this helps.

--Peter

On Tuesday, October 15, 2002, at 11:37 PM, Richard Doorduin wrote:

>
> Thanks Peter,
>
> I know that a JSP is a dynamic page; that is the problem.
> But a JSP also contains static information, usually HTML.
> When HTML files are indexed, it also creates some keywords for the link.
> Is that possible for JSP files too?
>
> And should I write my own Java application to index JSPs,
> or is it also possible to use the current application?
>
> Thanks for helping,
>
> Richard
>
>
> -----Original Message-----
> From: Peter Carlson [mailto:carlson@bookandhammer.com]
> Sent: Wednesday, 16 October 2002 8:06
> To: Lucene Users List
> Subject: Re: Indexing options
>
>
> Lucene will index anything you ask it to index.
>
> In order to index content, you have to create a Lucene Document object.
> A Lucene Document object is made up of Fields. Each Field contains data.
>
> The HTML indexing example parses the HTML text and creates the
> appropriate Lucene Documents. If you have other content you want to
> index, just follow the same example.
>
> Your question is a little strange: do you want Lucene to index the
> dynamic (rendered) page, or are you asking to index the raw JSP file?
>
> I hope this helps a little.
>
> --Peter
> On Monday, October 14, 2002, at 11:53 PM, Richard Doorduin wrote:
>
>>> Dear developers@jakarta.apache,
>>>
>>> I'm looking forward to seeing the new version of Lucene come out.
>>> It really is a handy tool.
>>>
>>> But the reason I wrote this e-mail is that I couldn't figure
>>> something out.
>>> At the moment Lucene only indexes .html pages.
>>> But what "we" Tomcat users would also like to index are *.jsp files.
>>>
>>> I couldn't figure out where to define these extensions.
>>> Or is it not yet possible to do that?
>>>
>>> Thanks for listening!
>>>
>>> Greetings,
>>>
>>>
>>> Richard Doorduin.
>>
>> --
>> To unsubscribe, e-mail:
>> <mailto:lucene-user-unsubscribe@jakarta.apache.org>
>> For additional commands, e-mail:
>> <mailto:lucene-user-help@jakarta.apache.org>
>>
>>
>
>

