lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Lucene and Struts
Date Fri, 12 Sep 2003 20:20:54 GMT
On Friday, September 12, 2003, at 02:52  PM, Robert Taylor wrote:
> I agree that this makes indexing rather straight forward, but....
> then I have to build/use a content management system for my
> existing web application(s). That's not going to fly for me right now.
> Maybe in the future - but for now, I need to provide content-searching
> capability for my web applications.

You can roll your own poor-man's content management system that would 
fit into your web apps just fine.  If the content you truly want 
indexed is one big block of text, for example, just make .txt files for 
each block and then <jsp:include> it where appropriate.  Sure, this 
will involve some connection between the URL and the .txt file.  
Indexing with Lucene then becomes trivial.

> Why is indexing the site using a crawler not such a good idea for web
> applications
> where 99% of the content is static?

I just think it is more of a last resort type of solution, but 
certainly it'll work.

>  If the content changes, you just rebuild
> the indexes. This solution works if I have a legacy web site or one 
> deployed
> as a web application. It works if my pages are composite or atomic. It 
> works
> if
> my content is dynamic or static.

Sure.  But you might also be missing metadata, if that is an important 
piece.  Who is the author of the content?  When was it created?  etc.  
This is all information that you'd lose by crawling, unless you encode 
that with meta tags somehow, or can parse it out.

	Erik


Mime
View raw message