jakarta-taglibs-dev mailing list archives

From Abey Mullassery <a...@mullassery.com>
Subject PROPOSAL New Search Tag library
Date Thu, 08 Jul 2004 03:39:36 GMT
I) Motivation

With the huge amount of information available, search is the best way 
to give users quick access to it. A search facility will soon be a 
must-have feature of any website of average size, whether it is 
database backed or a set of HTML/PDF/XML documents.

Hence there is a growing need for a search tag library.


II) Overview

The exact tag names and design still need to be worked out, but the 
basic usage scenarios are:
1. Index
	a. plain text (streamed/ files)
	b. HTML/ XML/ PDF
	c. non-text with meta data (Images/ Flash)

2. Search
	a. use meta-data
	b. ranked results
	c. quick view/ result snippets

3. Managing
	a. Optimize
	b. Crawl
	c. Remove
	d. Update
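
To make the three scenarios above concrete, here is a minimal sketch of 
an in-memory inverted index with index, ranked search, and remove 
operations. It is an illustration only: the class and method names 
(TinySearchIndex, index, search, remove) are hypothetical, ranking is by 
raw term count, and nothing here reflects Lucene's API or the eventual 
tag design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; not part of the proposal and not Lucene's API.
class TinySearchIndex {
    // term -> (document id -> number of occurrences of the term in that document)
    private final Map<String, Map<String, Integer>> postings = new HashMap<>();

    // Lower-case the text and split it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Scenario 1 (Index) and 3d (Update): re-indexing a document replaces its old entry.
    void index(String docId, String text) {
        remove(docId);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    // Scenario 3c (Remove): drop the document from every posting list.
    void remove(String docId) {
        for (Map<String, Integer> docs : postings.values()) {
            docs.remove(docId);
        }
        postings.values().removeIf(Map::isEmpty);
    }

    // Scenario 2b (ranked results): score = total occurrences of the query terms,
    // highest score first, ties broken by document id.
    List<String> search(String query) {
        Map<String, Integer> scores = new HashMap<>();
        for (String term : tokenize(query)) {
            Map<String, Integer> docs =
                    postings.getOrDefault(term, Collections.<String, Integer>emptyMap());
            docs.forEach((doc, count) -> scores.merge(doc, count, Integer::sum));
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return ranked;
    }
}
```

A tag library would presumably wrap operations like these behind JSP 
tags; meta-data search, snippets, crawling and optimization are left out 
of this sketch.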


III) Requirements

The search could be built on an existing library such as:
	a. Lucene
	b. JSearch (license issues??)
Lucene indexes plain text only, so it would also need a set of parsers 
(text extractors) for HTML, PDF, MS Word, MS Excel, etc., and a crawler.
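
One possible way to organize the per-format parsers is a small registry 
that picks a text extractor by file extension before the text is handed 
to the indexer. The sketch below is an assumption about structure only: 
ExtractorRegistry, register, extract and stripHtmlTags are hypothetical 
names, content is passed as a String for simplicity (binary formats such 
as PDF would need byte input), and the regex-based HTML stripping is a 
crude stand-in for a real parser.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; not part of the proposal and not Lucene's API.
class ExtractorRegistry {
    // lower-case file extension -> function turning raw content into plain text
    private final Map<String, Function<String, String>> byExtension = new HashMap<>();

    void register(String extension, Function<String, String> extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Choose the extractor from the file extension and return indexable plain text.
    String extract(String fileName, String rawContent) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Function<String, String> extractor = byExtension.get(ext);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + fileName);
        }
        return extractor.apply(rawContent);
    }

    // Crude HTML extractor: replaces tags with spaces and collapses whitespace.
    // It does not decode entities or skip script/style content.
    static String stripHtmlTags(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }
}
```

New formats could then be supported by registering another extractor, 
without changing the indexing code.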


IV) Commitment

I have just started developing a basic version for my own use. If we 
find it worthwhile to add it to the taglibs sandbox, I could restart 
with a discussion of the usage scenarios, make the design generic, and 
base my development on that feedback, so that I won't have to do it twice.


Let me know your views/ comments.

Abey Mullassery
http://www.mullassery.com


