From Dean Gaudet <>
Subject indexing tool...
Date Mon, 30 Mar 1998 01:18:04 GMT
Are there any indexing tools that have the following properties:

- the core executables are installed system-wide, but don't require any
root or special userid usage after that

- users can build their own search engines trivially without asking root
for help

I'm looking for something with instructions to the end user that go
something like this:

    Ok you want to index your whizbang website that sits at  First you have to choose where
    you will store the index database.  It shouldn't be stored in
    your web directories at all; storing it in your home directory is
    probably the best thing.  The create_index command will ask you
    a few questions and do the work:

	% create_index
	URL path to the website to be indexed:

	Filesystem path to store the database:

	I'll create you a search CGI and a sample html search form
	which you can modify to suit your needs.  They'll be
	called "ezsearch.cgi" and "ezsearch.html".  What directory
	would you like me to place them in?
	> public_html

	Ok cool!  I've created all the necessary files.  And I've
	taken the liberty of adding an entry to your crontab which
	will run nightly to update the index.  You can use
	"crontab -e" to remove it if you like, it's the line that
	looks like this:

	17 1 * * * /home/mememe/yippeeyay.index/reindex

I really can't imagine it being any more complex than that.  In this
example, public_html/ezsearch.cgi would just be a wrapper for a globally
installed CGI; it just needs to tell the global CGI where the database is.
And similarly, reindex is just a sh script that invokes the indexing
program with the right arguments.
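
To make the wrapper idea concrete, here is a rough sketch of what a
create_index might write out.  It is only an illustration, not an existing
tool: the paths GLOBAL_CGI and GLOBAL_INDEXER, the --url and --db flags,
and the EZSEARCH_DB environment variable are all made-up names standing in
for whatever the system-wide programs would actually accept.  The sample
ezsearch.html form and the crontab installation are left out.

```python
import shlex
from pathlib import Path

# Hypothetical locations of the system-wide programs (assumptions for
# illustration; the post does not name a real tool).
GLOBAL_CGI = "/usr/local/libexec/ezsearch/search-cgi"
GLOBAL_INDEXER = "/usr/local/bin/ezindex"


def write_executable(path: Path, text: str) -> None:
    """Write a small script and mark it executable (mode 0755)."""
    path.write_text(text)
    path.chmod(0o755)


def create_wrappers(site_url: str, db_dir: Path, cgi_dir: Path) -> str:
    """Create the per-user wrapper scripts.

    Returns the crontab line that would run the nightly reindex.
    Everything is written under directories the user already owns.
    """
    db_dir.mkdir(parents=True, exist_ok=True)
    cgi_dir.mkdir(parents=True, exist_ok=True)

    # reindex: a sh script that invokes the global indexer with this
    # user's arguments (flag names are hypothetical).
    reindex = db_dir / "reindex"
    write_executable(
        reindex,
        "#!/bin/sh\n"
        f"exec {GLOBAL_INDEXER} --url {shlex.quote(site_url)} "
        f"--db {shlex.quote(str(db_dir))}\n",
    )

    # ezsearch.cgi: only tells the global CGI where the database is
    # (via a hypothetical environment variable), then hands over to it.
    write_executable(
        cgi_dir / "ezsearch.cgi",
        "#!/bin/sh\n"
        f"EZSEARCH_DB={shlex.quote(str(db_dir))}\n"
        "export EZSEARCH_DB\n"
        f"exec {GLOBAL_CGI}\n",
    )

    # Same schedule as in the example transcript: 01:17 every night.
    return f"17 1 * * * {reindex}"
```

The point of the sketch is that every file lands in a directory the user
already owns, so the only step that could need root is installing the two
global programs in the first place.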

Am I dreaming?  Am I missing something?  Does this tool exist already?

I've looked at webglimpse ... but... I can't trust it at all, I mean,
come on, here's something it says during install:

    NOTE:  This portion of the install script might not work for all
    HTTP daemons.  Apache 2.x or higher should be supported.

    What is the path to your HTTP daemon's configuration directory
    (containing .conf files)?[/usr/local/etc/httpd/conf]:

Uh hello, what?  Apache 2.x doesn't exist.  And why is it mucking
around in my config files?  Hell, I don't trust Apache to parse virtual
hosts correctly; I'm certainly not going to trust some third party to
do it right.  And it really seems to want me to configure it as root.
No, sorry, I can't see any need for root to be involved in this process
at all except to install some executables system-wide.


