httpd-dev mailing list archives

From Andrew Wilson <>
Subject Re: indexing suggestion
Date Wed, 12 Apr 1995 17:35:57 GMT
> Andy suggested WAIS and glimpse. This is something different -
> the resource owners decide what goes into the index, and how it
> is described (the ALIWEB approach). The index files will typically
> be small

WAIS (definitely - not sure about glimpse) can index anything.  The ideal
solution (if you used WAIS) would be for authors to decide what they wanted
to appear in the robsownformat.idx files in each of their directories.  WAIS
would then index the robsownformat.idx files - *not* the entire *.html space.
Authors can add their own keywords etc, etc, etc.

Authors would still get the final say about what was searchable - it would
*NOT* be an indexing of ALL the files that the server (or server admin)
knew about.
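
A minimal sketch of the idea above, in Python rather than WAIS or Perl, purely for illustration. It walks a document tree, reads only the author-maintained robsownformat.idx files, and searches those; HTML files that no author listed are never consulted. The line format ("filename: keyword keyword ...") and the function names are assumptions made for this example, since the thread does not define the .idx format.

```python
import os

# Filename taken from the message; its contents are NOT specified there.
INDEX_NAME = "robsownformat.idx"


def collect_index_files(root):
    """Return the paths of all author-maintained index files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if INDEX_NAME in filenames:
            found.append(os.path.join(dirpath, INDEX_NAME))
    return sorted(found)


def parse_index(path):
    """Parse one index file.

    Assumed (hypothetical) format, one entry per line:
        filename: keyword keyword ...
    Blank lines, '#' comments and lines without ':' are skipped.
    """
    entries = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or ":" not in line:
                continue
            name, _, words = line.partition(":")
            entries[name.strip()] = words.lower().split()
    return entries


def search(root, keyword):
    """Return the listed files whose author-chosen keywords include keyword."""
    keyword = keyword.lower()
    hits = []
    for idx_path in collect_index_files(root):
        directory = os.path.dirname(idx_path)
        for name, words in parse_index(idx_path).items():
            if keyword in words:
                hits.append(os.path.join(directory, name))
    return sorted(hits)
```

Calling search("/path/to/docroot", "apache") would return only files that some author chose to list under that keyword; anything not mentioned in an .idx file stays unsearchable, which is the "authors get the final say" property.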

> If all of this was expensive, I too would have my doubts, and would
> suggest it be CGI'ed. But it's so easy to bolt on to the existing code,
> and by being based on simple format and searching principles, it shouldn't
> have an impact on server performance.
> w.r.t robots, they could ask for the raw index file and use that to
> build an ALIWEB style of index - one which is far superior to existing
> "grab everything and guess" robot indexing techniques.
> Someone told me the other day that it's pointless just saying "it's easy".
> Implement it and show people how easy it was.  I will have a crack at it
> today.

It's easy.  The server's already doing most of the hard work - directory
hopping, looking for files etc.  The point is: do you want Apache to
hardcode a preference for any given .idx format, when a sexy Perl script
and a decent Makefile (yeah, with WAIS or whatever) can do the same thing?

> robh

[I've done the WAIS thing already Rob, but go for it anyhow]

