perl-docs-dev mailing list archives

From Stas Bekman <>
Subject Re: adding search?
Date Sat, 26 Jan 2002 16:00:27 GMT

>>If you remember, we've been through this discussion before, when we were 
>>looking for a search engine for the guide. These two, nextrieve 
>>and swish-e, were found to be the best options.
>>The main criterion was being able to search for perl code. You 
>>remember this, right? Or we could dig up the thread from a year ago or so.
> I remember the discussion we had.  You asked me to get the swish config
> file from Randy and IIRC, it was just a standard setup.


> With swish, you define at indexing time what makes up a word.  Text is a
> lot easier, of course, than code, especially if people use different coding
> styles.
> We could create a second index that uses white space only to separate
> words, which might make searching perl code a bit easier.  It would be
> helpful to see what kind of things to search for.
> But then if you were looking for $| you could find "$| = 1;" but not "$|++".

That's not good, then.
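
To make the limitation concrete, here is a minimal sketch of a whitespace-only index. It is not how swish-e is implemented; the function names and the two sample lines are made up for illustration. It only shows why an exact-token lookup for `$|` would hit "$| = 1;" but miss "$|++".

```python
def whitespace_tokens(text):
    """Split text on runs of whitespace only."""
    return text.split()


def build_whitespace_index(lines):
    """Map each whitespace-delimited token to the line numbers containing it."""
    index = {}
    for lineno, line in enumerate(lines, start=1):
        for token in whitespace_tokens(line):
            index.setdefault(token, set()).add(lineno)
    return index


# Hypothetical sample lines, not taken from the guide.
SAMPLE_LINES = [
    "$| = 1;",  # '$|' is surrounded by whitespace, so it is its own token
    "$|++;",    # '$|' is fused with '++;' into the single token '$|++;'
]

ws_index = build_whitespace_index(SAMPLE_LINES)

# Exact-token lookup: only line 1 is indexed under '$|'.
print(sorted(ws_index.get("$|", set())))
# Line 2 is reachable only through the fused token.
print(sorted(ws_index.get("$|++;", set())))
```

A prefix or wildcard search over the token list could recover the second line, but that depends on what the engine supports and is not shown here.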

> Or, perhaps, have a mode that simply uses a perl regular expression and does
> a brute-force grep search.  Slow, but the site is not that large, especially
> if it were limited to just the docs section.

I think you underestimate the size of the site:

% find src -name "*pod" | xargs du -c |grep total
% find src -name "*pod" | wc -l

So we have about 3MB of source in 134 files (and more likely 6MB in 
200+ files once the 2.0 docs are done). Do you think it's possible to 
grep through that with a reasonable response time? Remember that 
there will be a lot of IO for opening and closing many files.
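
One way to answer the response-time question is to measure it. The following is a rough sketch of such a brute-force scan, assuming the same `src` tree and the same "name ends in pod" match as the `find` commands above; the function names are hypothetical, and no timings are claimed here.

```python
import os
import re
import time


def grep_pod(root, pattern):
    """Scan every file under root whose name ends in 'pod'.

    Returns a list of (path, line number, line text) for each matching line.
    """
    regex = re.compile(pattern)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith("pod"):
                continue
            path = os.path.join(dirpath, name)
            # Each file is opened and closed once per query: this is the IO cost.
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if regex.search(line):
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits


def timed_grep(root, pattern):
    """Run grep_pod and return (hits, elapsed seconds)."""
    start = time.perf_counter()
    hits = grep_pod(root, pattern)
    return hits, time.perf_counter() - start


if __name__ == "__main__":
    # Search for the literal '$|'; both characters must be escaped in a regex.
    hits, elapsed = timed_grep("src", r"\$\|")
    print(f"{len(hits)} matching lines in {elapsed:.3f}s")
```

The first run after a reboot and later runs can differ a lot because of the OS file cache, so it would be worth timing both.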

> All the reverse indexing engines will parse on indexing, so it will always
> be an issue of defining what makes up a word.
> Let me ask Avi Rappoport if there's something good for searching code.
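
As an illustration of how much the word definition matters, here is a small sketch of an inverted index whose tokenizer knows a little about Perl variables. The token pattern is an assumption made up for this example; it is not a swish-e or nextrieve setting, and it covers only a fraction of Perl syntax.

```python
import re

# Hypothetical token pattern, tried in this order:
#   1. sigil + identifier, e.g. $count, @list, %ENV, $_
#   2. '$' + one punctuation character, e.g. $| or $/
#   3. a plain run of word characters
TOKEN = re.compile(r"[$@%][A-Za-z_]\w*|\$[^\w\s]|\w+")


def build_code_index(docs):
    """Map each token to the set of document ids it occurs in."""
    index = {}
    for doc_id, text in docs.items():
        for token in TOKEN.findall(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents containing the exact token."""
    return sorted(index.get(term, set()))


# Hypothetical snippets, keyed by made-up ids.
DOCS = {
    "a": "$| = 1;",
    "b": "$|++;",
    "c": "my $count = 0;",
}

code_index = build_code_index(DOCS)
print(search(code_index, "$|"))      # both 'a' and 'b' are indexed under '$|'
print(search(code_index, "$count"))
```

With this definition "$|++" is split into the token `$|` plus discarded punctuation, so both forms are found, at the cost of no longer being able to search for `++` itself.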

I think that Randy's setup was quite satisfactory, but nextrieve was even 
better. What do you think about nextrieve?

Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker      mod_perl Guide
