perl-docs-dev mailing list archives

From Stas Bekman <s...@stason.org>
Subject Re: adding search?
Date Sat, 26 Jan 2002 16:00:27 GMT

>>If you remember, we've been through this discussion before, when we were 
>>looking for a search engine for the guide. These two, nextrieve 
>>and swish-e, were found to be the best options:
>>
> 
>>The main criterion was being able to search for Perl code. Well, you 
>>remember this, right? Or we could dig up the thread from a year ago or so.
>>
> 
> I remember the discussion we had.  You asked me to get the swish config
> file from Randy and IIRC, it was just a standard setup.


Yup.


> With swish, you define at indexing time what makes up a word.  Text is a
> lot easier, of course, than code, especially if people use different coding
> styles.
> 
> We could create a second index that uses white space only to separate
> words, which might make searching perl code a bit easier.  It would be
> helpful to see what kind of things to search for.
> 
> But then if you were looking for $| you could find "$| = 1;" but not "$|++".


That's not good, then.
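
To make the quoted `$|` case concrete, here is a minimal sketch in Python (not swish-e itself) of what a whitespace-only word definition does to the two statements. The function names `whitespace_tokens` and `index_matches` are hypothetical, and the exact-token lookup is only an assumed model of how a word index answers a query:

```python
def whitespace_tokens(text):
    """Split on whitespace only, the word definition proposed in the quote."""
    return text.split()


def index_matches(query, line):
    """Assumed model of a word index: a query matches whole tokens only."""
    return query in whitespace_tokens(line)


for line in ["$| = 1;", "$|++"]:
    # Show the tokens each line produces and whether "$|" is one of them.
    print(repr(line), whitespace_tokens(line), index_matches("$|", line))
```

In the second statement `$|++` is a single token, so a lookup for `$|` misses it under this model.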


> Or, perhaps, have a mode that simply uses a perl regular expression and does
> a brute-force grep search.  Slow, but the site is not that large, especially
> if it were limited to just the docs section.


I think you underestimate the size of the site:

% find src -name "*pod" | xargs du -c |grep total
3172    total
% find src -name "*pod" | wc -l
     134

So we have about 3MB of source in 134 files (and more likely 6MB, with 
200+ files, once the 2.0 docs are done). Do you think it's possible to 
grep through that with a reasonable response time? Remember that there 
will be a lot of I/O from opening and closing many files.
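
One way to answer the response-time question is to measure it. The following is a rough sketch in Python of the brute-force approach, assuming the `src` directory and the "name ends in pod" match from the `find` commands above. `grep_tree` is a hypothetical helper, and Python's `re` module stands in for a Perl regular expression here, so the pattern syntax is close to Perl's but not identical:

```python
import os
import re
import time


def grep_tree(root, pattern, suffix="pod"):
    """Return (path, line number, line) for each line matching pattern.

    Scans every file under root whose name ends with suffix, mirroring
    find's -name "*pod". Files are read line by line, not all at once.
    """
    regex = re.compile(pattern)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on odd encodings.
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if regex.search(line):
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    start = time.perf_counter()
    hits = grep_tree("src", r"\$\|")  # "src" is the tree measured above
    elapsed = time.perf_counter() - start
    print(f"{len(hits)} matching lines in {elapsed:.3f}s")
```

Because it matches substrings rather than indexed words, a search for `\$\|` finds both "$| = 1;" and "$|++". Whether the elapsed time is acceptable would have to be checked on the real tree.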


> All the reverse indexing engines will parse on indexing, so it will always
> be an issue of defining what makes up a word.
> 
> Let me ask Avi Rappoport if there's something good for searching code.

I think that Randy's setup was quite satisfactory, but nextrieve was even 
better. What do you think about nextrieve?

_____________________________________________________________________
Stas Bekman             JAm_pH      --   Just Another mod_perl Hacker
http://stason.org/      mod_perl Guide   http://perl.apache.org/guide
mailto:stas@stason.org  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/

