lucene-java-user mailing list archives

From David Spencer <dave-lucene-u...@tropo.com>
Subject Re: Incremental Search experiment with Lucene, sort of like the new Google Suggestion page
Date Sat, 11 Dec 2004 18:09:38 GMT
Chris Lamprecht wrote:

> Very cool, thanks for posting this!  
> 
> Google's feature doesn't seem to do a search on every keystroke
> necessarily.  Instead, it waits until you haven't typed a character
> for a short period (I'm guessing about 100 or 150 milliseconds). 

Ohh, good point - I was wondering how to "cancel" a request that is 
already on its way to the server, and waiting for a pause in typing 
before sending anything is the right way. I'll try to code that in.
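
A minimal sketch of that wait-for-a-pause idea, in Java to match the rest of the Lucene code. It is not how the isearch page or Google does it (in the browser this would be a JavaScript timer); the class name SearchDebouncer, the onKeystroke method, and the callback are all made up for illustration. Each keystroke cancels the previously scheduled search and schedules a new one, so only the last query typed before a pause is ever run.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: runs a search only after typing has paused. */
class SearchDebouncer {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
    private final long delayMillis;          // e.g. 150, per the guess above
    private final Consumer<String> search;   // whatever actually runs the query
    private ScheduledFuture<?> pending;      // the search waiting to fire, if any

    SearchDebouncer(long delayMillis, Consumer<String> search) {
        this.delayMillis = delayMillis;
        this.search = search;
    }

    /** Call on every keystroke with the full text typed so far. */
    synchronized void onKeystroke(String queryText) {
        if (pending != null) {
            // Drop the search scheduled for the previous keystroke.
            pending.cancel(false);
        }
        pending = timer.schedule(
                () -> search.accept(queryText), delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Stop the timer thread when the debouncer is no longer needed. */
    void shutdown() {
        timer.shutdownNow();
    }
}
```

Typing "h", "ha", "hash" quickly and then pausing should run one search for "hash" instead of three. Note this only avoids starting searches; a search that has already started is not interrupted.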

I also realized they're probably not doing searches at all. Instead 
they're going off a DB of query popularity. I wanted to code up 
something generic, based just on term frequency, but I don't think 
it'll be useful. For example, say that in my index (an index of 
javadoc-generated documentation) the user types in "hash". A human 
might guess that they intend "hash map" or "hashmap" or "hash tree", 
but I'm sure other terms are more frequent in my index than "map" and 
"tree". I'm sure "hash java" (or "hash" plus any other frequent, 
non-stop word) occurs more often than "hash map", and it's dubious 
that "hash java" is a useful suggestion...
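
To make the concern concrete, here is a rough sketch of the naive frequency-based approach. Everything in it is an assumption for illustration: the class NaiveFrequencySuggester is hypothetical, the counts in the usage note are invented rather than taken from the SearchMorph index, and a real version would read term frequencies from the Lucene index instead of a hand-filled map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical: completes a query with the most frequent index terms. */
class NaiveFrequencySuggester {
    // term -> number of documents containing it (filled in by the caller)
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final Set<String> stopWords;

    NaiveFrequencySuggester(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    void addTerm(String term, int freq) {
        docFreq.put(term, freq);
    }

    /** Returns up to max suggestions of the form "typed term". */
    List<String> suggest(String typed, int max) {
        List<Map.Entry<String, Integer>> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            // Skip stop words and the word the user already typed.
            if (!stopWords.contains(e.getKey()) && !e.getKey().equals(typed)) {
                candidates.add(e);
            }
        }
        // Most frequent first: frequency is the only signal used.
        candidates.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : candidates) {
            if (out.size() == max) {
                break;
            }
            out.add(typed + " " + e.getKey());
        }
        return out;
    }
}
```

With made-up counts such as java=9000, map=1200, tree=300, the top suggestion for "hash" is "hash java", ahead of "hash map". That is the unhelpful result described above, and it is why a table of query popularity looks like the better source.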

> So
> if you type fast, it doesn't hit the server until you pause.  There
> are some more detailed postings on slashdot about how it works.
> 
> On Fri, 10 Dec 2004 16:36:27 -0800, David Spencer
> <dave-lucene-user@tropo.com> wrote:
> 
>>Google just came out with a page that gives you feedback as to how many
>>pages will match your query and variations on it:
>>
>>http://www.google.com/webhp?complete=1&hl=en
>>
>>I had an unexposed experiment I had done with Lucene a few months ago
>>that this has inspired me to expose - it's not the same, but it's
>>similar in that as you type in a query you're given *immediate* feedback
>>as to how many pages match.
>>
>>Try it here: http://www.searchmorph.com/kat/isearch.html
>>
>>This is my "SearchMorph" site which has an index of ~90k pages of open
>>source javadoc packages.
>>
>>As you type in a query, on every keystroke it does at least one Lucene
>>search to show results in the bottom part of the page.
>>
>>It also gives spelling corrections (using my "NGramSpeller"
>>contribution) and also suggests popular tokens that start the same way
>>as your search query.
>>
>>For one way to see corrections in action, type in "rollback" character
>>by character (don't do a cut and paste).
>>
>>Note that:
>>-- this is not how the Google page works - just similar to it
>>-- I do single-word suggestions while Google does the more useful
>>whole-phrase suggestions (TBD I'll try to copy them)
>>-- They do lots of javascript magic, whereas I use old school frames mostly
>>-- this is relatively expensive, as it does 1 query per character, and
>>when it's doing spelling correction there is even more work going on
>>-- this is just an experiment and the page may be unstable as I fool with it
>>
>>What's nice is when you get used to immediate results, going back to the
>>"batch" way of searching seems backward, slow, and old fashioned.
>>
>>There are too many idle CPUs in the world - this is one way to keep them
>>busier :)
>>
>>-- Dave
>>
>>PS Weblog entry updated too:
>>http://www.searchmorph.com/weblog/index.php?id=26
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>