lucene-java-user mailing list archives

From "Chris Lu" <chris...@gmail.com>
Subject Re: factor in stopwords when searching
Date Sat, 22 Mar 2008 18:38:50 GMT
This was asked by a customer who may not know what "stop words" are at all.

Jake's approach should be quite similar to what some search engine
companies are doing. It costs some extra storage, but it can achieve a
good user experience.

The benefit is fairly obvious in the real world. When users enter a
query, they do not know that stop words like "the" are not in the
index at all.
If they are looking for a book titled "search the database" and other
books like "search database" are ranked at the top, that is not a good
user experience.
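
To make the ranking idea concrete, here is a rough sketch in plain Java
with no Lucene calls. Jake's message is not quoted in this mail, so this
assumes the approach means keeping a second, unfiltered copy of the text
(the extra storage) and adding a boost when the user's exact phrase,
stop words included, matches that copy. The class name, the stop list,
and the boost value are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopwordAwareRanking {
    // Hypothetical stop list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an", "of"));

    // Hypothetical weight for an exact-phrase hit in the unfiltered copy.
    static final double EXACT_BOOST = 2.0;

    // Lowercase, split on non-alphanumerics, optionally drop stop words.
    static List<String> tokenize(String text, boolean dropStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (t.isEmpty()) continue;
            if (dropStopWords && STOP_WORDS.contains(t)) continue;
            tokens.add(t);
        }
        return tokens;
    }

    // True if the phrase tokens occur contiguously in the document tokens.
    static boolean containsPhrase(List<String> doc, List<String> phrase) {
        return !phrase.isEmpty() && Collections.indexOfSubList(doc, phrase) >= 0;
    }

    // 1.0 for a match on the stop-word-filtered copy, plus EXACT_BOOST
    // when the unfiltered copy also contains the query exactly as typed.
    static double score(String query, String content) {
        double score = 0.0;
        if (containsPhrase(tokenize(content, true), tokenize(query, true))) {
            score += 1.0;
        }
        if (containsPhrase(tokenize(content, false), tokenize(query, false))) {
            score += EXACT_BOOST;
        }
        return score;
    }
}
```

With the two example documents from the original question, the query
"search the database" matches both on the filtered copy, but only
document A gets the boost from the unfiltered copy, so A ranks first.
In Lucene itself this would roughly correspond to indexing the content
twice, once with and once without stop word removal, and combining a
query on each field. Note that the sketch ignores the position gap a
removed stop word leaves behind, which is why the original query needed
a slop of 1.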

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request)
got 2.6 Million Euro funding!


On Sat, Mar 22, 2008 at 10:53 AM, Erick Erickson
<erickerickson@gmail.com> wrote:
> What's your reason for trying? The whole point of stop words is that
>  they should be considered "no ops". That is, they add nothing to the
>  semantics of whatever is being processed. I don't understand the use
>  case for why you want to go outside that assumption.
>
>  Another way of asking this is "what tangible benefit to your users
>  are you trying to implement"?
>
>  Best
>  Erick
>
>
>
>  On Fri, Mar 21, 2008 at 9:20 PM, Chris Lu <chris.lu@gmail.com> wrote:
>
>  > Let's say "the" is considered a stopword. For example, the two documents are
>  > document A, content: "... search the database..."
>  > document B, content: "... search database..."
>  >
>  > So when the user's input is "search the database", searching with
>  > query content:"search database"~1 can return both.
>  > But is there any way to translate that into a query that can rank the
>  > document A higher than document B?
>  >
>  > Thanks!
>  >
>
>
> > ---------------------------------------------------------------------
>  > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>  > For additional commands, e-mail: java-user-help@lucene.apache.org
>  >
>  >
>
