lucy-user mailing list archives

From Nathan Kurz <n...@verse.com>
Subject Re: [lucy-user] Change default Boolean operator from OR to AND (via default_boolop)
Date Mon, 24 Oct 2011 05:35:35 GMT
On Sat, Oct 22, 2011 at 11:17 PM, goran kent <gorankent@gmail.com> wrote:
> In fact, google seems to go further.  My tests show that google
> changes the above query to:
>
> +site:test.com +bob

You're so last week about this, Goran! :)

Since we last wrote, Google has changed their behaviour to disallow
'+required' in queries, and now gives an error message telling you to
use "required" with double quotes instead:
https://news.ycombinator.com/item?id=3140797

That aside, I think it was actually converted to ~bob, which I believe
still works.

> I'm trying to mimic the expected behaviour as closely as possible so
> as not to frustrate/alienate my users.

Seeing as Google has changed their long-standing behaviour several
times (from all words required, to stemming allowed, to synonyms by
default, to making it quite hard to "get" "exact" "results"), I
wouldn't worry too much about it.  Normal users don't ever use
special features, and even quoted phrases are only used by a tiny
minority.  Heck, I'm sure there are some users who never use multiple
terms.

Advanced users want it to work correctly, and don't really care what
Google is currently doing.

> So, fiddle with the query terms behind the scenes and transform them
> to +site:test.com +bob...  an idea which doesn't feel right.

Certainly the easiest approach.  Maybe do this until you can test it
with real users?
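
A minimal sketch of that behind-the-scenes rewrite, in plain Perl with
no Lucy dependency.  The helper name require_all_terms is made up for
illustration and is not part of Lucy.  It assumes simple queries only:
it does not interpret parentheses, and it passes AND/OR/NOT through
untouched rather than trying to reason about them.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): prefix every bare term with '+'
# so that it becomes required.  Terms that already start with '+' or '-',
# and the keywords AND/OR/NOT, are passed through unchanged.  A quoted
# phrase (optionally with a field: prefix) is kept together as one term.
sub require_all_terms {
    my ($query_string) = @_;

    # Tokenize into quoted phrases or runs of non-whitespace.
    my @tokens = $query_string =~ /((?:[+-]?(?:\w+:)?"[^"]*")|\S+)/g;

    my @out;
    for my $token (@tokens) {
        if ( $token =~ /^[+-]/ or $token =~ /^(?:AND|OR|NOT)$/ ) {
            push @out, $token;       # already explicit, leave as-is
        }
        else {
            push @out, "+$token";    # make the bare term required
        }
    }
    return join ' ', @out;
}

print require_all_terms('site:test.com bob'), "\n";
```

The rewritten string would then go to the QueryParser's parse method
as usual, in place of the user's original input.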

> ...or, change the default QueryParser behaviour from OR to AND:
>
> my $query_parser = Lucy::Search::QueryParser->new(
>    schema => $schema,
>    default_boolop => 'AND',
> );
>
> I have a feeling that google is defaulting to AND for most cases.

They used to, back when they catered to experienced users.  Currently,
they are a lot more free-form.  For the first time since Backrub, I'm
actively searching for a new search engine.

But like Marvin says, do what's right for your data set and your
users.  Personally, I'm a firm believer in AND, and that all non-exact
matches should be clearly marked as such.

--nate
