lucene-dev mailing list archives

From: "Yonik Seeley" <ysee...@gmail.com>
Subject: Re: Lucene 1.9 RC1 release available
Date: Tue, 21 Feb 2006 23:27:50 GMT

Terry,

I think most of the examples you provide are normally handled via stemming.
Using wildcarding for stemming will normally be less accurate.
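
To illustrate the stemming point, here is a minimal sketch of a toy suffix-stripping stemmer. It is not Lucene's analyzer API and not a real stemming algorithm such as Porter's; the names `SUFFIXES` and `toy_stem` and the three-suffix list are assumptions made up for this example. The idea is that if the same reduction is applied at index time and at query time, a plain query for "riot" reaches the inflected forms without any wildcard.

```python
# Toy suffix stripper, for illustration only (hypothetical, not a Lucene API).
SUFFIXES = ("ers", "ing", "s")  # assumed suffix list; longer suffixes first


def toy_stem(term):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def matches_by_stem(query_term, indexed_terms):
    """Return the indexed terms whose toy stem equals the query's toy stem."""
    target = toy_stem(query_term)
    return [t for t in indexed_terms if toy_stem(t) == target]


if __name__ == "__main__":
    terms = ["riot", "riots", "rioting", "rioters", "cat", "cats", "cater", "category"]
    print(matches_by_stem("riot", terms))
    print(matches_by_stem("cat", terms))
```

A list this crude also mis-stems some words (it would reduce "elders" to "eld", for instance), which is why a real analyzer would use a proper stemmer.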

The current behavior is also consistent with the way file globbing works.
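
The two readings of `?` discussed in this thread can be compared with a small sketch. This is a model of the semantics as described in the messages, not Lucene's QueryParser or WildcardQuery code: `wildcard_to_regex`, `matching_terms`, and the `optional_qmark` flag are hypothetical names, and treating every `?` as "zero or one character" in the lenient mode is an assumption based on the 1.4.x behavior Terry reports. In the strict mode `?` matches exactly one character, as it does in shell globbing.

```python
import re


def wildcard_to_regex(pattern, optional_qmark=False):
    """Translate a wildcard pattern into a compiled regular expression.

    '*' matches any run of characters. '?' matches exactly one character,
    or zero-or-one when optional_qmark is True (assumed legacy reading).
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?" if optional_qmark else ".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matching_terms(pattern, terms, optional_qmark=False):
    """Return the terms that the whole pattern matches."""
    regex = wildcard_to_regex(pattern, optional_qmark)
    return [t for t in terms if regex.fullmatch(t)]


if __name__ == "__main__":
    terms = ["cat", "cats", "cater", "category",
             "riot", "riots", "rioting", "rioters"]
    for pattern in ("cat?", "cat*", "riot???"):
        print(pattern, "strict: ", matching_terms(pattern, terms))
        print(pattern, "lenient:", matching_terms(pattern, terms, optional_qmark=True))
```

Under the strict reading, `cat?` selects only "cats" and `riot???` only the seven-letter forms; under the lenient reading the shorter forms match as well, while `cat*` pulls in "cater" and "category" in either mode.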

-Yonik


On 2/21/06, Terry Steichen <terry@net-frame.com> wrote:
> Yonik,
>
> No, I don't think that the riot* option would work for many queries.
> Let's take a simple case where you want a singular or plural form, like
> either cat or cats (which would be very common).  With 1.4.x, you can
> use cat? to retrieve such matches.  With the new change, you need to use
> (cat cats) or (cat cat?).  If you use cat*, you'll get a million matches
> you don't want (cater, catches, catwoman, category, catatonic,
> cataclysm, catamount, etc.).  Or, take a case where you want to retrieve
> terms like elder, elderly, elders but do not want things like
> elderberry, elderdice.  Or you want gun or guns, but not gunmen,
> gunshots, gunfire, gunpoint, gunston, etc.
>
> In contrast, as you appear to agree, it would actually be a fairly rare
> case where you really need a specific number of characters in the term.
>
> So, I would opt to either leave the behavior as it was in 1.4.x or
> provide a flag (defaulting either way).
>
> Terry
>
> Yonik Seeley wrote:
>
> >On 2/21/06, Terry Steichen <terry@net-frame.com> wrote:
> >
> >
> >>For example, let's say that I'm interested in docs with terms 'riot',
> >>'riots', 'rioting' and 'rioters' (which, I think, is a reasonable kind
> >>of query).  Under the previous versions of QueryParser, I could simply
> >>specify 'riot???' and capture all of those variants.
> >>
> >>
> >
> >Wouldn't the prefix query riot* fit the bill?
> >I would think that wanting 1, 2, or 3 additional characters, but no
> >more, would be a fairly rare case, yes?  And there might also be a rare
> >case where you want exactly 3 additional characters... the new change
> >makes both possible.
> >
> >-Yonik
> >
> >---------------------------------------------------------------------
> >To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >For additional commands, e-mail: java-dev-help@lucene.apache.org
> >
> >
> >
> >
>
