lucene-java-user mailing list archives

From "no spam" <mrs.nos...@gmail.com>
Subject Re: default AND operator
Date Sun, 17 Sep 2006 17:44:25 GMT
Ok guys ... you're going to want to wield a big stick on me.  The problem
was my HitCollector: I wasn't actually passing it to my searcher.  Yes,
somewhere in my testing I had commented out that code, and it was making it
look like I wasn't getting hits.

One more question about IndexWriters (maybe I don't deserve an answer here
:-) )  .... I assume that the Analyzer used is applied and written to the
index per field.  So if I wanted one for Snowball or Stemming I'd have to
write multiple indexes?  I'm a bit confused as to how the Stemmed queries
are being matched against my StandardAnalyzer index.

Thanks for the help!
Mark
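
The analyzer question above can be illustrated with a toy sketch. This is plain Java, not Lucene code: the class name, the whitespace tokenizer, and the suffix-stripping "stemmer" are all assumptions made up for illustration (the real Snowball stemmer is far more involved). It only shows the general idea that a lookup compares the terms produced at index time with the terms produced at query time, so the two analysis chains need to agree.

```java
import java.util.*;

// Toy illustration only; none of these names are Lucene APIs.
final class AnalyzerMismatchSketch {
    // Toy "plain" analysis: split on whitespace and lower-case each token.
    static List<String> plain(String text) {
        List<String> out = new ArrayList<>();
        for (String tok : text.trim().split("\\s+")) {
            if (!tok.isEmpty()) out.add(tok.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Hypothetical stemmer: strips a trailing "es" or "s". Not Snowball.
    static String stem(String term) {
        if (term.endsWith("es") && term.length() > 4) {
            return term.substring(0, term.length() - 2);
        }
        if (term.endsWith("s") && term.length() > 3) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Toy "stemming" analysis: plain analysis followed by the toy stemmer.
    static List<String> stemmed(String text) {
        List<String> out = new ArrayList<>();
        for (String term : plain(text)) out.add(stem(term));
        return out;
    }

    // A document matches only if every query term is among its indexed terms.
    static boolean matchesAll(Set<String> indexedTerms, List<String> queryTerms) {
        return indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        Set<String> plainIndex = new HashSet<>(plain("French Antiques"));
        Set<String> stemmedIndex = new HashSet<>(stemmed("French Antiques"));
        String query = "french antiques";
        System.out.println(matchesAll(plainIndex, plain(query)));
        System.out.println(matchesAll(plainIndex, stemmed(query)));
        System.out.println(matchesAll(stemmedIndex, stemmed(query)));
    }
}
```

In this toy model, a stemmed query term such as "antiqu" cannot match a document indexed as "antiques", which is the kind of mismatch Erick's point 2 below is concerned about.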

On 9/17/06, Erick Erickson <erickerickson@gmail.com> wrote:
>
> Well, I'm puzzled as well, in my simple examples I just ran, the AND
> operator behaves just fine, but that was using StandardAnalyzer. So it's
> almost certain we're not talking about the same thing <G>...
>
> So, I guess I have a couple of suggestions:
>
> 1> Try your query without the stemmingAnalyzer. Try StandardAnalyzer (or
> even WhitespaceAnalyzer) and kind of build up to the stemmer. That'll at
> least narrow the problem space.
>
> 2> You might post more details about the stemmingAnalyzer you're using. It's
> possible that there's some innocuous-seeming line in the creation of the
> stemmingAnalyzer you're feeding into the query parser that's producing this
> behavior. Parenthetically, I'm not entirely sure you're not going to get
> into a heap o' trouble using a StandardAnalyzer to create the index then
> using a stemmingAnalyzer to query it. But, as you say, that's secondary to
> the default AND question. I should also add that I don't know enough about
> stemming analyzers to put in a thimble, so this is just a theoretical
> concern.
>
> 3> Create a small, self-contained program that demonstrates this issue and
> post it here. Or, even better, a junit test <G>.
>
> I think we've exhausted the generic issues you might be having and could get
> a much faster resolution with a complete example to look at. "The guys" have
> been generous with many posters in looking at actual code......
>
> Best
> Erick.
>
> P.S. Please post whatever the resolution is, I'm pretty curious what you
> find.
>
> On 9/17/06, no spam <mrs.nospam@gmail.com> wrote:
> >
> > I am new to Lucene so I'll admit I am confused by a few things.  I'm using
> > an index which was built with the StandardAnalyzer.  I have verified this by
> > using an IndexReader to read the docs back out ... Antiques is not Antiq in
> > the index.   So according to this note in the Lucene docs I would assume a
> > Query parsed without a stemming analyzer would have matched:
> >
> > "Note: The analyzer used to create the index will be used on the terms and
> > phrases in the query string. So it is important to choose an analyzer that
> > will not interfere with the terms used in the query string."
> >
> > But it's quite the opposite, only a query parsed with the stemming analyzer
> > is matching my queries.  So these are a few confusing issues which to me
> > seem a *bit* beside the point ... perhaps I'm wrong.
> >
> > HOWEVER .. I'm still confused as to why the AND operator isn't matching my
> > "french AND antiques" query regardless of the index.
> >
> > I will look into Luke ... thanks for your replies ... Mark
> >
> > On 9/17/06, Erick Erickson <erickerickson@gmail.com> wrote:
> > >
> > > Are you really, really sure that your *analyzer* isn't automatically
> > > lower-casing your *query* and turning "french AND antiques" into "french
> > > and antiques", then, as Chris says, treating "and" as a stop word?
> > >
> > > The fact that your parser transforms "antiques" into "antiqu" leads me to
> > > suspect that there's a lot more going on in the parser analyzer than you
> > > might expect....
> > >
> > > And, in case you haven't already found it, are you sure what your index
> > > contains? I've found luke (google luke lucene) to be very valuable for
> > > these kinds of questions, particularly your issue about stemming etc.
> > >
> > > Best
> > > Erick
> > >
> > >
> >
> >
>
>
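
The lower-casing concern raised in the earliest message quoted above can be sketched with a toy parser. This is not Lucene's QueryParser: the class name, the stop-word list, and the "+term" output notation are assumptions for illustration, and the sketch simplifies by making every term required as soon as one upper-case AND is seen. It only shows why "AND" and "and" could end up being treated very differently.

```java
import java.util.*;

// Toy illustration only; this is not how Lucene's QueryParser is implemented.
final class AndOperatorSketch {
    // Hypothetical stop-word list; real analyzers ship their own.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "or", "the", "a"));

    // Returns the surviving terms, each prefixed with "+" (required)
    // when an upper-case AND operator was present in the query.
    static String parse(String query) {
        List<String> terms = new ArrayList<>();
        boolean sawAnd = false;
        for (String tok : query.trim().split("\\s+")) {
            if (tok.equals("AND")) {      // operator is recognized only in upper case
                sawAnd = true;
                continue;
            }
            String term = tok.toLowerCase(Locale.ROOT);
            if (term.isEmpty() || STOP_WORDS.contains(term)) {
                continue;                 // a lower-case "and" is dropped as a stop word
            }
            terms.add(term);
        }
        StringBuilder sb = new StringBuilder();
        for (String term : terms) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(sawAnd ? "+" : "").append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("french AND antiques"));
        System.out.println(parse("french and antiques"));
    }
}
```

In this sketch, a query that has been lower-cased before parsing loses its operator entirely, so both terms become optional instead of required.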
