lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: Newbie: Luke and fields
Date Wed, 09 Sep 2009 13:27:56 GMT
It's all in the analyzers. Depending upon which analyzer you use manythings
happen to the input stream. Casing is one example, but that's just
the simplest. Which is why it's so important to use the same analyzer
when indexing and querying unless you have a *very* good reason not to.

I'd really advise looking at the Analyzer docs to see what transformations
they make. You'll see they're composed of Filters that perform various
transformations..

Best
Erick

On Fri, Sep 4, 2009 at 10:07 PM, Ian Vink <ianvink@gmail.com> wrote:

> Case?
>
> Hmm. I thought it was case intensive.
>
> I will re-index and revert all to lower case and see. The language is
> stored
> as "English" not "english"
>
> ian
>
> P.S. Here's what I built with a basic understanding of Lucene.
> http://BahaiResearch.com it's open source, ad-free. It allows people in 20
> languages search the texts of any religion. Promotes understanding among
> the
> different peoples of the world. Chinese Buddhists can better understand say
> Spanish Baha'is or Dutch Jews.
>
>
> On Fri, Sep 4, 2009 at 10:02 PM, Erick Erickson <erickerickson@gmail.com
> >wrote:
>
> > Hmmmmm. Let's see the queries, and query.toString() may give
> > you some clues. I *suspect* that you really didn't index language.
> > Did you, perhaps, not re-index all your docs? Or use an analyzer
> > that didn't fold case when searching but did when searching (or
> > vice-versa)?
> >
> > It's *possible* that you've got some kind of case problem. I've also
> > done really silly things like stared at the field for hours and never
> > noticed that it was "langauge".
> >
> > Two things would help:
> > 1> show us the index code you actually use.
> > 2> make an index with, say, two documents and attach it to a post. Maybe
> >     someone with eyes less tired than yours will spot the issue.
> >
> > Because you're right. This *should* work just fine, so there's probably
> > something that'll make you slap your forehead when you figure it out.
> > I speak from experience here, I've given myself lumps on my forehead
> > when someone pointed out *perfectly obvious* problems that I couldn't
> > find for hours<G>.
> >
> > Best
> > Erick
> >
> > On Fri, Sep 4, 2009 at 7:55 PM, Ian Vink <ianvink@gmail.com> wrote:
> >
> > > I have created an index and each document has a contents field and a
> > > language field.
> > >
> > > contents has the flags: Indexed Tokenized Stored Vector
> > > language has the flags: Indexed Stored
> > >
> > > In luke I can search contents fine, but when I try to search the field
> > > language, I never ever get results.
> > >
> > > Every document has a language field populated which I can see in Luke's
> > > document browser.
> > >
> > > I have tried language:english and even made language the default field.
> > >
> > > I gotta be missing something simple....
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message