lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Massart" <dmass...@acm.org>
Subject Re: searching for string with a blank
Date Tue, 30 Sep 2008 14:40:51 GMT
Thanks Erick.

I see, the problem is that I use some fields tokenized and other untokenized
but that all the queries are parsed using my analyzer. Is this correct?

I suppose that removing the blanks from  learningResourceType before adding
it as an untokenized field helps (provided that I do the same operation at
query time).

Or are there more efficient ways to deal with the problem?

Cheers,

David

On Tue, Sep 30, 2008 at 3:51 PM, Erick Erickson <erickerickson@gmail.com>wrote:

> What *analyzer* are you using for your queries? Have Luke explain
> your queries and I suspect you'll see your problem. Which if you're
> using StandardAnalyzer is probably that 'case study'
> will look like (lrt:case lrt:study). Note that the implied operator is OR
> unless
> you've changed it. But these are compared against your tokens
>
> application
>
> assessment
>
> broadcast
>
> case study
>
> course
>
> demonstration
>
> drill and practice
>
> educational game
>
> EACH OF WHICH IS A TOKEN. No indexed value matches
> "case" (the closest is "case study", but it doesn't match, since
> you indexed it UN_TOKENIZED, "case" != "case study")
> and similarly for "study".
>
> If this is completely off base, perhaps you can provide some additional
> detail.
>
> Best
> Erick
>
> On Tue, Sep 30, 2008 at 9:24 AM, David Massart <dmassart@acm.org> wrote:
>
> > Hi all,
> > Here is a new newbee question.
> >
> > I'm adding to a lucene index, documents containing a field lrt created as
> > follows:
> >
> > doc.add( new Field( "lrt", learningResourceType
> >
> > .toLowerCase(), Field.Store.NO,
> >
> > Field.Index.UN_TOKENIZED ) ) ;
> >
> >
> > where possible values for learningResourceType are:
> >
> >
> > application
> >
> > assessment
> >
> > broadcast
> >
> > case study
> >
> > course
> >
> > demonstration
> >
> > drill and practice
> >
> > educational game
> >
> >
> > In this index, searching for a string that doesn't contain a blank
> > character
> > works  fine
> >
> >
> > for example, lrt: application
> >
> >
> > but searching for a string with a blank does not return any results:
> >
> >
> > for example, lrt: case study doesn't work
> >
> >
> > Using lukeall, I can successfully query my index by escaping the blank
> with
> > a backslash
> >
> >
> > for example, lrt: case\ study
> >
> >
> > But it doesn't work when I try to do the same thing in Java
> >
> >
> > String query = "lrt: case\\ study"
> >
> >
> > doesn't return any result.
> >
> >
> > I've also try to QueryParser.escape( "case study") but it doesn't seem to
> > affect blank characters?
> >
> >
> > Thanks for your help.
> >
> >
> > David
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message