lucene-java-user mailing list archives

From Jochen Hebbrecht <jochenhebbre...@gmail.com>
Subject Re: TermRangeQuery with multiple words
Date Mon, 20 Aug 2012 13:35:57 GMT
Hehe Ian, our mails just crossed. I was thinking along the same lines! :-).
Thanks for your reply!

2012/8/20 Ian Lea <ian.lea@gmail.com>

> Jochen
>
>
> No, I don't think that Lucene can make a String range query on
> multiple terms.  For your Microsoft example you could build a query
> with Microsoft as required TermQuery and a required TermRangeQuery
> from Belgium to Spain but that would fall apart with multiword company
> or region names.
>
> It feels like there should be a moderately simple answer, but I can't
> spot it.  Maybe someone else can.
>
>
> --
> Ian.
>
>
> On Mon, Aug 20, 2012 at 2:13 PM, Jochen Hebbrecht
> <jochenhebbrecht@gmail.com> wrote:
> > Hi Ian,
> >
> > Thanks for your answer!
> > Well, my example might not have been so clear. Here's a better example:
> >
> > Doc 01: TEST: "Microsoft Belgium"
> > Doc 02: TEST: "Apple"
> > Doc 03: TEST: "Microsoft France"
> > Doc 04: TEST: "Evian"
> > Doc 05: TEST: "Nokia"
> > Doc 06: TEST: "Novotel"
> > Doc 07: TEST: "Microsoft Germany"
> > Doc 08: TEST: "Microsoft Spain"
> >
> >
> > Now, I want to search for all documents which have the field TEST going
> > from "Microsoft Belgium" to "Microsoft Spain".
> > The problem is, I cannot search on multiple terms in a range :-( ...
> >
> > What I can do is search from "Microsoft" to "Microsoft"; that one
> > works. But not the range stated above ...
> > So the question is: can Lucene make a String range query on multiple
> > terms?
> >
> > Kind regards,
> > Jochen
> >
> >
> > 2012/8/20 Ian Lea <ian.lea@gmail.com>
> >
> >> This won't work with TermRangeQuery because neither "test 1" nor "test
> >> 3" is a term.  "test" will be a term, output by the analyzer.  You'll
> >> be able to see the indexed terms in Luke.
> >>
> >> Sounds very flaky anyway - you'd get "test 10 xxx" and "test 100 xxx"
> >> as well as "test 1" and "test 2".  If your TEST values are that
> >> predictable you could split them up and index the number separately,
> >> maybe using NumericField and build a query using NumericRangeQuery.
> >>
> >> RegexQuery in contrib-queries might also be worth a look.
> >>
> >>
> >> --
> >> Ian.
> >>
> >> On Mon, Aug 20, 2012 at 12:59 PM, Jochen Hebbrecht
> >> <jochenhebbrecht@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I have 5 documents. Each document has a field TEST. The overall
> >> > structure looks like this:
> >> >
> >> > Doc 01: TEST: "test 1 string"
> >> > Doc 02: TEST: "test 2 string"
> >> > Doc 03: TEST: "test 3 string"
> >> > Doc 04: TEST: "test 4 string"
> >> > Doc 05: TEST: "test 5 string"
> >> >
> >> > These fields are indexed as Index.Analyzed with the StandardAnalyzer.
> >> > With Luke, I can see for example:
> >> >
> >> > Document: Doc 01
> >> > Field: TEST
> >> > Terms: test, 1, string
> >> >
> >> > But now I want to run a range search like this:
> >> >
> >> > <<
> >> > new TermRangeQuery("TEST", "test 1", "test 3", true, true);
> >> > >>
> >> >
> >> > ... to pick up the first 3 documents. Unfortunately, this doesn't seem
> >> > to work for multiple words.
> >> >
> >> > Can somebody help me correct my TermRangeQuery?
> >> >
> >> > Thanks!
> >> > Jochen
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
>
>
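
A rough illustration of the two points Ian makes in the quoted replies, written as plain Java rather than against Lucene itself. Everything here is an assumption made for the sketch: the class RangeSketch and its methods are hypothetical names, and tokens() is only a crude stand-in for an analyzer (lowercase, split on whitespace), not StandardAnalyzer. It models a range query as "does any single indexed term fall lexicographically between the two bounds", which is the behaviour the thread describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RangeSketch {

    // Crude stand-in for an analyzer (assumption): lowercase, split on whitespace.
    static List<String> tokens(String value) {
        return Arrays.asList(value.toLowerCase().split("\\s+"));
    }

    // True if any single token lies in [lower, upper], compared as plain strings.
    static boolean anyTokenInRange(List<String> toks, String lower, String upper) {
        for (String t : toks) {
            if (t.compareTo(lower) >= 0 && t.compareTo(upper) <= 0) {
                return true;
            }
        }
        return false;
    }

    // Models "required term AND required term range" over the same field.
    static List<String> match(Map<String, String> docs, String required,
                              String lower, String upper) {
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            List<String> toks = tokens(e.getValue());
            if (toks.contains(required) && anyTokenInRange(toks, lower, upper)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // The eight TEST values from the second example in the thread.
    static Map<String, String> sampleDocs() {
        Map<String, String> docs = new LinkedHashMap<String, String>();
        docs.put("Doc 01", "Microsoft Belgium");
        docs.put("Doc 02", "Apple");
        docs.put("Doc 03", "Microsoft France");
        docs.put("Doc 04", "Evian");
        docs.put("Doc 05", "Nokia");
        docs.put("Doc 06", "Novotel");
        docs.put("Doc 07", "Microsoft Germany");
        docs.put("Doc 08", "Microsoft Spain");
        return docs;
    }

    public static void main(String[] args) {
        // No token of "test 1 string" lies between "test 1" and "test 3".
        System.out.println(anyTokenInRange(tokens("test 1 string"), "test 1", "test 3"));
        // Required "microsoft" plus a range from "belgium" to "spain".
        System.out.println(match(sampleDocs(), "microsoft", "belgium", "spain"));
    }
}
```

Running main prints false and then [Doc 01, Doc 03, Doc 07, Doc 08]. The first result reflects that "test", "1" and "string" each sort before "test 1", so a range between two multiword strings finds no indexed term. The second matches the four Microsoft documents, but note that in this model the token "microsoft" itself sorts between "belgium" and "spain", so a value such as "Microsoft Zimbabwe" would match as well; that is one more way the combined query can go wrong, besides the multiword names Ian mentions.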
