lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: TermRangeQuery with multiple words
Date Mon, 20 Aug 2012 13:32:55 GMT
Jochen


No, I don't think Lucene can run a String range query across
multiple terms.  For your Microsoft example you could build a query
with Microsoft as a required TermQuery plus a required TermRangeQuery
from Belgium to Spain, but that would fall apart with multiword company
or region names.
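
To make the idea above concrete, here is a minimal plain-Java sketch. It is a
model only and does not call Lucene: the `terms` helper is a rough stand-in
for an analyzer (lowercase, split on whitespace), and the class name, the
method names, and the "Microsoft United Kingdom" document are all
hypothetical. It assumes a range clause matches a document when any one of
its terms falls between the bounds in String order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of "required term AND required term range" on one field.
// This is NOT Lucene code; every name here is hypothetical.
class RequiredTermPlusRangeSketch {

    // Rough stand-in for an analyzer: lowercase, then split on whitespace.
    static Set<String> terms(String fieldValue) {
        return new LinkedHashSet<>(Arrays.asList(
                fieldValue.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    // True if any single term t satisfies lower <= t <= upper in String order.
    static boolean anyTermInRange(Set<String> terms, String lower, String upper) {
        for (String t : terms) {
            if (t.compareTo(lower) >= 0 && t.compareTo(upper) <= 0) {
                return true;
            }
        }
        return false;
    }

    // Both clauses are required: the exact term and the range.
    static boolean matches(String fieldValue, String requiredTerm,
                           String lower, String upper) {
        Set<String> ts = terms(fieldValue);
        return ts.contains(requiredTerm) && anyTermInRange(ts, lower, upper);
    }

    public static void main(String[] args) {
        String[] docs = {
            "Microsoft Belgium", "Apple", "Microsoft France", "Evian",
            "Nokia", "Novotel", "Microsoft Germany", "Microsoft Spain",
            "Microsoft United Kingdom" // hypothetical multiword region
        };
        for (String d : docs) {
            System.out.println(d + " -> "
                    + matches(d, "microsoft", "belgium", "spain"));
        }
    }
}
```

In this model the hypothetical "Microsoft United Kingdom" document matches
even though the whole string sorts after "Microsoft Spain", because the range
is tested against each term on its own. Note also that the term "microsoft"
itself sorts between "belgium" and "spain", so in this model the range clause
is satisfied by every document that contains it.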

It feels like there should be a moderately simple answer, but I can't
spot it.  Maybe someone else can.


--
Ian.


On Mon, Aug 20, 2012 at 2:13 PM, Jochen Hebbrecht
<jochenhebbrecht@gmail.com> wrote:
> Hi Ian,
>
> Thanks for your answer!
> Well, my example might not have been very clear. Here's a better example:
>
> Doc 01: TEST: "Microsoft Belgium"
> Doc 02: TEST: "Apple"
> Doc 03: TEST: "Microsoft France"
> Doc 04: TEST: "Evian"
> Doc 05: TEST: "Nokia"
> Doc 06: TEST: "Novotel"
> Doc 07: TEST: "Microsoft Germany"
> Doc 08: TEST: "Microsoft Spain"
>
>
> Now, I want to search for all documents which have the field TEST going
> from "Microsoft Belgium" to "Microsoft Spain".
> The problem is, I cannot search on multiple terms in a range :-( ...
>
> What I can do is search from "Microsoft" to "Microsoft"; that one
> works, but the one stated above does not.
> So the question is: can Lucene make a String range query on multiple terms?
>
> Kind regards,
> Jochen
>
>
> 2012/8/20 Ian Lea <ian.lea@gmail.com>
>
>> This won't work with TermRangeQuery because neither "test 1" nor "test
>> 3" is a term.  "test" will be a term, output by the analyzer.  You'll
>> be able to see the indexed terms in Luke.
>>
>> Sounds very flaky anyway: you'd get "test 10 xxx" and "test 100 xxx"
>> as well as "test 1" and "test 2".  If your TEST values are that
>> predictable you could split them up and index the number separately,
>> maybe using NumericField, and build a query using NumericRangeQuery.
>>
>> RegexQuery in contrib-queries might also be worth a look.
>>
>>
>> --
>> Ian.
>>
>> On Mon, Aug 20, 2012 at 12:59 PM, Jochen Hebbrecht
>> <jochenhebbrecht@gmail.com> wrote:
>> > Hi,
>> >
>> > I have 5 documents. Each document has a field TEST. The overall
>> > structure looks like this:
>> >
>> > Doc 01: TEST: "test 1 string"
>> > Doc 02: TEST: "test 2 string"
>> > Doc 03: TEST: "test 3 string"
>> > Doc 04: TEST: "test 4 string"
>> > Doc 05: TEST: "test 5 string"
>> >
>> > These fields are indexed as Index.ANALYZED with the StandardAnalyzer.
>> > With Luke, I can see for example:
>> >
>> > Document: Doc 01
>> > Field: TEST
>> > Terms: test, 1, string
>> >
>> > But now I want to run a range search such as:
>> >
>> > <<
>> > new TermRangeQuery("TEST", "test 1", "test 3", true, true);
>> > >>
>> >
>> > ... to pick up the first 3 documents. Unfortunately, this doesn't seem to
>> > work for multiple words.
>> >
>> > Can somebody help me correct my TermRangeQuery?
>> >
>> > Thanks!
>> > Jochen
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
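
The two points in the quoted reply (the range bounds are not indexed terms,
and String order puts "10" between "1" and "3") can be checked with a small
plain-Java sketch. It does not use Lucene: the `terms` helper only
approximates an analyzer by lowercasing and splitting on whitespace, and
`numberOf` is a hypothetical helper that assumes every value has the shape
"test <n> string". The numeric check stands in for the idea of indexing the
number separately; it is not NumericRangeQuery itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java model of why the String range fails; NOT Lucene code.
// All names here are hypothetical.
class StringRangePitfallSketch {

    // Rough stand-in for an analyzer: lowercase, then split on whitespace.
    static List<String> terms(String fieldValue) {
        return Arrays.asList(
                fieldValue.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // Inclusive range check in String (lexicographic) order.
    static boolean inStringRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    // A term range is tested against each indexed term, one at a time.
    static boolean anyTermInRange(String fieldValue, String lower, String upper) {
        for (String t : terms(fieldValue)) {
            if (inStringRange(t, lower, upper)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: extract n from "test <n> string".
    static int numberOf(String fieldValue) {
        return Integer.parseInt(fieldValue.trim().split("\\s+")[1]);
    }

    // Range check on the number alone, as if it were indexed separately.
    static boolean inNumericRange(String fieldValue, int min, int max) {
        int n = numberOf(fieldValue);
        return n >= min && n <= max;
    }

    public static void main(String[] args) {
        // None of the terms "test", "1", "string" lies in ["test 1", "test 3"].
        System.out.println(anyTermInRange("test 1 string", "test 1", "test 3"));

        // Even on the whole string, "test 10 string" sorts inside the range.
        System.out.println(inStringRange("test 10 string", "test 1", "test 3"));

        // Comparing the numbers keeps 2 and rejects 10.
        System.out.println(inNumericRange("test 2 string", 1, 3));
        System.out.println(inNumericRange("test 10 string", 1, 3));
    }
}
```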
