lucene-java-user mailing list archives

From JensBurkhardt <>
Subject Re: combine wildcard and phrase query
Date Fri, 07 Mar 2008 15:35:39 GMT

hi again,

Referring to my second issue, I've got another question. The field
approach works pretty well, but my fields look like this:
signature: LA A 100
signature: LA A 201
signature: LA A 202
signature: LA B 200
signature: LC B 300
Now I use getFields and search them.
Let's assume I'm searching for a signature like "LA B 200". If I use a
phrase query, no problem: I search all the fields, and only if the field
value and the query match exactly do I get a hit.
But what if I want to use wildcards and search for something like LA A
20*? Now all the LA signatures show up in my results, even though I only
want two of them (LA A 201 and LA A 202). The problem is the blanks, but I
have no idea how to make this work.
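
To make the problem concrete, here is a minimal sketch in plain Java (not Lucene API). It assumes the signature field is tokenized on whitespace and that the query terms are ORed together, which would explain why every LA signature comes back. For contrast it also treats each signature as one unanalyzed term and applies the wildcard as a prefix over the whole value. The class and method names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SignatureWildcardSketch {

    // The five signature values from the example above.
    static final List<String> SIGNATURES = Arrays.asList(
            "LA A 100", "LA A 201", "LA A 202", "LA B 200", "LC B 300");

    // A trailing '*' means "starts with"; otherwise the term must match exactly.
    static boolean termMatches(String queryTerm, String candidate) {
        if (queryTerm.endsWith("*")) {
            String prefix = queryTerm.substring(0, queryTerm.length() - 1);
            return candidate.startsWith(prefix);
        }
        return candidate.equals(queryTerm);
    }

    // Assumption: field tokenized on blanks, query terms ORed.
    // A signature is a hit as soon as ANY query term matches ANY of its tokens.
    static List<String> searchTokenizedOr(String query) {
        List<String> hits = new ArrayList<>();
        for (String signature : SIGNATURES) {
            boolean hit = false;
            for (String queryTerm : query.split("\\s+")) {
                for (String token : signature.split("\\s+")) {
                    if (termMatches(queryTerm, token)) {
                        hit = true;
                    }
                }
            }
            if (hit) {
                hits.add(signature);
            }
        }
        return hits;
    }

    // Alternative: the whole signature, blanks included, is one term,
    // so the wildcard acts as a prefix over the complete value.
    static List<String> searchWholeValue(String query) {
        List<String> hits = new ArrayList<>();
        for (String signature : SIGNATURES) {
            if (termMatches(query, signature)) {
                hits.add(signature);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(searchTokenizedOr("LA A 20*")); // all four LA signatures
        System.out.println(searchWholeValue("LA A 20*"));  // only LA A 201 and LA A 202
    }
}
```

Whether the second behaviour is achievable depends on how the field is analyzed at index time and how the query is built; the sketch only shows why the blanks matter.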

Thanks and have a nice evening


Erick Erickson wrote:
> No, as far as I know you can't combine wildcards in phrases. This would
> get extraordinarily ugly extraordinarily quickly. The way Lucene handles
> wildcards (conceptually) is to expand all the possible terms into a large
> OR clause. Say my index contains term1, term2, and term3. The search for
> term* really expands into term1 OR term2 OR term3. Now imagine the
> complexity of a phrase like "dog* cat* hors*". Say your index contained
> 10 terms starting with dog, 10 with cat and 10 with hors. You'd have
> 10 x 10 x 10 = 1,000 ORed phrase queries. And this is a tiny example....
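
The expansion described above can be sketched in plain Java. This is a conceptual illustration only, not how Lucene implements it internally; `expand`, `expandPhrase` and `sampleIndex` are hypothetical helpers, and only trailing `*` wildcards are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class WildcardExpansionSketch {

    // Expand one query term against the index vocabulary.
    // "term*" becomes every indexed term that starts with "term".
    static List<String> expand(String pattern, Collection<String> indexTerms) {
        boolean isPrefix = pattern.endsWith("*");
        String stem = isPrefix ? pattern.substring(0, pattern.length() - 1) : pattern;
        List<String> expanded = new ArrayList<>();
        for (String term : indexTerms) {
            if (isPrefix ? term.startsWith(stem) : term.equals(stem)) {
                expanded.add(term);
            }
        }
        return expanded;
    }

    // Expand a phrase of wildcard terms into every concrete phrase:
    // the cross product of the per-term expansions.
    static List<String> expandPhrase(List<String> patterns, Collection<String> indexTerms) {
        List<String> phrases = new ArrayList<>();
        phrases.add("");
        for (String pattern : patterns) {
            List<String> next = new ArrayList<>();
            for (String head : phrases) {
                for (String term : expand(pattern, indexTerms)) {
                    next.add(head.isEmpty() ? term : head + " " + term);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    // Made-up vocabulary: 10 terms each starting with dog, cat and hors.
    static List<String> sampleIndex() {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            terms.add("dog" + i);
            terms.add("cat" + i);
            terms.add("hors" + i);
        }
        return terms;
    }

    public static void main(String[] args) {
        // term* -> term1 OR term2 OR term3
        System.out.println(expand("term*", Arrays.asList("term1", "term2", "term3")));
        // "dog* cat* hors*" -> 10 x 10 x 10 concrete phrases
        System.out.println(expandPhrase(Arrays.asList("dog*", "cat*", "hors*"), sampleIndex()).size());
    }
}
```

The number of concrete phrases is the product of the per-term expansion counts, which is why it grows so quickly with vocabulary size.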
> You can try various approximations, and depending upon your index size
> they may or may not work. For instance, you could index all the
> successively shorter forms, with position increments of 0 (see synonym
> analyzer). I.e. index horse, hors$, hor$, ho$, h$ all in the same
> position. Then searching for hor* becomes searching for hor$ and it all
> "just works". Of course this makes your index bigger.....
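
A minimal sketch of that approximation, in plain Java rather than as a Lucene analyzer. It assumes whitespace tokenization and models "same position" as a set of forms per token position; `indexForms`, `rewriteQueryTerm`, `indexText` and `phraseMatches` are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PrefixMarkerSketch {

    // horse -> horse, hors$, hor$, ho$, h$ (all meant to share one position).
    static List<String> indexForms(String term) {
        List<String> forms = new ArrayList<>();
        forms.add(term);
        for (int len = term.length() - 1; len >= 1; len--) {
            forms.add(term.substring(0, len) + "$");
        }
        return forms;
    }

    // hor* -> hor$, so a wildcard term becomes an ordinary exact term.
    static String rewriteQueryTerm(String queryTerm) {
        if (queryTerm.endsWith("*")) {
            return queryTerm.substring(0, queryTerm.length() - 1) + "$";
        }
        return queryTerm;
    }

    // One set of forms per token position, i.e. position increment 0
    // between the forms of the same token.
    static List<Set<String>> indexText(String text) {
        List<Set<String>> positions = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            positions.add(new HashSet<>(indexForms(token)));
        }
        return positions;
    }

    // Exact phrase match over consecutive positions, after rewriting wildcards.
    static boolean phraseMatches(List<Set<String>> positions, String phrase) {
        String[] queryTerms = phrase.split("\\s+");
        for (int start = 0; start + queryTerms.length <= positions.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < queryTerms.length; i++) {
                if (!positions.get(start + i).contains(rewriteQueryTerm(queryTerms[i]))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(indexForms("horse"));
        List<Set<String>> index = indexText("the doghouse catalog horses");
        System.out.println(phraseMatches(index, "dog* cat* hors*")); // phrase in order
        System.out.println(phraseMatches(index, "cat* dog*"));       // wrong order
    }
}
```

With exactly the forms listed, a query such as horse* would be rewritten to horse$, which is not among the indexed forms; indexing horse$ as well would cover that case. Each token of length n contributes n forms, which is the index growth mentioned above.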
> About your second issue: I'm not clear what you're trying to accomplish.
> It's no problem to add the same field multiple times for a document. That
> is, you can
> doc.add(new Field("field1", ......));
> doc.add(new Field("field1", ......));
> doc.add(new Field("field1", ......));
> doc.add(new Field("field1", ......));
> as many times as you want before you add the document to the index. For
> retrieval you can call getFields("field1") and get an array of Fields
> back, one for each call to add above. You can also set the
> PositionIncrementGap while indexing to separate the term position of the
> first term of successive add() calls by, say, 100 (or whatever) if you
> need to worry about SpanNear or some such.
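
The effect of a position gap between successive values of the same field can be sketched as follows. This is plain Java with a simplified position model (tokens split on whitespace, next value starts at last position + 1 + gap); the exact arithmetic in Lucene may differ, and `assignPositions` and `adjacent` are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class PositionGapSketch {

    // Assign a position to every token of every value added under one field name.
    // After each value, skip 'gap' extra positions before the next value starts.
    static SortedMap<Integer, String> assignPositions(List<String> values, int gap) {
        SortedMap<Integer, String> termByPosition = new TreeMap<>();
        int next = 0;
        for (String value : values) {
            for (String token : value.split("\\s+")) {
                termByPosition.put(next++, token);
            }
            next += gap;
        }
        return termByPosition;
    }

    // True if 'second' occurs at the position directly after 'first',
    // which is what an exact two-term phrase would require.
    static boolean adjacent(SortedMap<Integer, String> termByPosition, String first, String second) {
        for (Map.Entry<Integer, String> entry : termByPosition.entrySet()) {
            if (entry.getValue().equals(first)
                    && second.equals(termByPosition.get(entry.getKey() + 1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> signatures = Arrays.asList("LA A 100", "LA B 200");

        // Gap 0: the last token of one value sits right before the first token
        // of the next, so "100 LA" looks like a phrase spanning two values.
        System.out.println(adjacent(assignPositions(signatures, 0), "100", "LA"));

        // Gap 100: the values are far apart, so the cross-value phrase disappears
        // while phrases inside a single value still match.
        System.out.println(adjacent(assignPositions(signatures, 100), "100", "LA"));
        System.out.println(adjacent(assignPositions(signatures, 100), "LA", "B"));
    }
}
```

A gap larger than any slop used at query time keeps phrase and span matches from crossing the boundary between two values of the same field.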
> This may be waaaay off base. If so, could you give a concrete example of
> what your inputs are and how you want to search them?
> Best
> Erick
>
> On Thu, Mar 6, 2008 at 7:28 AM, JensBurkhardt <> wrote:
>> okay, another problem occurred. I have different fields with the same
>> name. I can't separate them by naming them field1, field2 etc., because
>> while indexing I don't know how many fields I will need.
>> A book can have several signature numbers. I want to save each of them
>> in a field called signature, and when I search for such a number I want
>> the search to hit every single field on its own, not all fields together.
>> Right now I separate the values using a unique separator (in this case
>> just $$$) so I can split the string back into the numbers, but I think
>> this is kind of the worst way of doing it.
>> JensBurkhardt wrote:
>> >
>> > hey everybody,
>> >
>> > I'm wondering if it's possible to combine wildcards and phrase query.
>> >
>> > For example "term1 term*"
>> >
>> > I know that the documentation says "Lucene supports single and multiple
>> > character wildcard searches within single terms (not within phrase
>> > queries)" but maybe someone has had the same problem and found a
>> > solution.
>> >
>> > Thanks for your help
>> >
>> > Jens Burkhardt
>> >
>> --
>> View this message in context:
>> Sent from the Lucene - Java Users mailing list archive at
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
