lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From JensBurkhardt <jensburkha...@web.de>
Subject Re: combine wildcard and phrase query
Date Thu, 06 Mar 2008 15:36:02 GMT

okay thanks. the first thing was what i've expected :-) . well about my
second issue,
i was totally wrong. Just forget what i've said! I had in mind that if i
have several fields with the same name
these fields are connected to a big string.
Now as i read your message i remember that this behavior is just with
boost-values ;-) and not with field-values. hell no ... 
thanks for your answers :-). You saved a lot of time ;-)

best regards
Jens


Erick Erickson wrote:
> 
> No, as far as I know you can't combine wildcards in phrases. This would
> get extraordinarily ugly extraordinarily quickly. The way Lucene handles
> wildcards (conceputally) is to expand all the possible terms into a large
> OR
> clause. Say my index contains term1, term2, and term3. The search for
> term*
> really expands into term1 OR term2 OR term3. Now imagine the
> complexity of a phrase like "dog* cat* hors*". Now say your index
> contained
> 10 terms starting with dog, 10 with cat and 10 with hors. You'd have 1,000
> ORed phrase queries. And this is a tiny example....
> 
> You can try various approximations, and depending upon your index size
> they
> may or may not work. For instance, you could index all the successive
> shorter
> forms. with increments of 0 (see synonym analyzer)  I.e. index horse,
> hors$
> hor$
> ho$ h$ all in the same position. Then searching for hor* becomes searching
> for
> hor$ and it all "just works". Of course this makes your index bigger.....
> 
> About your second issue: I'm not clear what your trying to accomplish.
> It's
> no
> problem to add the same field multiple times for a document. That is, you
> can
> doc.add(new field("field1", ......)
> doc.add(new field("field1", ......)
> doc.add(new field("field1", ......)
> doc.add(new field("field1", ......)
> as many times as you want before you add the document to the index. For
> retrieval you can call getFields ("field1") and get an array of Fields
> back,
> one
> for each call to add above. You can also set the PositionIncrementGap
> while
> indexing to separate the termposition of the first term of successive
> add()
> calls
> by, say, 100 (or whatever) if you need to worry about SpanNear or some
> such.
> 
> This may be waaaay off base. If so, could you give a concrete example of
> what
> your inputs are and how you want to search them?
> 
> Best
> Erick
> 
> On Thu, Mar 6, 2008 at 7:28 AM, JensBurkhardt <jensburkhardt@web.de>
> wrote:
> 
>>
>> okay, another problem occured. I have different fields with the same
>> name.
>> I
>> can't seperate them like naming them field1 field2 etc. cause while
>> indexing
>> i don't know how many fields i will need.
>> Like a book has several signature numbers i want to save them in a field
>> signature and when i search for such a number i want the search hit every
>> single field and not all fields together.
>> Right now i separate the string using an unique separator (in this case
>> just
>> $$$) so i can split the string into the numbers but i think this is kinda
>> the worst form doing it.
>>
>>
>>
>>
>> JensBurkhardt wrote:
>> >
>> > hey everybody,
>> >
>> > I'm wondering if it's possible to combine wildcards and phrase query.
>> >
>> > For example "term1 term*"
>> >
>> > I know that the documentation says "Lucene supports single and multiple
>> > character wildcard searches within single terms (not within phrase
>> > queries)" but maybe someone has had the same problem and found a
>> solution.
>> >
>> > Thanks for your help
>> >
>> > Jens Burkhardt
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/combine-wildcard-and-phrase-query-tp15870647p15872169.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/combine-wildcard-and-phrase-query-tp15870647p15874560.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message