lucene-java-user mailing list archives

From <raghavendra.k....@barclays.com>
Subject RE: Multiple Keywords - Regular and Any Order Search
Date Sat, 12 Oct 2013 21:32:09 GMT
Ian,

Thank you very much for your valuable inputs.

The surround parser sounds very powerful and it just may be the single answer to everything
I am looking for. I have been trying hard to find an example of how to use it but haven't
been able to find one online. Could you please help?

IndexSearcher isearcher = new IndexSearcher(ireader);
SrndQuery srndQuery = QueryParser.parse("<my query>");

topFieldDocs = isearcher.search(srndQuery, filterBookDate, Integer.MAX_VALUE, sortByBookDate);

This is where I have the problem. I don't know which IndexSearcher.search to use that will
accommodate the SrndQuery. If this isn't the way to use SrndQuery, please suggest.

Regards,
Raghu
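
A hedged note on the question above: in the Lucene 4.x API, SrndQuery is not itself an
org.apache.lucene.search.Query, so no IndexSearcher.search overload accepts it directly.
It first has to be converted with makeLuceneQueryField. The sketch below assumes Lucene 4.x
with the queryparser module on the classpath. The class name SurroundExample, the "contents"
field, the query text and the factory limit of 1000 are illustrative choices, not taken from
the original message.

```java
import org.apache.lucene.queryparser.surround.parser.ParseException;
import org.apache.lucene.queryparser.surround.parser.QueryParser;
import org.apache.lucene.queryparser.surround.query.BasicQueryFactory;
import org.apache.lucene.queryparser.surround.query.SrndQuery;
import org.apache.lucene.search.Query;

class SurroundExample {
    // Parses surround syntax and converts the result into an ordinary Lucene Query.
    static Query toLuceneQuery(String surroundText, String field) {
        try {
            SrndQuery srndQuery = QueryParser.parse(surroundText);
            // Upper bound on the number of basic queries the conversion may create
            // (1000 is an arbitrary value chosen for this sketch).
            BasicQueryFactory factory = new BasicQueryFactory(1000);
            return srndQuery.makeLuceneQueryField(field, factory);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid surround query: " + surroundText, e);
        }
    }

    public static void main(String[] args) {
        // N is the unordered proximity operator; the leading number bounds the distance.
        Query query = toLuceneQuery("3N(raining, heavily, today)", "contents");
        System.out.println(query);
    }
}
```

The returned Query can then be passed to the search call from the message, for example
isearcher.search(query, filterBookDate, Integer.MAX_VALUE, sortByBookDate). As far as I
know the surround parser does not run an analyzer over the query terms, so they probably
need to be in the same form as the indexed tokens (lowercase, for instance, if the index
was built with StandardAnalyzer).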


-----Original Message-----
From: Ian Lea [mailto:ian.lea@gmail.com] 
Sent: Friday, October 11, 2013 7:05 AM
To: java-user@lucene.apache.org
Subject: Re: Multiple Keywords - Regular and Any Order Search

Looks like you can achieve most of what you want by using AND rather than OR.  I think that
all the should/should not examples you give will work if you use AND on your content field.
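
As a minimal sketch of the AND approach built in code rather than parsed, assuming the
Lucene 4.x BooleanQuery API (later versions construct it through BooleanQuery.Builder
instead). The class name AllTermsQuery and the method build are made up for illustration,
and the words are assumed to be already in their indexed form (e.g. lowercase).

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.TermQuery;

class AllTermsQuery {
    // Builds a query that matches only documents containing every word in the field.
    static BooleanQuery build(String field, String... words) {
        BooleanQuery query = new BooleanQuery();
        for (String word : words) {
            // MUST makes each term required, which is what AND means here.
            query.add(new TermQuery(new Term(field, word)), BooleanClause.Occur.MUST);
        }
        return query;
    }

    public static void main(String[] args) {
        System.out.println(build("contents", "raining", "heavily", "today"));
    }
}
```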

For ordering, I suggest you look at SpanNearQuery.  That can consider order and slop, the
distance between the search terms.
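
A rough sketch of that idea, assuming Lucene 4.x (where the span classes live in
org.apache.lucene.search.spans) and a field whose tokens were lowercased at index time.
The names AnyOrderPhrase and build are hypothetical.

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

class AnyOrderPhrase {
    // Builds a proximity query over the given words in a single field.
    // inOrder = true  : words must appear in the given order (regular search)
    // inOrder = false : words may appear in any order ("Search in any order")
    static SpanNearQuery build(String field, boolean inOrder, String... words) {
        SpanQuery[] clauses = new SpanQuery[words.length];
        for (int i = 0; i < words.length; i++) {
            clauses[i] = new SpanTermQuery(new Term(field, words[i]));
        }
        // Slop 0 allows no extra positions between the matched words.
        return new SpanNearQuery(clauses, 0, inOrder);
    }

    public static void main(String[] args) {
        System.out.println(build("field1", false, "raining", "today", "heavily"));
    }
}
```

With inOrder set to false and a slop of 0, the words should match when they sit next to
each other in any order within the one field, so there is no need to generate every
permutation by hand. Because each SpanTermQuery matches an exact term, "rain" would not
match "raining" unless the analyzer used at index time stems the tokens.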

You may also want to consider separate fields if you care whether "raining beautiful abc"
should match or not.  You could use MultiFieldQueryParser, build up a BooleanQuery in code,
or build a complicated string to pass to the standard query parser.  There are other query
parsers as well that might work for you, e.g.
org.apache.lucene.queryparser.surround.parser.QueryParser


--
Ian.


On Thu, Oct 10, 2013 at 4:54 PM,  <raghavendra.k.rao@barclays.com> wrote:
> Hi,
>
> I have implemented Lucene to search for a single keyword across multiple fields and it
> works great. I did this by concatenating all the fields into a "contents" field and
> searching against this field.
>
> When I give multiple keywords against this setup, Lucene by default does an OR search,
> leading to loads of duplicates. This, I understand, is expected behaviour.
>
>
> 1. Hence the first thing that I am trying to achieve is search functionality for
> multiple keywords. The most popular suggestion is to implement PhraseQuery. I will try
> this out, but please let me know if you can provide an example or any suggestions.
>
>
>
> 2. Once the multiple keywords search is implemented, I need to provide another
> option to the users. They should be able to check a checkbox "Search in any order". If
> checked, if the same keywords of the phrase are present "in a particular field" BUT in
> a different order, that should still be a match. I don't know how to implement this
> without forming all permutations of the phrase and then performing an AND search. This
> could be very expensive in terms of performance. Please let me know if Lucene provides
> a way to do this.
>
>
>
> Examples for Item 2:
>
>
>
> Field1: "RAINING HEAVILY TODAY"
> Field2: "BEAUTIFUL MORNING"
> Field3: "ABC CORPORATION LIMITED"
>
>
>
> Search1: "RAINING HEAVILY TODAY" - Should Match
>
> Search2: "RAINING TODAY HEAVILY" - Should Match
>
> Search3: "RAIN TODAY HEAVILY" - Should NOT Match
>
> Search4: "ABC CORPORATION LIMITED" - Should Match
>
> Search5: "ABC CORP LIMITED" - Should NOT Match
>
> Search6: "ABC LIMITED CORPORATION" - Should Match
>
>
>
> I am also not sure if the "contents" field approach will work in this case. Do I need
> to index the fields separately and search them with "MultiFieldQueryParser" to achieve
> this?
>
>
> Sorry for the lengthy question. I would greatly appreciate any suggestions or inputs.
>
> Regards,
> Raghu
>
>
> _______________________________________________
>
> This message is for information purposes only, it is not a recommendation, advice, offer
> or solicitation to buy or sell a product or service nor an official confirmation of any
> transaction. It is directed at persons who are professionals and is not intended for
> retail customer use. Intended for recipient only. This message is subject to the terms
> at: www.barclays.com/emaildisclaimer.
>
> For important disclosures, please see: www.barclays.com/salesandtradingdisclaimer
> regarding market commentary from Barclays Sales and/or Trading, who are active market
> participants; and in respect of Barclays Research, including disclosures relating to
> specific issuers, please see http://publicresearch.barclays.com.
>
> _______________________________________________

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

