lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From javier muguruza <jmugur...@gmail.com>
Subject Re: all stop words in exact phrase get 0 hits
Date Tue, 20 Dec 2005 22:24:10 GMT
actually I ended up using your approach, also escaping the input
string. With the other approach I wasnt sure wether I could just take
the text from the token and go on etc...
its working fine in my tests so far.


On 12/20/05, Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
> : Hoss - the main caveat with this approach is that a user could select
> : a different field from any of the designated text boxes.  Putting
> : "subject:foo" in the name text box for example.
>
> sorry ... yea, i left out a step...
>
>      String words = QueryParser.escape(rawUserInputString)
>
> Generally speaking this approach has been working well for me in my
> current development as a bare bones method for converting raw user word
> lists into pieces of queries that i then compose into a larger more
> complex query structure.  the only hitch i've ever really run into is when
> people try to search for things like:  42" tv ... becuase query parser
> doesn't have any way to escape that quote, so it barfs thinking it's an
> unterminated phrase query.  but even if there was a way to escape it, i
> don't think i'd want to, beause as is it lets the (more common case) of
> quoted phrases do what (i) expect and become phrase queries.
>
> I still haven't completley decided what i want to do about the:   42" tv
> case ... but i'm very close to preprocessing the input and saying: "even
> number of quotes, leave them alone, odd number of quotes, yank them all
> out."
>
> : I think this is one of my biggest issues with QueryParser these days,
> : it exposes too much power.  It's like giving a SQL text box to your
> : database application (read-only, of course).  There are generally
> : fields not made for user querying.
>
> agreed.  If i don't take the easy way out on that 42" case, i think I may
> spend some time on a generic "UserQueryParser" that doesn't support any
> fancy syntax of the exsting QueryParser, and just knows how to build
> TermQuery and phrases queries on a specified field -- with options forhow
> to deal with uneven numbered quotes (looking at the white space arround
> the quotes for guidence in interpreting it)
>
> : > In addition to Erik's suggestion, something that i find frequently
> : > makes
> : > sense is to use the QueryParser in each case where you are dealing
> : > with a
> : > discrete "input string" -- and then combine them into a BooleanQuery
> : > (along with any other progromatically generated TermQueries you
> : > might want
> : > for other reasons)
> : >
> : > For example, it looks like your applciation is targeted at
> : > searching email
> : > right?  let's assume your application allows the user to specify the
> : > following inputs
> : >
> : >     * a text box for search words
> : >     * a text box for people's names/email
> : >     * pulldowns for picking date ranges
> : >
> : > ...and you want to look in the subject, body, and attachment fields
> : > for
> : > the input words (with differnet boosts) and in all of the header
> : > fields
> : > for the name/email input.   you can builda Date object out of the
> : > pulldown
> : > yourself, and then do something like this with the two strings...
> : >
> : >    QueryParser subP = new QueryParser("subject",SomeAnalyzer)
> : >    QueryParser bodyP = new QueryParser("body",SomeAnalyzer)
> : >    QueryParser attachP = new QueryParser("attachemnts",SomeAnalyzer)
> : >    QueryParser headP = new QueryParser
> : > ("headers",AnalyzerThatKnowsAboutEmailAddress)
> : >    Query s = subP.parse(words);
> : >    s.setBoost(2.0)
> : >    Query b = bodyP.parse(words);
> : >    Query a = attachP.parse(words);
> : >    a.setBoost(0.5)
> : >    Query h = headP.parse(person);
> : >    h.setBoost(1.5)
> : >    BooleanQuery bq = new BooleanQuery();
> : >    bq.add(s,false,false);
> : >    bq.add(b,false,false);
> : >    bq.add(a,false,false);
> : >    bq.add(h,false,false);
> : >
> : > ...and then execute your search using a filter built from the date
> : > range
> : > pulldowns.
> : >
> : >
> : > -Hoss
> : >
> : >
> : > ---------------------------------------------------------------------
> : > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : > For additional commands, e-mail: java-user-help@lucene.apache.org
> :
> :
> : ---------------------------------------------------------------------
> : To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> : For additional commands, e-mail: java-user-help@lucene.apache.org
> :
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message