lucene-java-user mailing list archives

From "Kainth, Sachin" <Sachin.Kai...@atkinsglobal.com>
Subject RE: Search for a term in all fields
Date Wed, 21 Feb 2007 14:02:40 GMT
Sorry I didn't make myself clear at all.  Remember you said that it is
possible to do this:

> Sure. Convert your simple queries into span queries (which are also 
> relatively simple). Then, when you index everything in the "all" 
> field, subclass your analyzer to return a large PositionIncrementGap.
> Explaining how this works with words is awkward, so....
>
> doc.add("all", "one two three");
> doc.add("all", "four five six");
> doc.add("all", "seven eight nine");
> index the document.
>
> Assume you've implemented an analyzer that returns 1000 for 
> getPositionIncrementGap.
>
> Now, the term positions in the single document will be:
> one - 0, two - 1, three - 2, four - 1003, five - 1004, six - 1005,
> seven - 2006, eight - 2007, nine - 2008
>
> Now, if you use SpanNearQuery with a slop of 900 (i.e. "one nine"~900)
> you won't get a match because the "distance" between one and nine is 
> more than 900. But "one three"~900 will match.
>
> It's possible to transform any query into a set of span queries. See 
> the thread "Multiword Highlighting" that Mark Miller and I were 
> exchanging ideas on recently. Be aware that the code we were talking 
> about needs a modification when used on a "regular" index, so that 
> it pays attention to the document that each sub-clause comes from. The 
> code, as written, assumes you're using a MemoryIndex for one and only 
> one document, so unless you need complex queries, I'd just think about
> rewriting simple queries with ANDs as a SpanNearQuery.
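
The position arithmetic in the quoted example can be checked with a small
plain-Java sketch. It does not use Lucene at all: the class name
PositionGapSketch and the methods positions and withinSlop are hypothetical
helpers, tokens are split on whitespace only, and the slop test is a
simplified stand-in for what SpanNearQuery or a sloppy PhraseQuery does,
assuming each term occurs once.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Lucene code.
class PositionGapSketch {

    // Assigns a position to each whitespace-separated token. Before every
    // value after the first, the position is advanced by `gap`, mimicking
    // the effect of a position increment gap between values of one field.
    static Map<String, Integer> positions(List<String> values, int gap) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int position = -1;
        boolean first = true;
        for (String value : values) {
            if (!first) {
                position += gap;
            }
            first = false;
            for (String token : value.trim().split("\\s+")) {
                position++;
                result.put(token, position);
            }
        }
        return result;
    }

    // Simplified proximity test: the number of positions lying strictly
    // between the two terms must not exceed `slop`.
    static boolean withinSlop(Map<String, Integer> positions,
                              String a, String b, int slop) {
        int distance = Math.abs(positions.get(b) - positions.get(a)) - 1;
        return distance <= slop;
    }

    public static void main(String[] args) {
        List<String> values =
                Arrays.asList("one two three", "four five six", "seven eight nine");
        Map<String, Integer> pos = positions(values, 1000);
        System.out.println(pos);
        System.out.println(withinSlop(pos, "one", "three", 900)); // same value
        System.out.println(withinSlop(pos, "one", "nine", 900));  // across gaps
    }
}
```

With a gap of 1000 and a slop of 900, terms from the same value are close
enough to match, while terms from different values are not.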

Well, what I meant was this: instead of using a gap of 1000, could we not
replace that gap with a ~ character? If that is possible, is there a way
of performing searches using the ~?

Cheers

Sachin
 

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: 21 February 2007 13:05
To: java-user@lucene.apache.org
Subject: Re: Search for a term in all fields

I don't see what you're getting at. There are only two forms of a query
term: field:value and value

And the second is really the first with the default field you specified
in the parser implied. So just think of all terms you specify in a query
as field:term.

Having some "special character" in the index doesn't help you because
you still have to specify the field. And your two choices are still
either a BooleanQuery that mentions all fields or indexing the data into
a single field.

Best
Erick
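
As a rough illustration of the first option (a query that mentions every
field), the sketch below expands one term into the field:term form for
each field and joins the pieces with OR. It is plain Java string handling,
not Lucene code: the class AllFieldsQuerySketch, the method
expandAcrossFields and the field names title, body and author are all
hypothetical, and the term is assumed to contain no characters that the
query parser treats specially.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical illustration only; this is not Lucene code.
class AllFieldsQuerySketch {

    // Builds "f1:term OR f2:term OR ..." for the given field names.
    // No escaping is done, so the term must be a plain word.
    static String expandAcrossFields(String term, List<String> fields) {
        StringJoiner joiner = new StringJoiner(" OR ");
        for (String field : fields) {
            joiner.add(field + ":" + term);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Assumed field names; a real index would supply its own.
        List<String> fields = Arrays.asList("title", "body", "author");
        // prints: title:lucene OR body:lucene OR author:lucene
        System.out.println(expandAcrossFields("lucene", fields));
    }
}
```

The resulting string could then be handed to a query parser, which is one
way of arriving at the BooleanQuery across all fields described above.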



On 2/21/07, Kainth, Sachin <Sachin.Kainth@atkinsglobal.com> wrote:
>
> Well, here are my current thoughts on achieving this.  Instead of 
> putting a 1000 space gap between elements of the "all" field, could I not
> use a character that isn't used in the data, such as ~, and then somehow
> (don't know how) use that to search all fields?
>
> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: 20 February 2007 18:30
> To: java-user@lucene.apache.org
> Subject: Re: Search for a term in all fields
>
>
> The information Erick gave you when you asked this question yesterday 
> is all very accurate -- the one addition I would make is that you 
> don't need SpanNear queries to take advantage of positionIncrementGap 
> -- PhraseQueries do that too.
>
> Consolidating your fields into a single "all" field, or constructing a
> BooleanQuery across all of your existing fields are really the two main
> options -- each with their tradeoffs.
>
> http://www.nabble.com/Search-in-all-fields-tf3254569.html
>
> : Date: Tue, 20 Feb 2007 12:29:25 -0000
> : From: "Kainth, Sachin" <Sachin.Kainth@atkinsglobal.com>
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Search for a term in all fields
> :
> : Hi all,
> :
> : How do I search for a term in all fields of a document?
> :
> : Cheers
> :
> : Sachin
>
>
>
> -Hoss
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
