Subject: RE: Search for a term in all fields
Date: Wed, 21 Feb 2007 14:02:40 -0000
From: "Kainth, Sachin"
Reply-To: java-user@lucene.apache.org

Sorry, I didn't make myself clear at all. Remember you said that it is possible to do this:

> Sure. Convert your simple queries into span queries (which are also
> relatively simple). Then, when you index everything in the "all"
> field, subclass your analyzer to return a large PositionIncrementGap.
> Explaining how this works with words is awkward, so....
>
> doc.add("all", "one two three");
> doc.add("all", "four five six");
> doc.add("all", "seven eight nine");
> index the document.
>
> Assume you've implemented an analyzer that returns 1000 for
> getPositionIncrementGap.
>
> Now, the term offsets in the single document will be
> one - 0
> two - 1
> three - 2
> four - 1003
> five - 1004
> six - 1005
> seven - 2006
> eight - 2007
> nine - 2008
>
> Now, if you use SpanNearQuery with a slop of 900 (i.e. "one nine"~900)
> you won't get a match because the "distance" between one and nine is
> more than 900. But "one three"~900 will match.
>
> It's possible to transform any query into a set of span queries. See
> the thread "Multiword Highlighting" that Mark Miller and I were
> exchanging ideas on recently. Be aware that the code we were talking
> about has to have a modification when used on a "regular" index where
> it pays attention to the document that each sub-clause comes from.
> The code, as written, assumes you're using a MemoryIndex for one and only
> one document, so unless you need complex queries, I'd just think about
> rewriting simple queries with ANDs as a SpanNearQuery.

Well, what I meant was: instead of using a gap of 1000, could we not replace that gap of 1000 characters with a ~? Then, if this is possible, is there a way of performing searches using the ~?

Cheers

Sachin

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: 21 February 2007 13:05
To: java-user@lucene.apache.org
Subject: Re: Search for a term in all fields

I don't see what you're getting at. There are only two forms of a query term:

field:value
value

And the second is really the first with the default field you specified in the parser implied. So just think of all terms you specify in a query as field:term.

Having some "special character" in the index doesn't help you because you still have to specify the field. And your two choices are still either a BooleanQuery that mentions all fields or indexing the data into a single field.

Best
Erick

On 2/21/07, Kainth, Sachin wrote:
>
> Well, here are my current thoughts on achieving this. Instead of
> putting a 1000 space gap between elements of the "all" field, could I not
> use a character that isn't used in the data, such as ~, and then somehow
> (don't know how) use that to search all fields?
>
> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_lucene@fucit.org]
> Sent: 20 February 2007 18:30
> To: java-user@lucene.apache.org
> Subject: Re: Search for a term in all fields
>
>
> The information Erick gave you when you asked this question yesterday
> is all very accurate -- the one addition I would make is that you
> don't need SpanNear queries to take advantage of positionIncrementGap
> -- PhraseQueries do that too.
>
> Consolidating your fields into a single "all" field, or constructing a
> BooleanQuery across all of your existing fields are really the two main
> options -- each with their tradeoffs.
>
> http://www.nabble.com/Search-in-all-fields-tf3254569.html
>
> : Date: Tue, 20 Feb 2007 12:29:25 -0000
> : From: "Kainth, Sachin"
> : Reply-To: java-user@lucene.apache.org
> : To: java-user@lucene.apache.org
> : Subject: Search for a term in all fields
> :
> : Hi all,
> :
> : How do I search for a term in all fields of a document?
> :
> : Cheers
> :
> : Sachin
>
> -Hoss

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
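
The position arithmetic in Erick's quoted example (three values added to the "all" field with a gap of 1000) can be sketched in plain Java. This is only an illustration and does not call Lucene: the class name PositionGapSketch and the methods positions and withinSlop are hypothetical names made up here. It assumes whitespace tokenization, a position increment of 1 per token, and that each token occurs only once. The withinSlop check is a simplified stand-in for how a SpanNearQuery measures distance, not Lucene's actual matching code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of token positions in a multi-valued field.
// No Lucene classes are used; all names here are hypothetical.
class PositionGapSketch {

    // Assigns a position to every token across all values of one field.
    // Assumption: the gap is added before every value after the first,
    // and each token advances the position by exactly 1.
    // Limitation: a repeated token keeps only its last position.
    static Map<String, Integer> positions(List<String> values, int gap) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int position = 0;
        for (String value : values) {
            if (position > 0) {
                position += gap;
            }
            for (String token : value.trim().split("\\s+")) {
                result.put(token, position++);
            }
        }
        return result;
    }

    // Simplified proximity test: the number of positions lying strictly
    // between the two terms must not exceed the slop.
    static boolean withinSlop(Map<String, Integer> pos, String a, String b, int slop) {
        int distance = Math.abs(pos.get(a) - pos.get(b)) - 1;
        return distance <= slop;
    }

    public static void main(String[] args) {
        Map<String, Integer> pos = positions(
                Arrays.asList("one two three", "four five six", "seven eight nine"), 1000);
        System.out.println(pos);
        System.out.println(withinSlop(pos, "one", "three", 900)); // true: same value
        System.out.println(withinSlop(pos, "one", "nine", 900));  // false: two gaps apart
    }
}
```

With a gap of 1000 the sketch reproduces the positions listed in the quoted message (four at 1003, seven at 2006), which is why a slop of 900 can never bridge two separate values of the field.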