From: "Jack Krupansky"
Subject: Re: MLT Using a Query created in a different index
Date: Fri, 5 Apr 2013 07:46:49 -0400

In a statistical sense, for the majority of documents, yes, but you could probably find quite a few outlier examples where the results from A to B or from B to A are significantly or even completely different, or even non-existent.

-- Jack Krupansky

-----Original Message-----
From: Peter Lavin
Sent: Friday, April 05, 2013 3:49 AM
To: java-user@lucene.apache.org
Subject: Re: MLT Using a Query created in a different index

Thanks for that Jack,

so it's fair to say that if both the source and target corpora are large and diverse, then the impact of using a different index to create the query would be negligible.

P.

On 04/04/2013 06:49 PM, Jack Krupansky wrote:
> The heart of MLT is examining the top result of a query (or maybe more
> than one) and identifying the "top" terms from the top document(s) and
> then simply using those top terms for a subsequent query. The term
> ranking would of course depend on term frequency, and other relevancy
> considerations - for the corpus of the original query. A rich query
> corpus will give great results, a weak corpus will give weak results -
> no matter how rich or weak the final target corpus is. OTOH, if the
> target corpus really is representative of the source corpus, then
> results should be either good or terrible - the selected/query document
> may not have any representation in the target corpus.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Peter Lavin
> Sent: Thursday, April 04, 2013 1:06 PM
> To: java-user@lucene.apache.org
> Subject: MLT Using a Query created in a different index
>
> Dear Users,
>
> I am doing some research where Lucene is integrated into agent
> technology. Part of this work involves using an MLT query in an index
> which was not created from a document in that index (i.e. the query is
> created, serialised and sent to the remote agent).
>
> Can anyone point me towards any information on what the potential impact
> of doing this would be?
>
> I'm assuming if both indexes have similar sets of documents, the impact
> would be negligible, but what, for example, would be the impact of
> creating an MLT query from an index with only one or two documents for
> use in an index with several (say 100+) documents?
>
> with thanks,
> Peter
>

--
with best regards,
Peter Lavin,
PhD Candidate,
CAG - Computer Architecture & Grid Research Group,
Lloyd Institute, 005,
Trinity College Dublin, Ireland.
+353 1 8961536

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
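
Editorial illustration (not part of the original thread): the point made in the quoted reply, that the "top" terms depend on the statistics of the corpus the query is built from, can be sketched in a few lines of Python. This is a simplified tf-idf ranking under stated assumptions, not Lucene's MoreLikeThis implementation; the function name `top_terms`, the idf formula, and the toy documents are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def top_terms(doc, corpus, max_terms=3):
    """Rank the terms of `doc` by tf * idf, taking idf from `corpus`.

    `doc` is a list of tokens; `corpus` is a non-empty list of token lists.
    The idf formula is an assumed, simplified weighting for illustration.
    """
    n_docs = len(corpus)
    # Document frequency: how many corpus documents contain each term.
    df = Counter(term for d in corpus for term in set(d))
    tf = Counter(doc)
    scores = {
        term: count * (1.0 + math.log(n_docs / (df[term] + 1)))
        for term, count in tf.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:max_terms]


doc = "the index stores the terms of every document".split()

# Source index holding only the query document: every term has the same
# document frequency, so the ranking falls back to raw term frequency.
tiny = [doc]

# Larger source index in which "the" and "of" occur in most documents.
large = [
    doc,
    "the cat sat on the mat".split(),
    "the price of the ticket".split(),
    "the end of the road".split(),
    "a query finds the document".split(),
    "the weather of the north".split(),
]

print(top_terms(doc, tiny))
print(top_terms(doc, large))
```

With the one-document source index, the most frequent word ("the") ranks first because nothing distinguishes common terms from rare ones. With the larger source index, "the" drops out of the top three. The same document therefore produces a different query depending on where it was built, which is the effect asked about for a source index of only one or two documents.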