From: Simon Willnauer
Reply-To: simon.willnauer@gmail.com
To: java-user@lucene.apache.org
Date: Wed, 5 Aug 2009 09:58:45 +0200
Subject: Re: ParallelMultiSearcher and idf
In-Reply-To: <20090804170013.6d9b5753@pc-4176.kl.dfki.de>
References: <20090804115016.64b58dfe@pc-4176.kl.dfki.de> <180060.75297.qm@web50309.mail.re2.yahoo.com> <20090804170013.6d9b5753@pc-4176.kl.dfki.de>

Hey Christian,

you might want to look at distributed Solr
(http://wiki.apache.org/solr/DistributedSearch) or, if you haven't done so
already, have a look at the Katta project
(http://katta.sourceforge.net/documentation/how-katta-works). Maybe this can
help you out.

For distributed IDF and scoring, have a look at this link:
http://wunderwood.org/most_casual_observer/2007/04/progressive_reranking.html

Simon

On Tue, Aug 4, 2009 at 5:00 PM, Christian Reuschling wrote:
> Hi Otis,
>
> thanks for the answer - I'm aware of Solr, but, judging by its abstraction
> level, it seems too generalized for us. Solr seems to be nice if you want to
> use it as a black box and don't need to be aware of 'what is under the hood'.
> But maybe I'm totally wrong. At least, it would be of interest how Solr
> realizes its distributed search, in case it does something different from
> using the core-Lucene ParallelMultiSearcher with RemoteSearchables. Maybe
> somebody on this list knows the answer.
>
> On Tue, 4 Aug 2009 07:20:23 -0700 (PDT)
> Otis Gospodnetic wrote:
>
>> Hi Christian,
>>
>> You didn't mention Solr, so I'm not sure if you are aware of it. Maybe Solr
>> meets your needs?
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>> ----- Original Message ----
>> > From: Christian Reuschling
>> > To: java-user@lucene.apache.org
>> > Sent: Tuesday, August 4, 2009 5:50:16 AM
>> > Subject: ParallelMultiSearcher and idf
>> >
>> > Hello,
>> >
>> > when searching over multiple indices, we create one IndexReader for each
>> > index and wrap them in a MultiReader, which we use for IndexSearcher
>> > creation.
>> >
>> > This is fine for searching multiple indices on one machine, but if the
>> > indices are distributed over the (intra)net, this scenario has several
>> > drawbacks:
>> >
>> > - searching/scoring/sorting happens 100% on the client machine, so you
>> >   need all the RAM and CPU power at every client.
>> > - all the data necessary for scoring must go over the net, so the traffic
>> >   should be significantly higher
>> > - thus, overall performance suffers
>> >
>> > Nevertheless, creating a MultiReader and making a searcher out of it has
>> > one advantage (at least it can be an advantage, depending on the
>> > scenario): the document frequencies of a term will be summed up, and thus
>> > it is 100% transparent for scoring whether the indices are split or not.
>> >
>> > I'm wondering whether it is possible to get the advantages of both
>> > scenarios, e.g. by first summing up the document frequencies of the query
>> > terms and sending them together with the query to every (remote) searcher
>> > of ParallelMultiSearcher, for scoring.
>> >
>> > Maybe this is exactly what ParallelMultiSearcher does, and I haven't seen
>> > it?
>> >
>> > Thanks for clarification!
>> >
>> > Chris
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>
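
The idea raised in Christian's original question (sum the document
frequencies of the query terms over all indices first, then score with the
summed values) can be illustrated with a small sketch. This is plain Java and
does not call Lucene's API; the class GlobalIdfSketch, the holder type
ShardStats and the sample numbers are assumptions made up for this example.
The idf formula has the shape used by Lucene's classic DefaultSimilarity,
1 + ln(numDocs / (docFreq + 1)).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: merge per-index document frequencies before scoring. */
class GlobalIdfSketch {

    /** Statistics one index would report for the query terms (made-up type, not a Lucene class). */
    static final class ShardStats {
        final int numDocs;
        final Map<String, Integer> docFreqs;

        ShardStats(int numDocs, Map<String, Integer> docFreqs) {
            this.numDocs = numDocs;
            this.docFreqs = docFreqs;
        }
    }

    /** Sums the document frequency of every term over all indices. */
    static Map<String, Integer> sumDocFreqs(List<ShardStats> shards) {
        Map<String, Integer> total = new HashMap<>();
        for (ShardStats shard : shards) {
            for (Map.Entry<String, Integer> e : shard.docFreqs.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return total;
    }

    /** Sums the document counts of all indices. */
    static int sumNumDocs(List<ShardStats> shards) {
        int total = 0;
        for (ShardStats shard : shards) {
            total += shard.numDocs;
        }
        return total;
    }

    /** idf in the shape of the classic DefaultSimilarity: 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Two indices in which the same term is rare in one and common in the other.
        List<ShardStats> shards = Arrays.asList(
                new ShardStats(1000, Collections.singletonMap("lucene", 10)),
                new ShardStats(9000, Collections.singletonMap("lucene", 890)));

        // Scoring each index with its own statistics gives a different idf per index.
        for (ShardStats shard : shards) {
            System.out.println("local idf:  " + idf(shard.docFreqs.get("lucene"), shard.numDocs));
        }

        // Scoring with the summed statistics gives one idf, as if there were a single index.
        int globalDocFreq = sumDocFreqs(shards).get("lucene");
        System.out.println("global idf: " + idf(globalDocFreq, sumNumDocs(shards)));
    }
}
```

With purely local statistics the same term gets a different weight in each
index, so scores from different indices are not directly comparable. Sending
the summed values along with the query is what would make the split
transparent for scoring.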