Date: Fri, 19 Mar 2004 18:24:13 +0100 (MET)
From: "Karl Koch"
To: "Lucene Users List"
Reply-To: "Lucene Users List"
List-Id: "Lucene Users List"
Subject: VSpace Model Index <-> Prob. Model Index - Difference?
Message-ID: <18735.1079717053@www6.gmx.net>
References: <20040319165237.70719.qmail@web12704.mail.yahoo.com>

Hello group,

coming back to the discussion about the probabilistic and vector space
models (which occurred here some time ago), I would like to ask
something related. The only index structure I know is the one Lucene
offers. Does the index of an IR system based on the probabilistic model
(e.g. Okapi) look different from that of a system based on the vector
space (VS) model?
If yes, why? I hope this question is not too stupid. I am mainly
interested for theoretical reasons...

Karl

> Uh, there are lots of ways to construct an inverted index.
> Citeseer will give you more than you can read on this topic.
>
> As for Lucene, see the File Formats section on the site.
>
> Otis
>
> --- Karl Koch wrote:
> > If I create a standard index, what does Lucene store in this index?
> >
> > What should be stored in an index at least? Just a link to the file
> > and keywords? Or also word numbers? What else?
> >
> > Does somebody know a paper which discusses this problem of "what to
> > put in a good universal IR index"?
> >
> > Cheers,
> > Karl

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org