Date: Mon, 27 Apr 2009 12:33:08 +0100
Subject: Re: lsi as indexing algorithm with lucene
From: adasal
To: java-user@lucene.apache.org

Hello all,

Following the link to SemanticVectors, the related research section lists:
Magnus Sahlgren. An introduction to random indexing.

I would like to point out that Magnus Sahlgren has completed a PhD thesis in
this area, which is both very readable and very informative.
There are also other Java implementations of Random Indexing using sparse
arrays, one of them referenced from the SICS site. It is sometimes helpful to
look at alternatives, as the examples they offer can be illuminating.

Adam Saltiel

2009/4/26 Dominik Jednoralski

> Hi,
>
> I'm the guy who has written the bachelor thesis on this. Sorry it took a
> while to publish it to the community, but I had to improve it before
> publishing. The topic of the thesis was to augment the Lucene-driven search
> facility of the Intelligent Tutoring System ActiveMath with latent
> semantics. The semantic results came from the SemanticVectors software
> package by Dominic Widdows
>
> http://code.google.com/p/semanticvectors/
>
> and have been used to 'blow up' an index query in the way Mr Libbrecht, my
> supervisor, described above.
>
> I wonder if my blog is an appropriate location for my thesis, so please
> feel free to redistribute it.
>
> http://www.twelve02.de/publications/jednoralski_bachelorthesis_latent_semantics_for_activemath.pdf
>
> Best
>
> Dominik Jednoralski
>
>
> 2009/3/18 Simon Willnauer
>
> > Hi,
> >
> > On Wed, Mar 18, 2009 at 8:59 AM, Paul Libbrecht wrote:
> > > Depending on your corpus, a semantic vector enabled search engine
> > > definitely is more semantic than one without.
> > >
> > > The general approach I have with these is:
> > >
> > > - get a query
> > > - expand each term of the query with the fuzzification of
> > >   semantic-vectors (e.g. if requested for termA, add termB and termC
> > >   with their semantic-distance as a boost factor)
> > > - run the query and get results with higher rank for termA if found,
> > >   then for termB and termC
> > >
> > > My student Dominik Jednoralski has written a bachelor thesis on that.
> > > I'll forward the request to send you this.
> > If it is possible, could you post a link where everybody can reach the
> > thesis of your student?
> > I guess it could be interesting for a couple of people on this list
> > and a benefit for your student as well.
> >
> > simon
> > >
> > > Join the semanticVectors' list where the original author also talks.
> > >
> > > paul
> > >
> > >
> > > On 18 March 2009, at 08:34, nitin gopi wrote:
> > >
> > >> hi Paul, I am new to this field of search engines. My aim is to
> > >> develop a semantic search engine. Initially I was trying to develop
> > >> that by using LSI. But since it is patented, there are not many
> > >> implementation attempts. I want to ask: is it possible to create a
> > >> search engine using lucene and semantic vectors which is semantically
> > >> better than lucene?
> > >>
> > >> On 3/18/09, Paul Libbrecht wrote:
> > >>>
> > >>> Nitin,
> > >>>
> > >>> LSI is patented so it's not been a flurry of implementation attempts.
> > >>> However, SemanticVectors is a library that takes approaches similar
> > >>> to LSA/LSI for indexing and is based on Lucene's term-vectors.
> > >>>
> > >>> paul
> > >>>
> > >>>
> > >>> On 18 March 2009, at 07:09, nitin gopi wrote:
> > >>>
> > >>>> hi all, has anybody tried to use LSI (latent semantic indexing) for
> > >>>> indexing in lucene?
> > >>>
> > >>>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> For additional commands, e-mail: java-user-help@lucene.apache.org
> > >>
> > >
> >
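
For readers who want to see the query-expansion idea from Paul Libbrecht's quoted message in concrete form, here is a minimal sketch in plain Java. It is not taken from the thesis or from SemanticVectors: the class name, the method name and the hard-coded neighbour scores are all hypothetical, and in a real setup the related terms would come from a semantic-vector model. The only Lucene-specific piece assumed is the classic query parser's `term^boost` syntax. The score is treated as a similarity (higher means closer), so the original term keeps the default boost and the added terms are boosted by their score.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only; it does not use any Lucene classes.
class QueryExpansionSketch {

    /**
     * Builds a query string in which every term is OR-ed with its related
     * terms, each related term carrying its similarity score as a boost.
     */
    static String expand(List<String> terms,
                         Map<String, Map<String, Double>> neighbours) {
        StringJoiner query = new StringJoiner(" ");
        for (String term : terms) {
            StringJoiner clause = new StringJoiner(" OR ", "(", ")");
            clause.add(term); // the original term keeps the default boost
            Map<String, Double> related = neighbours.getOrDefault(term, Map.of());
            for (Map.Entry<String, Double> e : related.entrySet()) {
                // Locale.ROOT keeps the decimal separator a dot on every machine
                clause.add(String.format(Locale.ROOT, "%s^%.2f",
                        e.getKey(), e.getValue()));
            }
            query.add(clause.toString());
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Assumed similarity scores; a semantic-vector model would supply these.
        Map<String, Double> related = new LinkedHashMap<>();
        related.put("termB", 0.8);
        related.put("termC", 0.6);
        System.out.println(expand(List.of("termA"), Map.of("termA", related)));
    }
}
```

The resulting string could then be handed to a query parser, so that documents matching termA rank above those matching only termB or termC. How the clauses for several query terms are combined is left to the parser's default operator.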