lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Calculate Term Co-occurrence Matrix
Date Sat, 21 Aug 2010 12:05:10 GMT
Ahmed,

That's what that KPE (link in my previous email, below) will do for you.  It's 
not open source at this time, but that is exactly one of the things it does.  I 
think Mahout collocations stuff might work for you, too.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: ahmed algohary <algoharyalex@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Sat, August 21, 2010 7:20:03 AM
> Subject: Re: Calculate Term Co-occurrence Matrix
> 
> Thanks for all your answers!
> 
> it seems like I did not make my question  clear. I have a text corpus and I
> need to determine the pairs of words that  occur together in many documents.
> I need to do that to be able to measure the  semantic proximity between
> words. This method is expanded
> here<http://forums.searchenginewatch.com/showthread.php?t=48>.
> I hope to  find some code that given a text corpus, generate all the words
> pairs with  their probability of occurring together.
> 
> 
> On Sat, Aug 21, 2010 at 1:46  AM, Otis Gospodnetic <
> otis_gospodnetic@yahoo.com>  wrote:
> 
> > There is also a non-Mahout Key Phrase Extractor for  Collocations, SIPs, and
> > a
> > few other things:
> > http://sematext.com/products/key-phrase-extractor/index.html
> >
> >  One of the demos that uses news data is at
> > http://sematext.com/demo/kpe/index.html
> >
> > Otis
> >  ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem  search :: http://search-lucene.com/
> >
> >
> >
> > ----- Original  Message ----
> > > From: Grant Ingersoll <gsingers@apache.org>
> > > To: java-user@lucene.apache.org
> >  > Sent: Fri, August 20, 2010 8:52:17 AM
> > > Subject: Re: Calculate  Term Co-occurrence Matrix
> > >
> > > You might also be interested  in Mahout's collocations package:
> > >http://cwiki.apache.org/confluence/display/MAHOUT/Collocations
> >  >
> > > -Grant
> > > On  Aug 19, 2010, at 11:39 AM, ahmed  algohary wrote:
> > >
> > > > Hi all,
> > > >
> >  > > I need to know if there is a Lucene plug-in or a Lucene-based  API  for
> > > > calculating the term co-occurrence matrix for a  given text  corpus.
> > > >
> > > > Thanks!
> >  > >
> > > > --
> > > >  Ahmed
> >  >
> > > --------------------------
> > > Grant  Ingersoll
> > > http://www.lucidimagination.com/
> > >
> > > Search the  Lucene ecosystem using  Solr/Lucene:
> > >http://www.lucidimagination.com/search
> > >
> > >
> >  >  ---------------------------------------------------------------------
> >  > To  unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >  > For  additional commands, e-mail: java-user-help@lucene.apache.org
> >  >
> > >
> >
> >  ---------------------------------------------------------------------
> > To  unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >  For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message